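This week, I've been working on a command-line tool I named codeshift, which lets users input source code files, choose a programming language, and have the files translated into that language.
There's nothing fancy going on under the hood - it just uses an AI provider called Groq to handle the translation - but I wanted to get into the development process, how the tool is used, and what features it offers.
Command-line tool that transforms source code files into any language.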
codeshift [-o <output-file>] <output-language> <input-files...>
For example, to translate the file examples/index.js to Go and save the output to index.go:
codeshift -o index.go go examples/index.js
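I've been working on this project as part of the Topics in Open Source Development course at Seneca Polytechnic in Toronto, Ontario. At first, I wanted to stick with technologies I was comfortable with, but the instructions for the project encouraged us to learn something new, like a new programming language or a new runtime.
Although I'd been wanting to learn Java, after doing some research online it didn't seem like a great choice for developing a CLI tool or interfacing with AI models. It isn't officially supported by OpenAI, and the community library featured in their docs is deprecated.
I've always been one to stick with popular technologies - they tend to be reliable and have complete documentation and tons of information available online. But this time, I decided to do things differently. I decided to use Bun, a cool new JavaScript runtime meant to replace Node.
Turns out I should've stuck with my gut. I ran into trouble trying to compile my project, and all I could do was hope the developers would fix the issue.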
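Previously referenced here, closed without resolution: https://github.com/openai/openai-node/issues/903
This is a pretty big issue, as it prevents use of the SDK together with the latest Sentry monitoring package.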
import * as Sentry from '@sentry/node';

// Start Sentry
Sentry.init({
  dsn: "https://your-sentry-url",
  environment: "your-env",
  tracesSampleRate: 1.0, // Capture 100% of the transactions
});

const params = {
  model: model,
  stream: true,
  stream_options: {
    include_usage: true
  },
  messages
};
const completion = await openai.chat.completions.create(params);
Results in error:
TypeError: getDefaultAgent is not a function
    at OpenAI.buildRequest (file:///my-project/node_modules/openai/core.mjs:208:66)
    at OpenAI.makeRequest (file:///my-project/node_modules/openai/core.mjs:279:44)
Environment: all operating systems (macOS, Linux), Node v20.10.0, openai library v4.56.0
This turned me away from Bun. I'd found out from our professor we were going to compile an executable later in the course, and I did not want to deal with Bun's problems down the line.
So, I switched to Node. It was painful going from Bun's easy-to-use built-in APIs to having to learn how to use commander for Node. But at least it wouldn't crash.
I had previous experience working with AI models through code thanks to my co-op, but I was unfamiliar with creating a command-line tool. Configuring the options and arguments turned out to be the most time-consuming aspect of the project.
Apart from the core feature we chose for each of our projects - mine being code translation - we were asked to implement any two additional features. One of the features I chose to implement was to save output to a specified file. Currently, I'm not sure this feature is that useful, since you could just redirect the output to a file, but in the future I want to use it to extract the code from the response to the file, and include the AI's rationale behind the translation in the full response to stdout. Writing this feature also helped me learn about global and command-based options using commander.js. Since there was only one command (run) and it was the default, I wanted the option to show up in the default help menu, not when you specifically typed codeshift help run, so I had to learn to implement it as a global option.
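To make the argument layout concrete, here is a minimal sketch of parsing an invocation like codeshift -o index.go go examples/index.js. It is not the project's actual code: the project uses commander, while this sketch swaps in Node's built-in util.parseArgs so it runs without installing anything. The parseCli helper and the output option name are assumptions made up for illustration.

```javascript
// Sketch only: uses Node's built-in parser instead of commander.
// (CommonJS style; in an ES module, write: import { parseArgs } from 'node:util')
const { parseArgs } = require('node:util');

// parseCli is a hypothetical helper; argv excludes "node" and the script path.
function parseCli(argv) {
  const { values, positionals } = parseArgs({
    args: argv,
    options: {
      // Declared once at the top level, so it is not tied to any subcommand.
      output: { type: 'string', short: 'o' },
    },
    allowPositionals: true,
  });

  // First positional is the target language, the rest are input files.
  const [language, ...files] = positionals;
  if (!language || files.length === 0) {
    throw new Error('expected a target language followed by at least one input file');
  }
  return { output: values.output, language, files };
}

console.log(parseCli(['-o', 'index.go', 'go', 'examples/index.js']));
```

In a real entry point you would pass process.argv.slice(2) instead of a hard-coded array. When -o is omitted, output is undefined, which is a natural signal to fall back to stdout.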
I also ended up "accidentally" implementing the feature for streaming the response to stdout. I was at first scared away from streaming, because it sounded too difficult. But later, when I was trying to read the input files, I figured reading large files in chunks would be more efficient. I realized I'd already implemented streaming in my previous C++ courses, and figuring it wouldn't be too bad, I got to work.
Then, halfway through my implementation I realized I'd have to send the whole file at once to the AI regardless.
But this encouraged me to try streaming the output from the AI. So I hopped on MDN and started reading about ReadableStreams and messing around with ReadableStreamDefaultReader.read() for what felt like an hour - only to scroll down the AI provider's documentation and realize all I had to do was add stream: true to my request.
Either way, I may have taken the scenic route but I ended up implementing streaming.
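For anyone curious what consuming a streamed response looks like, here is a rough sketch. It assumes the stream is an async iterable of OpenAI-style chunks, where each chunk carries its text in choices[0].delta.content. fakeStream is a made-up stand-in for the provider's stream so the example runs offline, and printStream is a hypothetical helper, not codeshift's actual code.

```javascript
// Stand-in for the stream a chat completion returns when stream: true is set.
// Assumption: chunks follow the OpenAI-style shape { choices: [{ delta: { content } }] }.
async function* fakeStream() {
  for (const piece of ['package ', 'main', '\n']) {
    yield { choices: [{ delta: { content: piece } }] };
  }
  // A final chunk may carry no text at all.
  yield { choices: [{ delta: {} }] };
}

// Write each piece as it arrives and return the full text at the end.
async function printStream(stream, write = (s) => process.stdout.write(s)) {
  let full = '';
  for await (const chunk of stream) {
    // Optional chaining guards against chunks with no choices or no content.
    const piece = chunk.choices[0]?.delta?.content ?? '';
    write(piece);
    full += piece;
  }
  return full;
}

printStream(fakeStream());
```

Returning the accumulated text alongside the live writes keeps the door open for saving the same output to a file once the stream ends.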
Right now, the program parses each source file individually, with no shared context. So if a file references another, it wouldn't be reflected in the output. I'd like to enable it to have that context eventually. Like I mentioned, another feature I want to add is writing the AI's reasoning behind the translation to stdout but leaving it out of the output file. I'd also like to add some of the other optional features, like options to specify the AI model to use, the API key to use, and reading that data from a .env file in the same directory.
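The planned split between translated code and rationale could look something like the sketch below. It assumes the model wraps the translated code in a Markdown code fence, which is a guess about the response format rather than something codeshift relies on today. splitResponse is a hypothetical helper, and the sample response is invented.

```javascript
// Build the fence marker programmatically (three backticks).
const FENCE = '`'.repeat(3);

// Hypothetical helper: separates the first fenced block from the surrounding text.
function splitResponse(response) {
  // Match: fence, optional language tag, newline, body (lazy), closing fence.
  const pattern = new RegExp(FENCE + '[^\\n]*\\n([\\s\\S]*?)' + FENCE);
  const match = response.match(pattern);
  if (!match) {
    // No fence found: treat the whole response as code.
    return { code: response, rationale: '' };
  }
  const rationale = response.replace(match[0], '').trim();
  return { code: match[1], rationale };
}

// Invented sample response, for illustration only.
const sample = [
  'Here is the Go version.',
  FENCE + 'go',
  'package main',
  FENCE,
  'I replaced console.log with fmt.Println.',
].join('\n');

const { code, rationale } = splitResponse(sample);
// code would be written to the -o file, rationale to stdout.
console.log(rationale);
```

This only handles the first fenced block, so a response containing several code blocks would need a loop or a global match.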
That's about it for this post. I'll be writing more in the coming weeks.