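This week, I've been working on a command-line tool called codeshift, which lets users pass in source code files, choose a programming language, and translate the files into that language.
There's nothing fancy going on under the hood - it just uses an AI provider called Groq to handle the translation - but I wanted to go over the development process, how the tool is used, and what features it offers.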
Command-line tool that transforms source code files into any language.
codeshift [-o ]
For example, to translate the file examples/index.js to Go and save the output to index.go:
codeshift -o index.go go examples/index.js
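I've been working on this project as part of the Topics in Open Source Development course at Seneca Polytechnic in Toronto, Ontario. At first, I wanted to stick with the technologies I was comfortable with, but the project instructions encouraged us to learn something new, like a new programming language or a new runtime.
Although I'd been wanting to learn Java, after doing some research online it didn't seem like the best choice for developing a CLI tool or interfacing with AI models. It isn't officially supported by OpenAI, and the community library featured in their docs is deprecated.
I've always stuck with popular technologies - they tend to be reliable, with complete documentation and plenty of information available online. But this time, I decided to do things differently. I decided to use Bun, a cool new JavaScript runtime meant to replace Node.
It turns out I should have stuck with my gut. I ran into trouble trying to compile my project, and all I could do was hope the developers would fix the issue.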
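Referenced previously here, closed without resolution: https://github.com/openai/openai-node/issues/903
This is a pretty big issue, as it prevents use of the SDK alongside the latest Sentry monitoring package.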
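import * as Sentry from '@sentry/node';
import OpenAI from 'openai';

// Start Sentry
Sentry.init({
  dsn: "https://your-sentry-url",
  environment: "your-env",
  tracesSampleRate: 1.0, // Capture 100% of the transactions
});

// Client setup is not shown in the original report; by default the SDK
// reads the API key from the OPENAI_API_KEY environment variable.
const openai = new OpenAI();

// `model` and `messages` are assumed to be defined elsewhere.
const params = {
  model: model,
  stream: true,
  stream_options: {
    include_usage: true
  },
  messages
};
const completion = await openai.chat.completions.create(params);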
Results in error:
TypeError: getDefaultAgent is not a function
    at OpenAI.buildRequest (file:///my-project/node_modules/openai/core.mjs:208:66)
    at OpenAI.makeRequest (file:///my-project/node_modules/openai/core.mjs:279:44)
OS: All operating systems (macOS, Linux)
Node version: v20.10.0
Library version: v4.56.0
This turned me away from Bun. I'd found out from our professor we were going to compile an executable later in the course, and I did not want to deal with Bun's problems down the line.
So, I switched to Node. It was painful going from Bun's easy-to-use built-in APIs to having to learn how to use commander for Node. But at least it wouldn't crash.
I had previous experience working with AI models through code thanks to my co-op, but I was unfamiliar with creating a command-line tool. Configuring the options and arguments turned out to be the most time-consuming aspect of the project.
Apart from the core feature we chose for each of our projects - mine being code translation - we were asked to implement any two additional features. One of the features I chose to implement was to save output to a specified file. Currently, I'm not sure this feature is that useful, since you could just redirect the output to a file, but in the future I want to use it to extract the code from the response to the file, and include the AI's rationale behind the translation in the full response to stdout. Writing this feature also helped me learn about global and command-based options using commander.js. Since there was only one command (run) and it was the default, I wanted the option to show up in the default help menu, not when you specifically typed codeshift help run, so I had to learn to implement it as a global option.
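To make the planned "code to the file, rationale to stdout" idea concrete, here is a rough sketch. It assumes the AI wraps translated code in Markdown-style fences, which is common but not guaranteed, and `splitResponse` is a hypothetical helper name, not something that exists in codeshift today.

```javascript
// The fence marker is built at runtime so this sketch can sit inside a fenced block.
const FENCE = "`".repeat(3);

// Hypothetical helper: separates fenced code from the surrounding explanation.
// Assumes the model wraps translated code in Markdown-style fences.
function splitResponse(response) {
  const fencePattern = new RegExp(FENCE + "[^\\n]*\\n([\\s\\S]*?)" + FENCE, "g");
  const codeBlocks = [];
  // Remove each fenced block from the text and remember its body.
  const rationale = response.replace(fencePattern, (_match, body) => {
    codeBlocks.push(body);
    return "";
  });
  return { code: codeBlocks.join("\n"), rationale: rationale.trim() };
}

// Made-up response used only to exercise the helper.
const sample = [
  "Here is the Go version:",
  FENCE + "go",
  'fmt.Println("hi")',
  FENCE,
  "I used fmt.Println in place of console.log.",
].join("\n");

const { code, rationale } = splitResponse(sample);
console.log(rationale); // the explanation, which would go to stdout
console.log(code); // the translated code, which would go to the -o file
```

If the response contains no fences at all, `code` comes back empty and the whole text ends up in `rationale`, so a real implementation would need to decide how to handle that case.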
I also ended up "accidentally" implementing the feature for streaming the response to stdout. I was at first scared away from streaming, because it sounded too difficult. But later, when I was trying to read the input files, I figured reading large files in chunks would be more efficient. I realized I'd already implemented streaming in my previous C++ courses, and figuring it wouldn't be too bad, I got to work.
Then, halfway through my implementation I realized I'd have to send the whole file at once to the AI regardless.
But this encouraged me to try streaming the output from the AI. So I hopped on MDN and started reading about ReadableStreams and messing around with ReadableStreamDefaultReader.read() for what felt like an hour - only to scroll down the AI provider's documentation and realize all I had to do was add stream: true to my request.
Either way, I may have taken the scenic route but I ended up implementing streaming.
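For anyone curious what consuming a streamed response can look like, here is a minimal sketch. It assumes the provider's SDK returns an async iterable of OpenAI-style chunks (text under `choices[0].delta.content`) when `stream: true` is set. The `printStream` and `fakeStream` names are made up for illustration, and the fake stream stands in for a real API call so the example runs offline.

```javascript
// Echoes text fragments as they arrive and returns the full response.
// `stream` is any async iterable of chunks shaped like
// { choices: [{ delta: { content: "..." } }] } (an assumed, OpenAI-style shape).
async function printStream(stream, out = process.stdout) {
  let fullText = "";
  for await (const chunk of stream) {
    const piece = chunk.choices?.[0]?.delta?.content ?? "";
    if (piece) {
      out.write(piece); // show each fragment immediately
      fullText += piece;
    }
  }
  return fullText; // complete text, e.g. for saving with -o
}

// Hypothetical stand-in for the provider's stream.
async function* fakeStream() {
  yield { choices: [{ delta: { content: "package " } }] };
  yield { choices: [{ delta: { content: "main\n" } }] };
  yield { choices: [{ delta: {} }] }; // a chunk without content is skipped
}

printStream(fakeStream()).then((text) => {
  console.error(`received ${text.length} characters`);
});
```

Returning the accumulated text alongside the live output means the same loop can feed both the terminal and an output file.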
Right now, the program parses each source file individually, with no shared context. So if a file references another, it wouldn't be reflected in the output. I'd like to enable it to have that context eventually. Like I mentioned, another feature I want to add is writing the AI's reasoning behind the translation to stdout but leaving it out of the output file. I'd also like to add some of the other optional features, like options to specify the AI model to use, the API key to use, and reading that data from a .env file in the same directory.
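As a rough idea of how the model and API key options could fit together with a .env file, here is a small sketch. Everything in it is an assumption: the precedence order (CLI option, then environment variable, then .env), the `GROQ_API_KEY` and `CODESHIFT_MODEL` variable names, and the helper names are all hypothetical, and the parser only handles simple `KEY=VALUE` lines.

```javascript
// Parses simple KEY=VALUE lines; blank lines and lines starting with # are skipped.
function parseDotenv(text) {
  const vars = {};
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (!line || line.startsWith("#")) continue;
    const eq = line.indexOf("=");
    if (eq === -1) continue; // not an assignment
    const key = line.slice(0, eq).trim();
    let value = line.slice(eq + 1).trim();
    // Strip one pair of matching surrounding quotes, if present.
    if (value.length >= 2 && /^["']/.test(value) && value.endsWith(value[0])) {
      value = value.slice(1, -1);
    }
    vars[key] = value;
  }
  return vars;
}

// Assumed precedence: CLI option, then process environment, then .env file.
// GROQ_API_KEY and CODESHIFT_MODEL are illustrative variable names.
function resolveSettings(cliOptions, env, dotenvVars) {
  return {
    apiKey: cliOptions.apiKey ?? env.GROQ_API_KEY ?? dotenvVars.GROQ_API_KEY,
    model: cliOptions.model ?? env.CODESHIFT_MODEL ?? dotenvVars.CODESHIFT_MODEL,
  };
}

// Example with made-up values: the .env file supplies the key, the CLI overrides the model.
const dotenvVars = parseDotenv(
  '# local settings\nGROQ_API_KEY="abc123"\nCODESHIFT_MODEL=example-model\n'
);
const settings = resolveSettings({ model: "cli-model" }, {}, dotenvVars);
console.log(settings);
```

In the real tool the .env text would be read from the file in the current directory, and `process.env` would be passed as the second argument.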
That's about it for this post. I'll be writing more in the coming weeks.