


An open-source AI programmer is here: powered by GPT-4, comparable to Devin, 1.4k stars in a day
To learn more about AIGC, please visit:
51CTO AI.x Community
https://www.51cto.com/aigc/
Recently, many people have been worried about AI replacing their jobs.
Devin, the "first AI programmer" that became popular in AI circles last month, uses large-model capabilities to handle full-stack work. Given only natural-language instructions from a human, it can automate complex coding tasks.
The capabilities Devin demonstrated are impressive, but the startup behind it has taken a closed-source route, and for now only a few people with closed-beta access can use it.
On Tuesday, researchers from the Princeton University NLP Group released SWE-agent, an open-source AI programmer, which received thousands of GitHub stars in less than a day. SWE-agent is built on deep learning technology and aims to write efficient, reliable code automatically. The release attracted widespread attention, and many developers praised its technology and performance. The work also reflects progress in AI research in the NLP field.
SWE-agent is a new system for autonomously solving problems in GitHub repositories. It achieved similar accuracy to Devin on SWE-bench, taking an average of 93 seconds.
- Project website: https://swe-agent.com/
- GitHub: https://github.com/princeton-nlp/SWE-agent
John Yang, the author of the project, said that a preprint of the related paper will also be uploaded on April 10th.
In principle, SWE-agent can fix bugs and issues in real GitHub repositories by turning large models (such as GPT-4) into software engineering agents.
On the complete SWE-bench test set, SWE-agent solved 12.29% of the problems and achieved SOTA performance.
To automate work during development, SWE-agent interacts with a dedicated terminal in which it can open files, search their contents, run automatic syntax checks, edit specific lines, and write and execute tests.
The developers of this project carefully designed this interface and described it on GitHub.
Agent-Computer Interface (ACI)
The research team designed simple, large model (LM)-centric commands and feedback formats that make it easier for large models to browse repositories and to view, edit, and execute code files. This is known as the Agent-Computer Interface (ACI). The team also built the SWE-agent repository so that ACI designs for repository-level coding agents can be iterated on easily.
Just as language models require good prompt engineering, agents produce better results with good ACI design. A baseline agent without a well-tuned ACI performs much worse than SWE-agent.
SWE-agent contains features that the research team found to be very useful during the design of the agent-computer interface, including:
1. Add a linter that runs when an edit command is issued and won't let the edit command go through if the code syntax is incorrect.
2. Provide the agent with a purpose-built file viewer. The research team found that this file viewer works best when it displays only 100 lines per turn. The accompanying file editor has commands for scrolling up and down and for searching within the file.
3. Provide the agent with a purpose-built, directory-wide string search command. The research team found it important that the tool lists matches succinctly: it simply lists every file that has at least one match. Showing the model more context about each match turned out to be too confusing for it.
4. When the output of the command is empty, return a message: "Your command ran successfully, but did not produce any output."
The forthcoming paper will describe these features in more detail.
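The four features above can be illustrated with a minimal Python sketch. This is not SWE-agent's actual code: the function names (view_window, guarded_edit, search_dir, format_output) are hypothetical, and Python's built-in ast.parse stands in for a real linter. Only the 100-line window and the empty-output message are taken from the description above.

```python
import ast
import os

WINDOW = 100  # lines shown per turn, the size reported to work best
EMPTY_OUTPUT_MSG = "Your command ran successfully, but did not produce any output."


def view_window(lines, start, window=WINDOW):
    """Return at most `window` lines beginning at 0-based index `start`."""
    start = max(0, min(start, max(len(lines) - 1, 0)))
    return lines[start:start + window]


def guarded_edit(lines, first, last, replacement):
    """Replace 1-based inclusive lines first..last, rejecting edits that break Python syntax."""
    candidate = lines[:first - 1] + replacement + lines[last:]
    try:
        ast.parse("\n".join(candidate))  # assumption: a syntax parse stands in for the linter
    except SyntaxError as err:
        # The original lines are returned unchanged, so the bad edit never lands.
        return lines, f"Edit rejected: {err.msg} (line {err.lineno})"
    return candidate, "Edit applied."


def search_dir(root, needle):
    """List only the files under `root` that contain `needle`, with no per-match context."""
    hits = []
    for folder, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(folder, name)
            try:
                with open(path, encoding="utf-8") as fh:
                    if needle in fh.read():
                        hits.append(os.path.relpath(path, root))
            except (UnicodeDecodeError, OSError):
                continue  # skip binary or unreadable files
    return sorted(hits)


def format_output(output):
    """Give the model an explicit message when a command prints nothing."""
    return output if output.strip() else EMPTY_OUTPUT_MSG
```

In this sketch, a rejected edit leaves the file untouched and tells the model why, which mirrors the idea of not letting a syntactically invalid edit go through.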
Installation and use
To use SWE-agent, first complete the following setup:
1. Install Docker and start it locally;
2. Install Miniconda and run conda env create -f environment.yml to create the swe-agent environment;
3. Run conda activate swe-agent to activate the environment;
4. Run ./setup.sh to create the swe-agent docker image;
5. Create a keys.cfg file in the root directory of this repository and fill in the following content:
OPENAI_API_KEY: 'OpenAI API Key Here if using OpenAI Model (optional)'
ANTHROPIC_API_KEY: 'Anthropic API Key Here if using Anthropic Model (optional)'
GITHUB_TOKEN: 'GitHub Token Here (required)'
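Before launching a run, it can help to check that keys.cfg is well formed and that the required token is present. The sketch below assumes the simple NAME: 'value' layout shown above; parse_keys_cfg and missing_required are hypothetical helpers for illustration, not how SWE-agent itself reads the file.

```python
def parse_keys_cfg(text):
    """Parse `NAME: 'value'` lines into a dict, ignoring blank lines."""
    keys = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        name, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"Malformed line: {raw!r}")
        # Drop surrounding whitespace and the quotes around the value.
        keys[name.strip()] = value.strip().strip("'\"")
    return keys


def missing_required(keys, required=("GITHUB_TOKEN",)):
    """Return the required key names that are absent or empty."""
    return [name for name in required if not keys.get(name)]
```

For example, calling missing_required(parse_keys_cfg(open("keys.cfg").read())) returns an empty list when the GitHub token is set.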
The SWE-agent pipeline consists of two steps:
- Step 1: SWE-agent receives the input GitHub issue and returns a pull request to try to fix it;
- Step 2: Evaluate the pull request to verify that it actually solves the issue (currently only available for issues in the SWE-bench benchmark).
To run SWE-agent on a single GitHub issue, pass the issue URL as --data_path:
python run.py --model_name gpt4 \
  --data_path https://github.com/pvlib/pvlib-python/issues/1603 \
  --config_file config/default_from_url.yaml
If you want to run and evaluate on the entire SWE-bench, the easiest way is to use an x86 machine:
python run.py --model_name gpt4 \
  --per_instance_cost_limit 2.00 \
  --config_file ./config/default.yaml
If you want to run a single SWE-bench instance, you can use --instance_filter:
python run.py --model_name gpt4 \
  --instance_filter marshmallow-code__marshmallow-1359
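When scripting several runs, the command lines above can be assembled programmatically. The following is a small sketch that uses only the flags shown in this article; build_run_command is a hypothetical helper and is not part of SWE-agent.

```python
import shlex


def build_run_command(model_name, config_file=None, data_path=None,
                      instance_filter=None, per_instance_cost_limit=None):
    """Build an argv list for run.py from the flags shown in this article."""
    argv = ["python", "run.py", "--model_name", model_name]
    if data_path is not None:
        argv += ["--data_path", data_path]
    if per_instance_cost_limit is not None:
        argv += ["--per_instance_cost_limit", f"{per_instance_cost_limit:.2f}"]
    if instance_filter is not None:
        argv += ["--instance_filter", instance_filter]
    if config_file is not None:
        argv += ["--config_file", config_file]
    return argv


# Print the single-instance command in a shell-safe form.
print(shlex.join(build_run_command(
    "gpt4", instance_filter="marshmallow-code__marshmallow-1359")))
```

The returned list can be passed directly to subprocess.run, which avoids shell quoting problems with URLs and instance names.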