The future of programmers belongs to 'pseudocode'! Nature column: Three ways to use ChatGPT to accelerate scientific research programming

The emergence of chatbots based on generative artificial intelligence, such as ChatGPT and Bard, and the question of how to use AI tools in academic research have caused huge controversy. At the same time, the value of AI-generated code for scientific research has been overlooked.

Compared with the plagiarism problems raised by ChatGPT-generated text, using AI to reuse code is clearly less controversial. Open science even encourages "code sharing" and "code reuse", and the source of code is easy to trace. For example, using "import" in Python to load a dependency is effectively a citation.

Recently, an article was published in Nature in which the author team discussed three potential capabilities of ChatGPT in scientific programming: brainstorming, decomposing complex tasks, and handling simple but time-consuming tasks.


Article link: https://www.nature.com/articles/s41559-023-02063-3

Researchers explored the capabilities and limitations of using generative AI to enhance scientific coding by using ChatGPT to translate natural language into computer-readable code.

The examples in the experiment mainly covered common tasks relevant to ecology, evolution and related fields. The researchers found that ChatGPT could complete 80%-90% of the coding tasks.

ChatGPT can generate very useful code if the task is broken down into small, manageable chunks and the queries are written as precise prompts.

It is worth noting that conducting the same experiment with Google's Bard will usually yield similar results, but with more errors in the code, so this article mainly uses ChatGPT for experiments.

The first author, Cory Merow, is a quantitative ecologist whose main research direction is to build mechanism models to predict the responses of populations and communities to environmental changes. Even the best datasets are imperfect at predicting responses to global change, so tools need to be developed to combine data sources and explore datasets to gain insights into possible changes in biological systems.

ChatGPT helps scientific coding

ChatGPT is based on the autoregressive language model GPT-3, which was trained on a massive collection of web pages, books and other text. It generates text without performing a search.

So ChatGPT is better at interpolating (predicting text that is similar to the training data), but not good at extrapolating (predicting new text that is different from the training sample).

The sheer size of the training set is an advantage and means that GPT-3 has seen a large number of language patterns, allowing it to interpolate and increase the likelihood of generating replies that are useful to humans.

But for code generation tasks, GPT-3 does not know how to program. It only knows what code looks like and which words are most likely to come next. It works much like autocomplete: it predicts the next chunk of text based on a probability model. A chunk is usually smaller than a word and is also called a token.

The probability of generating a correct output is the product of the probabilities of all its tokens. Increasing the number of predicted tokens, or reducing the certainty of each selected token, makes the task harder and lowers the probability that the whole output is correct.

Therefore, if you want to increase the probability of the correct token, you need to shorten the length of the generation task or provide more specific instructions.
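A rough worked example of this effect follows. It assumes, purely for illustration, that every token is chosen independently and with the same probability of being correct. Real models are not this uniform, and the function name and the 99% figure are hypothetical.

```python
import math


def sequence_success_probability(token_probs):
    """Probability that every token in a generated sequence is correct,
    under the simplifying assumption that tokens are independent."""
    return math.prod(token_probs)


# Hypothetical per-token accuracy of 99%, chosen only for illustration.
short_script = [0.99] * 50    # a 50-token snippet
long_script = [0.99] * 500    # a 500-token script

print(f"50 tokens:  {sequence_success_probability(short_script):.3f}")
print(f"500 tokens: {sequence_success_probability(long_script):.3f}")
```

Under these assumptions the short snippet is fully correct about 60% of the time, while the long script is fully correct less than 1% of the time. That is the arithmetic behind asking for shorter pieces of code.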

Finally, the researcher reminds that some of the text generated by ChatGPT looks like code, but may not be executable, so careful observation and debugging are required during the coding process.

Brainstorming tool

ChatGPT can be a great tool for finding multiple data sources. For example, in ecology, it can point to sources of plant traits, species distribution ranges, and meteorological data at the same time.

Although some of the data provided by ChatGPT is incorrect, these errors can be quickly corrected through the links it provides.

However, ChatGPT cannot write a working crawler to download data from these websites. This may be because R packages and the underlying application programming interfaces (such as the protocols R uses to access databases) are updated too quickly. After all, ChatGPT's training data was assembled in 2021.

ChatGPT can propose various statistical techniques when encountering specific problems. In subsequent questions, it can generate more guidance based on user assumptions and provide an initial code.

However, the synthesis process is only suitable for proposing and communicating ideas, and fact-checking through traditional data sources (such as papers, etc.) is still required.

It should be noted that some websites claim that ChatGPT can write summaries of books. However, in the researchers' tests the resulting summaries were completely wrong, possibly because the books used for testing do not appear in the GPT-3 training set.

Harder tasks require more debugging

ChatGPT is very good at generating boilerplate code, producing short scripts that contain a small number of functions when given specific instructions.

For example, the researchers asked ChatGPT to string together the inputs and outputs of four commonly used functions and to provide sample code that applies the result to simulated data.

The results generated by ChatGPT were almost perfect, and debugging the code took only a few minutes. However, the prompt has to state the request very specifically, including the names to use and the functions to call.


Researchers found that the key to success is:

1. Decompose complex tasks into multiple subtasks, each ideally requiring only a few steps to complete. After all, the code generated by ChatGPT is the output of a probabilistic text-prediction model.

2. ChatGPT performs best when using existing functions, because it only involves interpolation rather than extrapolation.

For example, writing code that uses regular expressions (regex) to extract information from text is very difficult for many developers. But many websites provide large numbers of regex examples online, and these probably appear in ChatGPT's training data, so ChatGPT writes regular expressions fairly well.

3. One of the biggest criticisms of ChatGPT by academics is the lack of transparency in its information sources.

For code generation tasks, a certain degree of transparency can be achieved by specifying a "namespace", that is, explicitly calling the package name when using the function.

However, ChatGPT may directly copy an individual’s public code without citing it, and researchers are still responsible for verifying the correct code attribution.
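Points 2 and 3 can be illustrated together with a minimal Python sketch: a regular expression applied through a well-documented standard-library function, called with its module name so that the origin of the function is visible. The sample text, the pattern, and the function name `extract_years` are made up for illustration and do not come from the paper.

```python
import re  # standard-library regular expression module


def extract_years(text):
    """Return every four-digit year (1900-2099) found in `text`, as integers."""
    # \b marks word boundaries, so longer digit runs such as "120210" are skipped.
    pattern = r"\b(?:19|20)\d{2}\b"
    # Writing re.findall with the module prefix makes the source of the
    # function explicit, which is the "namespace" idea described above.
    return [int(match) for match in re.findall(pattern, text)]


sample = "Specimens were collected in 1998 and 2021 at site 120210."
print(extract_years(sample))
```

Because the call is written as `re.findall` rather than a bare `findall`, a reader can tell at a glance which package to consult when checking the generated code.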

At the same time, if ChatGPT is asked to generate longer scripts, some of its flaws are exposed, such as invented function names or parameters. This is why Stack Overflow bans code generated by ChatGPT.

But if the user provides a clear set of execution steps, ChatGPT can still generate a useful workflow template that defines how the inputs and outputs of the steps connect. This may be the most useful way to use GPT-3 to generate new code by extrapolation.
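A minimal sketch of what such a workflow template might look like, assuming a user who supplies four explicit steps. The step names, the simulated data, and the summary statistic are hypothetical placeholders, not the example used in the paper.

```python
import random
import statistics


# Step 1: simulate data (a stand-in for reading a real dataset).
def simulate_measurements(n, seed=0):
    rng = random.Random(seed)
    return [rng.gauss(10.0, 2.0) for _ in range(n)]


# Step 2: clean the data by dropping values outside a plausible range.
def drop_out_of_range(values, low, high):
    return [v for v in values if low <= v <= high]


# Step 3: summarise the cleaned values.
def summarise(values):
    return {"n": len(values), "mean": statistics.mean(values)}


# Step 4: format the summary for a report.
def format_report(summary):
    return f"n={summary['n']}, mean={summary['mean']:.2f}"


# The template: each step's output is the next step's input.
def run_workflow(n=100):
    raw = simulate_measurements(n)
    cleaned = drop_out_of_range(raw, low=4.0, high=16.0)
    summary = summarise(cleaned)
    return format_report(summary)


print(run_workflow())
```

Each function is short enough to check by hand, and the body of each step can be replaced without changing how the steps connect.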

Currently ChatGPT cannot convert pseudocode (algorithm steps described in simple language) into perfect computer executable code, but this may not be far from reality.

ChatGPT is particularly helpful for beginners and for working in unfamiliar programming languages, because beginners tend to write shorter scripts, which are easier to debug.

ChatGPT is better at non-creative tasks

ChatGPT is best at time-consuming, formulaic tasks such as debugging, detecting, and explaining errors in code.

ChatGPT is also very effective at writing function documentation. For example, when using roxygen2's inline documentation syntax, it is very efficient at identifying all parameters and classes, but it rarely explains how to use the function.

A key limitation is that ChatGPT's output is limited to approximately 500 words, so it can only focus on generating smaller blocks of code. It can also generate unit tests to automatically confirm that code works.

Most of the advice given by ChatGPT is helpful in defining the structure of the test and checking the expected object classes.
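As an illustration of the kind of test scaffold described here, the sketch below documents a small function and checks the class of the object it returns. It uses a Python docstring and the standard `unittest` module rather than roxygen2, and the function `species_richness` is a hypothetical example, not code from the paper.

```python
import unittest


def species_richness(observations):
    """Count distinct species in a list of observation records.

    Args:
        observations: list of species names (str), one per sighting.

    Returns:
        int: the number of unique species names.
    """
    return len(set(observations))


class TestSpeciesRichness(unittest.TestCase):
    def test_returns_int(self):
        # Check the expected object class, not just the value.
        self.assertIsInstance(species_richness(["oak", "elm"]), int)

    def test_counts_unique_names(self):
        self.assertEqual(species_richness(["oak", "elm", "oak"]), 2)

    def test_empty_input(self):
        self.assertEqual(species_richness([]), 0)


# Run the tests programmatically so the script keeps going afterwards.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestSpeciesRichness)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Tests like these confirm structure and types. Whether the function computes the scientifically intended quantity still has to be judged by the researcher.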

Finally, ChatGPT is very effective at reformatting code to follow standardized (e.g. Google) code styles.

The future belongs to pseudocode

ChatGPT and other AI-driven natural language processing tools are ready to automate simple tasks for developers, such as writing short functions, syntax debugging, annotation and formatting. How far this extends to more complex tasks depends on the user's willingness to debug (and their proficiency).

The researchers summarized the capabilities of ChatGPT in code generation, which can simplify the process of writing code in science. However, manual inspection is still necessary: code that runs does not necessarily perform its intended task, so unit tests or informal interactive tests remain critical.


Ensuring correct code attribution also matters in cases where a solution may have been developed by a person and simply reproduced by ChatGPT.

Already, there are chatbots that are starting to automatically provide links to their sources (e.g., Microsoft’s Bing), although this step is still in its infancy.

ChatGPT offers an alternative way to learn coding skills compared to traditional methods, easing the hurdles of the initial task of writing by converting pseudocode directly into code.

The researchers suspect that future advances will let tools like ChatGPT debug code automatically, iteratively generating, running and proposing new code based on the errors encountered. In their experiments, however, the researchers found that ChatGPT's ability to correct code was limited. It succeeded only occasionally, when very specific instructions were aimed at small blocks of code, and the process was far less efficient than manual debugging.
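The generate, run and propose loop the researchers anticipate could be sketched roughly as follows. This is only an assumption about how such a loop might be wired together: `propose_fix` is a hypothetical callback standing in for a chatbot, and the toy fixer below knows how to repair exactly one typo.

```python
import traceback


def run_with_repairs(source, propose_fix, max_attempts=3):
    """Run `source`; on failure, ask `propose_fix` for a new version and retry.

    `propose_fix(source, error_text)` is a hypothetical callback standing in
    for a chatbot; it must return replacement source code as a string.
    """
    for attempt in range(1, max_attempts + 1):
        namespace = {}
        try:
            exec(source, namespace)  # only run code you trust
            return namespace, attempt
        except Exception:
            # Feed the full error text back to the fixer, as a user would.
            error_text = traceback.format_exc()
            source = propose_fix(source, error_text)
    raise RuntimeError(f"still failing after {max_attempts} attempts")


# Stand-in "chatbot" that only knows how to fix one specific typo.
def toy_fixer(source, error_text):
    if "NameError" in error_text:
        return source.replace("totl", "total")
    return source


broken = "total = sum([1, 2, 3])\nresult = totl / 3\n"
namespace, attempts = run_with_repairs(broken, toy_fixer)
print(namespace["result"], attempts)
```

The cap on attempts reflects the limitation reported above: if the proposed fixes do not converge quickly, a person has to take over.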

The researchers suspect that automated debugging will improve as technology advances (such as the recently released GPT-4 model, which is said to be 10 times larger than the GPT-3 model).

The future is almost here, and now is the time for developers to learn prompt engineering skills to take advantage of emerging AI tools. The researchers predict that using artificial intelligence to generate code will become an increasingly valuable skill in all aspects of software development, including the software that is fundamental to scientific discovery and understanding.


Statement
This article is reproduced from 51CTO.COM.