


Failing a high school math test is a nightmare for many people.
Would it be even harder to accept that your high school math scores are worse than an AI's?
Well, OpenAI's Codex has achieved an accuracy of 81.1% on questions from seven advanced mathematics courses at MIT, a respectable level for an MIT undergraduate.
The courses range from elementary calculus to differential equations, probability theory, and linear algebra, and besides calculations the questions also require plotting.
This has recently been trending on Weibo.
△ "Only" scored 81 points, and the expectations for AI are too high
Now the latest big news comes from Google:
Not only in mathematics: our AI has achieved the highest scores across science and engineering subjects!
It seems that the technology giants have reached a new level in cultivating "AI problem solvers".
Google's latest AI problem solver took four exams.
On the math-competition benchmark MATH, only a three-time IMO gold medalist has scored above 90 in the past, while a typical computer science PhD manages only about 40 points.
As for previous AI systems, the best score was a mere 6.9 points...
But this time, Google's new AI scored 50 points, beating the CS PhDs.
The comprehensive exam MMLU-STEM covers mathematics, physics, chemistry, biology, electrical engineering and computer science, with questions at high-school to college level.
Here, too, the full-size version of Google's AI got the highest score of any model, pushing the state of the art up by about 20 points.
On GSM8K, a benchmark of grade-school math word problems, it lifted the score to 78 points; by comparison, GPT-3 did not even pass (only 55 points).
Even on MIT undergraduate and graduate courses such as solid-state chemistry, astronomy, differential equations, and special relativity, Google's new AI correctly answers nearly a third of the more than 200 questions.
Most importantly, unlike OpenAI's approach of scoring well in math by leaning on "programming skills", Google's AI this time takes the route of "thinking like a human":
it is like a liberal-arts student who only reads and never drills problems, yet has somehow picked up better problem-solving skills in science and engineering.
It is worth mentioning that Lewkowycz, the first author of the paper, also shared a highlight that was not written in the paper:
Our model took this year's Polish national mathematics matriculation exam and scored above the national average.
Seeing this, some parents can no longer sit still.
If I tell my daughter about this, I'm afraid she'll use AI to do her homework. But if I don't tell her, I'm not preparing her for the future!
In the eyes of industry insiders, reaching this level with a language model alone, without hard-coding arithmetic, logic or algebra, is the most amazing part of this research.
So, how is this done?
AI reads 2 million papers on arXiv
The new model, Minerva, is based on PaLM, Google's general language model built on the Pathways architecture.
It is trained further on top of the 8-billion, 62-billion and 540-billion-parameter PaLM models respectively.
Minerva’s approach to answering questions is completely different from Codex’s.
Codex’s method is to rewrite each math problem into a programming problem, and then solve it by writing code.
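For instance, a question like "what is the sum of the integers from 1 to 100?" would be turned into something along these lines (an illustrative sketch of the general idea, not Codex's actual output):

```python
# Illustrative only: a Codex-style rewrite of
# "What is the sum of the integers from 1 to 100?" as a small program.
def solve() -> int:
    return sum(range(1, 101))

print(solve())  # 5050
```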
Minerva, on the other hand, read papers voraciously and trained itself to understand mathematical notation the same way it understands natural language.
Training continues from PaLM on a new dataset with three parts: roughly 2 million academic papers collected from arXiv, 60 GB of web pages containing LaTeX formulas, and a small amount of the text used in the original PaLM training phase.
The usual NLP data-cleaning pipeline deletes all symbols and keeps only plain text, leaving formulas incomplete; Einstein's famous mass-energy equation E = mc², for example, would be reduced to "Emc2".
This time, however, Google kept all the formulas and fed them through the Transformer training pipeline just like plain text, letting the AI understand symbols the way it understands language.
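As a rough illustration of the difference (a hypothetical cleaning step, not the paper's actual pipeline):

```python
import re

raw = r"Einstein's relation $E = mc^2$ links energy and mass."

def naive_clean(text: str) -> str:
    # Typical aggressive cleaning: keep only letters, digits and spaces,
    # which collapses the formula into a fragment like "E mc2".
    return re.sub(r"[^A-Za-z0-9 ]+", "", text)

def math_aware_clean(text: str) -> str:
    # Keep the LaTeX source verbatim so the model sees the whole formula;
    # only normalize whitespace.
    return re.sub(r"\s+", " ", text).strip()

print(naive_clean(raw))       # Einsteins relation E  mc2 links energy and mass
print(math_aware_clean(raw))  # Einstein's relation $E = mc^2$ links energy and mass.
```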
Compared with previous language models, this is one of the reasons why Minerva performs better on mathematical problems.
But compared with AI that specializes in doing math problems, Minerva does not have an explicit underlying mathematical structure in its training, which brings a disadvantage and an advantage.
The disadvantage is that the AI may use wrong steps to get the correct answer.
The advantage is that it can be adapted to different disciplines. Even if some problems cannot be expressed in formal mathematical language, they can be solved by combining natural language understanding capabilities.
At inference time, Minerva also combines several techniques recently developed at Google.
The first is chain-of-thought prompting, proposed by the Google Brain team in January this year.
Concretely, the prompt includes an example question with a step-by-step worked answer; the AI then imitates that reasoning process and can correctly answer questions it would otherwise get wrong.
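Roughly, the prompt is assembled like this (a minimal sketch with made-up questions and no real model call, not the prompts from the paper):

```python
# One worked example whose reasoning is spelled out step by step,
# followed by the new question the model should answer in the same style.
worked_example = """Question: Natalia sold 48 clips in April and half as many in May.
How many clips did she sell in total?
Answer: In May she sold 48 / 2 = 24 clips.
In total she sold 48 + 24 = 72 clips.
The final answer is 72."""

new_question = "Question: A train travels at 60 km/h for 2.5 hours. How far does it go?"

prompt = worked_example + "\n\n" + new_question + "\nAnswer:"

# The model is expected to imitate the step-by-step style before its answer,
# e.g. "The train covers 60 * 2.5 = 150 km. The final answer is 150."
```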
Then there is the Scratchpad method, developed jointly by Google and MIT, which lets the AI write down and temporarily store the intermediate results of step-by-step calculations.
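The idea can be pictured as training or prompting the model to emit its intermediate working explicitly before the final answer; the format below is an illustrative sketch, not the exact one from the Scratchpads paper:

```python
# Illustrative scratchpad-style target for multi-digit addition: the model
# writes out partial results and carries before committing to an answer.
scratchpad_target = """Input: 29 + 57
Scratchpad:
9 + 7 = 16, write 6, carry 1
2 + 5 + 1 = 8, write 8
Final answer: 86"""
```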
Finally, there is the Majority Voting method, which was only released in March this year.
Let AI answer the same question multiple times and choose the answer that appears most frequently.
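A minimal sketch of the voting step, assuming the candidate answers have already been sampled (the answer list here is made up):

```python
from collections import Counter

# Final answers extracted from several independent samples of the same question.
sampled_answers = ["72", "72", "68", "72", "70"]

# Majority voting: keep the answer that occurs most often across the samples.
final_answer, votes = Counter(sampled_answers).most_common(1)[0]
print(final_answer, votes)  # 72 3
```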
After all these techniques are used, Minerva with 540 billion parameters reaches SOTA in various test sets.
Even the 8 billion parameter version of Minerva can reach the level of the latest updated davinci-002 version of GPT-3 in competition-level mathematics problems and MIT open course problems.
Having said so much, what specific questions can Minerva solve?
Google has also opened up a sample set, let’s take a look.
An all-rounder in math, physics and chemistry, and it even does machine learning
In mathematics, Minerva works out values step by step like a human instead of brute-forcing the answer.
For word problems, it can set up the equations itself and simplify them.
It can even work through proofs.
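To give a flavour of what "setting up the equations and simplifying" looks like, here is a hypothetical word problem worked in the same spirit (not one of Google's released samples):

```latex
% One pipe fills a tank in 3 hours, another in 6 hours.
% How long do both pipes take together?
\begin{align*}
\frac{1}{t} &= \frac{1}{3} + \frac{1}{6} = \frac{1}{2}\\
t &= 2 \text{ hours}
\end{align*}
```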
In physics, Minerva can solve university-level questions such as finding the total spin quantum number of electrons in the neutral nitrogen ground state (Z = 7).
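For reference, this particular question follows from the electron configuration (a standard textbook derivation, not Minerva's output):

```latex
% Neutral nitrogen, Z = 7: configuration 1s^2\,2s^2\,2p^3.
% The filled 1s and 2s shells contribute no net spin; by Hund's rule the
% three 2p electrons sit in separate orbitals with parallel spins, so
S = 3 \times \tfrac{1}{2} = \tfrac{3}{2}
```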
In biology and chemistry, Minerva can also answer various multiple-choice questions with its language understanding ability.
Which of the following point mutation forms does not have a negative impact on the protein formed from the DNA sequence?
Which of the following is a radioactive element?
And astronomy: Why does the Earth have a strong magnetic field?
In machine learning, it correctly explains the specific meaning of "out-of-distribution sample detection" and gives an alternative name for the term.
...
However, Minerva still makes some silly mistakes, such as incorrectly cancelling the square roots on both sides of an equation.
In addition, in about 8% of cases Minerva produces "false positives", where the reasoning is wrong but the final result happens to be correct.
On analysis, the team found that most errors are calculation errors and reasoning errors; only a small fraction come from misreading the question or using incorrect facts in a step.
The calculation errors could be fixed fairly easily by hooking up an external calculator or a Python interpreter, but the other kinds of error are hard to tune away because the neural network is so large.
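A minimal sketch of what "hooking up an external calculator" could look like: scan the model's step for simple "a op b = c" claims and recompute them with trusted code (the pattern and the model output below are hypothetical):

```python
import operator
import re

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul, "/": operator.truediv}

def recheck(step: str) -> str:
    """Recompute 'a op b = c' claims in a model step and correct any that are wrong."""
    def fix(m: re.Match) -> str:
        a, op, b = int(m.group(1)), m.group(2), int(m.group(3))
        return f"{a} {op} {b} = {OPS[op](a, b)}"
    return re.sub(r"(\d+)\s*([+\-*/])\s*(\d+)\s*=\s*\d+", fix, step)

# Hypothetical model step with an arithmetic slip: 17 * 24 is 408, not 412.
print(recheck("so the total is 17 * 24 = 412"))  # so the total is 17 * 24 = 408
```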
Overall, Minerva's performance has surprised many people, and commenters are asking for an API (unfortunately, Google has no plans to release one yet).
Some netizens wondered whether, combined with the "coaxing" prompt trick that boosted GPT-3's problem-solving accuracy by 61% just a few days ago, its accuracy could be pushed even higher.
The author's response, however, is that the coaxing trick is zero-shot; however strong it is, it may not beat few-shot prompting with 4 worked examples.
Some netizens also asked, since it can do questions, can it be used in reverse?
In fact, MIT has teamed up with OpenAI to use AI to set questions for college students.
They mixed human-written and AI-generated questions and surveyed students, who found it hard to tell which questions had been written by the AI.
In short, the situation right now is that AI researchers are busy reading the paper,
students look forward to one day using AI to do their homework,
and teachers look forward to the day they can use AI to write the exams.
Paper address: https://storage.googleapis.com/minerva-paper/minerva_paper.pdf
Demo address: https://minerva-demo.github.io/
Related papers:
Chain of Thought: https://arxiv.org/abs/2201.11903
Scratchpads: https://arxiv.org/abs/2112.00114
Majority Voting: https://arxiv.org/abs/2203.11171
Reference link:
https://ai.googleblog.com/2022/06/minerva-solving-quantitative-reasoning.html
https://twitter.com/bneyshabur/status/1542563148334596098
https://twitter.com/alewkowycz/status/1542559176483823622