OpenAI o1 and o1-mini arrive as AIs that handle STEM questions better than prior models

Sep 19, 2024 am 03:22 AM

OpenAI o1 and o1-mini have arrived. By taking more time to think before answering, these large language models (LLMs) perform much better on coding, math, and science problems than prior models such as GPT-4o.

Complex problems in STEM tend to require more than a quick online search to answer correctly. Given more time to think, the o1 AI can reason more carefully and accurately. The o1-mini model has been specifically tuned to answer STEM questions faster and with lower demand on computing resources, and it is notably better at coding than the o1 model.

Across a range of standardized AP exams and STEM tests for LLMs, the o1 models perform with high accuracy. Specifically, on the AP Calculus, AP Chemistry, AP Physics 2, LSAT, and SAT evidence-based reading & writing tests, the o1 models perform at or above the B-grade level (~80% or higher). The models answer at the A-grade level on PhD-level physics questions, at the B-grade level on tough 2024 American Invitational Mathematics Examination (AIME) questions, and at the high B-grade level on Codeforces coding problems. Because o1 has been tuned for answering STEM questions, its performance on AP English Language and AP English Literature is at or below the C-grade level.

Interestingly, while GPT-4o is dumbfounded by the cryptographic challenge of decoding “oyekaijzdf aaptcg suaokybhai ouow aqht mynznvaatzacdfoulxxz” when given the hint that “oyfjdnisdr rtqwainr acxz mynzbhhx” means “Think step by step”, o1 has no trouble thinking through the problem and arriving at the correct answer, “There are three r’s in strawberry”. This new power will delight hobby cryptographers at home as well as the NSA.
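The article does not spell out the cipher, but the hint pair is consistent with a simple scheme: each plaintext letter is encoded as two letters whose alphabet positions (a=1 ... z=26) average to the position of the plaintext letter. The sketch below assumes that scheme. The function name `decode_pairs` is made up for illustration and is not something OpenAI published.

```python
def decode_pairs(ciphertext: str) -> str:
    """Decode text where every two cipher letters average to one plain letter.

    Assumed scheme (inferred from the hint pair, not stated in the article):
    alphabet positions a=1 ... z=26; plain = (first + second) / 2.
    """
    decoded_words = []
    for word in ciphertext.lower().split():
        if len(word) % 2 != 0:
            raise ValueError(f"word {word!r} does not split into letter pairs")
        letters = []
        for i in range(0, len(word), 2):
            first = ord(word[i]) - ord("a") + 1
            second = ord(word[i + 1]) - ord("a") + 1
            # Integer division is exact whenever the pair was built by this scheme.
            letters.append(chr((first + second) // 2 + ord("a") - 1))
        decoded_words.append("".join(letters))
    return " ".join(decoded_words)


# Check the scheme against the hint, then apply it to the challenge text.
print(decode_pairs("oyfjdnisdr rtqwainr acxz mynzbhhx"))
# think step by step
print(decode_pairs("oyekaijzdf aaptcg suaokybhai ouow aqht mynznvaatzacdfoulxxz"))
# there are three rs in strawberry
```

The decoder returns lowercase letters only, so the capitalization and apostrophe in “There are three r’s in strawberry” have to be restored by the reader. The hard part for a model is inferring the pairing rule from a single example, not running the arithmetic once the rule is known.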

Closet evil-doers will want to know that while the uncensored o1 models are apt to give troubling replies, OpenAI has neutered these models for release. The o1 models have been tested to resist requests to explain how to make bioweapons, produce naughty images, jailbreak themselves, or harass and threaten people. Unfortunately, the OpenAI o1 models still show gender and race bias when tested, despite tuning efforts.

ChatGPT Plus and Team users, along with API usage tier 5 developers, have access to the o1 models immediately, and ChatGPT Edu and Enterprise users will gain access in the week of September 16. ChatGPT Free users will gain access to o1-mini in the near future. The o1 models cannot browse the web or accept uploaded files and images, so OpenAI recommends that users continue using its GPT-4o models for general questions.

Users who want to ask AI questions now have a wide range of capable LLMs to interact with besides those from OpenAI, including Anthropic Claude, Microsoft Copilot, Google Gemini, and xAI Grok. Every AI has specific advantages, so it is worth testing several models to find the one that best suits individual needs. Some of these AIs are built into smart glasses and voice recorders, and some upcoming autonomous humanoid robots use proprietary AI to cook and clean.


