ChatGPT vs Google Bard: Which one is better? The test results will tell you!

In today's world of generative AI chatbots, we have witnessed the sudden rise of ChatGPT (launched by OpenAI in November 2022), followed by Bing Chat in February this year and Google Bard in March. We decided to put these chatbots through a variety of tasks to determine which one dominates the AI chatbot space. Since Bing Chat uses GPT-4 technology, which is similar to the latest ChatGPT model, our focus this time is on the two giants of AI chatbot technology: OpenAI and Google.

We tested ChatGPT and Bard in seven key categories: bad jokes, debate conversations, math word problems, summarizing, fact retrieval, creative writing, and coding. For each test, we fed the exact same instruction (called a "prompt") into ChatGPT (using GPT-4) and Google Bard, and compared the first result each one gave.

It's worth noting that a version of ChatGPT based on the earlier GPT-3.5 model is also available, but we did not use that version in our testing. Since we used only GPT-4, we refer to ChatGPT as "ChatGPT-4" in this article to avoid confusion.

Obviously, this is not a scientific study, just an interesting comparison of chatbot capabilities. Due to random elements, the output may differ between sessions, and further evaluation using different prompts will produce different results. Additionally, the capabilities of these models will change rapidly over time as Google and OpenAI continue to upgrade them. But for now, here's how things compare in early April 2023.

Bad Jokes

To heat up our battle of wits, we asked ChatGPT and Bard to write some jokes. Since the essence of comedy is often found in bad jokes, we wanted to see if these two chatbots could come up with some unique jokes.

Instructions/Prompts: Write 5 original bad jokes


Of the five bad jokes Bard produced, we found three with a Google search. Of the other two, one was partially borrowed from a Mitch Hedberg joke posted on Twitter, but it was just unfunny wordplay that didn't land. Surprisingly, there was one seemingly original joke (about a snail) that we couldn't find anywhere else, but sadly it was just as unfunny.

Meanwhile, ChatGPT-4's five bad jokes were 100 percent unoriginal, lifted wholesale from other sources, though they were delivered accurately. Bard seems to have an edge over ChatGPT-4 here because it tried to create original jokes (as our prompt instructed), even though some of them failed in an embarrassing way (but that's just how bad jokes are), even getting things wrong in an unintentional way (which also suits a bad joke).

Winner: Bard

Debate Conversation

One way to test a modern AI chatbot is to have it act as a debater on a topic. In this context, we present Bard and ChatGPT-4 with one of the most critical topics of our time: PowerPC vs. Intel.

Instructions/Prompts: Write 5 lines of debate dialogue between PowerPC processor enthusiasts and Intel processor enthusiasts.


First, let's take a look at Bard's reply. The five-line dialogue it generated wasn't particularly in-depth, and it didn't mention any technical details specific to PowerPC or Intel chips beyond general insults. Furthermore, the conversation ended with the "Intel fan" agreeing that the two sides simply had different opinions, which seems highly unrealistic for a subject that has inspired a million spats.

In contrast, ChatGPT-4's response mentioned PowerPC chips being used in Apple Macintosh computers and threw around terms like "Intel's x86 architecture" and PowerPC's "RISC-based architecture". It even mentioned the Pentium III, a realistic detail from 2000. Overall, this discussion is much more detailed than Bard's response and, most realistically, the conversation does not reach a conclusion, suggesting that in some corners of the Internet this never-ending battle may still be raging.

Winner: ChatGPT-4

Math Word Problems

Traditionally, math questions are not the strong point of large language models (LLMs) such as ChatGPT. So instead of giving each chatbot a complex series of equations and arithmetic, we gave each one an old-school word problem.

Instructions/Prompts: If Microsoft Windows 11 shipped on 3.5-inch floppy disks, how many floppy disks would it take?


To solve this problem, each AI model needs to know the data size of the Microsoft Windows 11 installation and the data capacity of the 3.5-inch floppy disk. They must also make assumptions about what density of floppy disk the questioner is most likely to use. They then need to do some basic math to put the concepts together.

In our evaluation, Bard got these three key points right (or close enough; Windows 11 installation size estimates are typically around 20-30GB) but failed miserably at the math. It concluded that "15.11" floppy disks are needed, then said that this is "just a theoretical number", and finally admitted that more than 15 floppy disks would be needed. That is still nowhere near the correct value.

In contrast, ChatGPT-4 included some nuance about the Windows 11 installation size (correctly citing the 64GB minimum and comparing it to real-world base installation sizes), correctly interpreted the floppy disk capacity, and then did some correct multiplication and division that ended up with 14,222 disks. Some may argue over whether 1GB is 1,024MB or 1,000MB, but the number is reasonable. It also correctly mentioned that the actual number may vary based on other factors.
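The arithmetic behind that figure can be checked with a short sketch. The article does not say which installation size ChatGPT-4 plugged in, so the values below are assumptions: a 20GB install (the low end of the 20-30GB estimate mentioned above) and a high-density 3.5-inch floppy holding 1.44MB. The function name `floppies_needed` is ours and is only for illustration.

```python
import math


def floppies_needed(install_gb, floppy_mb=1.44, mb_per_gb=1024):
    """Return how many whole floppy disks are needed for `install_gb` gigabytes.

    floppy_mb=1.44 assumes a high-density 3.5-inch disk (an assumption, not
    stated in the article). mb_per_gb may be 1024 (binary) or 1000 (decimal),
    depending on which convention you prefer.
    """
    total_mb = install_gb * mb_per_gb
    # A partly filled disk still counts as a disk, so round up.
    return math.ceil(total_mb / floppy_mb)


if __name__ == "__main__":
    for gb in (20, 30, 64):
        print(
            gb, "GB:",
            floppies_needed(gb), "disks at 1024MB/GB,",
            floppies_needed(gb, mb_per_gb=1000), "disks at 1000MB/GB",
        )
```

Under these assumptions, 20GB at 1,024MB per gigabyte is about 14,222.2 disks' worth of data, which rounds up to 14,223. That is consistent with the figure ChatGPT-4 reported, give or take rounding, and it shows how far off an answer of roughly 15 disks is.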

Winner: ChatGPT-4

Summarization

AI language models are known for their ability to summarize complex information and boil text down to key elements. To evaluate each language model's ability to summarize text, we copied and pasted three paragraphs from a recent Ars Technica article.

Instructions/Prompts: Summarize in one paragraph [three paragraphs of article body]


Both Bard and ChatGPT-4 took the information and pared it down to the important details. However, Bard's version is more like a true summary, synthesizing the information into new wording, while ChatGPT-4's version reads more like a condensation, with sentences chopped down and fragments left behind. While both are good, we have to admit that Bard outperforms ChatGPT-4 in this test.

Winner: Google Bard

Fact Retrieval

Large language models are known to make confident errors (often called "hallucinations" by researchers), which makes them unreliable factual references unless they are supplemented by external sources of information. Interestingly, Bard can look up information online, while ChatGPT-4 cannot yet (although this capability is due to roll out soon through plugins).

To test this ability, we challenged Bard and ChatGPT-4 to demonstrate historical knowledge about a difficult and nuanced topic.

Instructions/Prompts: Who invented video games?


The question of who invented video games is difficult to answer because it depends on how you define the term "video game", and different historians define it differently. Some people think early computer games count as video games, some think a television set must always be involved, and so on. There is no accepted answer.

We would have thought that Bard's ability to find information online would give it an advantage, but in this case that may have backfired, because it chose one of Google's most popular answers and called Ralph Baer the "Father of Video Games". All the facts about Baer are correct, although it probably should have put the last sentence in the past tense since Baer passed away in 2014. But Bard doesn't mention other early contenders for the "first video game" title, such as Tennis for Two and Spacewar!, so its answer may be misleading and incomplete.

ChatGPT-4 gives a more comprehensive and detailed answer that reflects the current view of many early video game historians, saying that "the invention of video games cannot be attributed to one person" and presenting it as "a series of innovations" over time. Its only mistake was calling Spacewar! "the first digital computer game," which it wasn't. The answer could be expanded to include more niche edge cases, but ChatGPT-4 provides a good overview of important early precursors.

Winner: ChatGPT-4

Creative Writing

Unfettered creativity on whimsical topics should be the strong suit of large language models. We tested this by asking Bard and ChatGPT-4 to write a short whimsical story.

Instructions/Prompts: Write a two-paragraph creative story about Abraham Lincoln’s invention of basketball.


Bard's output is unsatisfactory in several respects. First, it is 10 paragraphs long, not 2, and the paragraphs are short and disconnected. Additionally, it includes some details that don't make much sense in the context of the prompt. For example, why is Abraham Lincoln's White House in Springfield, Illinois? Other than that, it's an interesting and simple story.

ChatGPT-4 also sets the story in Illinois but, more accurately, it makes no mention of the president or the White House during that time period. However, it later says that "players from the north and south" put aside their differences to play basketball together, implying that this happened shortly after basketball was invented.

Overall, we think ChatGPT-4 is slightly better because its output is indeed divided into two paragraphs - although it seems to get around this limitation by stretching each paragraph as much as possible. Still, we love the creative details in the ChatGPT-4 version of the story.

Winner: ChatGPT-4

Coding

If this generation of large language models has a "killer app," it might be using them as programming assistants. OpenAI's early work on the Codex model made GitHub's Copilot possible, and ChatGPT itself has made a name for itself as a fairly competent coder and debugger of simple programs. So it should be interesting to see how Google Bard performs.

Instructions/Prompts: Write a python script that says "Hello World" and then endlessly creates a random repeating string of characters.


It looks like Google Bard can't write code at all. Google doesn't support this feature yet, but the company says coding support is coming soon. For now, Bard rejects our prompt, saying, "It looks like you want me to help with coding, but I haven't been trained to do so."

Meanwhile, ChatGPT-4 not only gives us the code directly but also formats it in a fancy code box with a "Copy Code" button that copies the code to the system clipboard for easy pasting into an IDE or text editor. But does the code work? We pasted it into a file named rand_string.py and ran it in a console on Windows 10, and it worked without any issues.
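The article does not reproduce the script ChatGPT-4 wrote. As a rough illustration of what the prompt asks for, here is a minimal sketch under one possible reading of it: print the greeting, pick a random string once, then repeat that string forever. The function names, the 8-character length, and the choice of alphabet are all our assumptions, not details from the test.

```python
import itertools
import random
import string


def random_chunk(length=8):
    """Return a random string of ASCII letters and digits."""
    alphabet = string.ascii_letters + string.digits
    return "".join(random.choice(alphabet) for _ in range(length))


def repeating_stream(chunk):
    """Yield the characters of `chunk` over and over, without end."""
    return itertools.cycle(chunk)


def main():
    print("Hello World")
    chunk = random_chunk()
    # This loop never finishes on its own; stop it with Ctrl+C.
    for ch in repeating_stream(chunk):
        print(ch, end="", flush=True)


# Call main() to run the script interactively.
```

Keeping the endless part in a generator makes the pieces easy to check in isolation, for example by taking only the first few characters with `itertools.islice`.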

Winner: ChatGPT-4

Winner: ChatGPT-4, but it’s not over yet

Overall, ChatGPT-4 won five of our seven tests (this refers to ChatGPT using GPT-4, in case you skipped the above and jumped here). But that's not the whole story. There are other factors to consider, such as speed, context length, cost, and future upgrades.

In terms of speed, ChatGPT-4 is currently slower. It took 52 seconds to write the story about Lincoln and basketball, while Bard took only 6 seconds. It is worth noting that OpenAI offers a much faster model than GPT-4 in the form of GPT-3.5. That model takes only 12 seconds to write the Lincoln basketball story, but it is arguably less suited to deep and creative tasks.

Each language model has a maximum number of tokens (fragments of words) that it can process at a time. This is sometimes called the "context window," and it is roughly analogous to short-term memory. In the case of conversational chatbots, the context window contains the entire conversation history so far. When it fills up, the chatbot either hits a hard limit or keeps going but erases its "memory" of the earlier parts of the discussion. ChatGPT-4 keeps a rolling memory, wiping out earlier context as it goes, and reportedly has a limit of around 4,000 tokens. Bard reportedly limits its total output to around 1,000 tokens, and when this limit is exceeded, it erases its "memory" of the earlier discussion.
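The rolling behavior described above can be sketched in a few lines. This is a toy model, not how either chatbot is actually implemented: the class name `RollingContext` is hypothetical, and counting whitespace-separated words stands in for a real tokenizer, which would split text differently.

```python
from collections import deque


def count_tokens(text):
    """Crude stand-in for a tokenizer: one token per whitespace-separated word."""
    return len(text.split())


class RollingContext:
    """Keep the most recent messages that fit within a token budget."""

    def __init__(self, max_tokens):
        self.max_tokens = max_tokens
        self.messages = deque()
        self.total = 0

    def add(self, message):
        self.messages.append(message)
        self.total += count_tokens(message)
        # Drop the oldest messages until the budget is met again,
        # always keeping at least the newest message.
        while self.total > self.max_tokens and len(self.messages) > 1:
            dropped = self.messages.popleft()
            self.total -= count_tokens(dropped)

    def window(self):
        """Return the messages the model would still 'remember'."""
        return list(self.messages)


if __name__ == "__main__":
    ctx = RollingContext(max_tokens=5)
    for msg in ("a b c", "d e", "f g"):
        ctx.add(msg)
    print(ctx.window())
```

With a budget of 5 tokens, adding the third message pushes the total to 7, so the oldest message is discarded and only the last two remain in the window.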

Then there is the issue of cost. ChatGPT (not specifically GPT-4) is currently available for free on a limited basis through the ChatGPT website, but if you want priority access to GPT-4, you will need to pay $20 per month. Programming-savvy users can access the earlier GPT-3.5 model more cheaply via the API, but at the time of writing, the GPT-4 API is still in limited testing. Meanwhile, Google Bard is free as a limited trial for select Google users. Currently, Google has no plans to charge for access to Bard when it becomes more widely available.

Finally, as we mentioned before, both models are constantly being upgraded. Bard, for example, just received an update last Friday that makes it better at math, and it may be able to code soon. OpenAI also continues to improve its GPT-4 model. Google is currently holding back its most powerful language models (probably due to computational cost), so we could yet see a stronger competitor from Google catch up.

In short, the generative AI business is still in its early stages, and the outcome is still uncertain. It is still anyone's game.

Statement
This article is reproduced from 51CTO.COM. If there is any infringement, please contact admin@php.cn to have it removed.