


In recent days, ICLR, one of AI's premier conferences, was held in Vienna.
OpenAI, Meta, Google, Zhipu AI, and other world-leading AI companies gathered there.
The venue was packed with luminaries; walk a few steps and you might bump into a researcher behind some paradigm-shifting paper.
Unsurprisingly, the ICLR 2024 exhibition hall also became a star-chasing scene, with an atmosphere lively enough to nearly lift the roof.
Chasing the Turing giants on site
LeCun, famously the most outspoken of the three Turing Award giants, announced his schedule on X in advance and said he looked forward to meeting fans.
In the comment area, not only were fans excited to check in, but some were even ready to submit their resumes on the spot.
The fans' trip was indeed worthwhile: at the venue, LeCun held forth at length while an enthusiastic audience packed into a dense circle around him.
Getting back to the topic: across the ICLR event, the Meta team shared more than 25 papers and two workshops. This time, the LeCun team presented the following two papers at ICLR.
Paper address: https://arxiv.org/abs/2305.19523
Paper address: https://arxiv.org/abs/2311.12983
Another Turing giant, Yoshua Bengio, also showed his high popularity.
As one audience member put it: "A person really has to be singular in their field to draw a queue this long outside their conference room!"
On AGI, LeCun and Hinton have both voiced strong opinions before, while Bengio's stance has seemed comparatively ambiguous, so what he thinks is keenly awaited. On the coming May 11, he will speak at a workshop on AGI.
It is worth mentioning that the Bengio team also received an honorable mention for outstanding paper at this year’s ICLR.
Paper address: https://openreview.net/pdf?id=Ouj6p4ca60
Next door to Meta, Google is also present, showcasing its open-source model Gemma, the Robotics Transformer framework behind robotic agents, and other groundbreaking research.
Next to Meta and Google, in the middle of the exhibition hall, stands a very eye-catching company: Zhipu AI.
Staff on site are introducing a series of research results such as GLM-4 and ChatGLM.
This series of displays attracted the attention of many foreign scholars.
The nearly two thousand guests and scholars at the scene listened carefully to the introduction of the GLM large model technical team.
The introduction covers a number of cutting-edge research results on the GLM series of large models, spanning mathematics, text-to-image generation, image understanding, visual UI understanding, and agent intelligence.
At the venue, there was heated discussion of Scaling Law, on which the GLM team holds its own distinctive view:
"Emergent intelligence correlates more closely with pre-training loss than with model size or training compute."
For example, Jason Wei, the OpenAI researcher famous for his "996" work schedule, expressed admiration after carefully reading Zhipu AI's paper on pre-training loss.
In the paper, the team trained 30 LLMs of varying parameter counts and data sizes and evaluated them on 12 Chinese and English datasets.
Paper address: https://arxiv.org/abs/2403.15796
The results show that an LLM exhibits emergent abilities only when its pre-training loss falls below a certain threshold.
Moreover, defining "emergent ability" from the perspective of pre-training loss works better than relying solely on model parameters or training compute.
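To make the threshold idea concrete, here is a minimal hypothetical sketch in Python. The (loss, accuracy) pairs and the `emergence_threshold` helper are invented for illustration; they are not the paper's data or code.

```python
# Hypothetical sketch: locating an "emergence threshold" in pre-training loss.
# The (loss, accuracy) pairs below are invented, NOT the paper's data.

def emergence_threshold(runs, chance_level, margin=0.05):
    """Return the highest pre-training loss at which downstream accuracy
    clearly exceeds chance (chance_level + margin), or None if it never does."""
    emergent = [loss for loss, acc in runs if acc > chance_level + margin]
    return max(emergent) if emergent else None

# (pre-training loss, downstream accuracy) for several hypothetical LLM runs
runs = [
    (2.6, 0.25), (2.4, 0.26), (2.2, 0.27),  # near chance: no emergent ability
    (2.0, 0.41), (1.8, 0.58), (1.6, 0.72),  # below threshold: ability emerges
]

print(emergence_threshold(runs, chance_level=0.25))  # → 2.0
```

Under this framing, runs are grouped by loss rather than by parameter count, which is why, on the paper's claim, the same threshold can show up across models of different sizes.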
Zhipu AI's showing has also led more and more overseas observers to a realization.
Tanishq, the Stability AI research director who earned his PhD at 19, said that the most competitive open-source base models, CogVLM among them, come from China and have contributed significantly to the open-source ecosystem.
A former game-studio CEO began using CogVLM and Stable Diffusion last year to build a fully open-source version.
Yes, since CogVLM was released, its powerful capabilities have caused foreign netizens to exclaim.
In the LLM rankings this January, someone also made a discovery:
At that time, Gemini and GPT-4V were far ahead of any open source LLM, with the only exception being CogVLM.
Evidently, amid this wave of domestic large models going global, Zhipu AI has quietly built considerable influence abroad.
Invited Talks
Beyond the exhibition-hall showcases, this year's ICLR invited seven speakers to share their insights on AI.
They include Raia Hadsell, research scientist at Google DeepMind; Devi Parikh, associate professor at Georgia Tech and chief scientist at FAIR; and Moritz Hardt, director at the Max Planck Institute for Software Systems (MPI-SWS). The only Chinese team is Zhipu AI's GLM large-model technical team.
Raia Hadsell
Google DeepMind scientist Raia Hadsell's talk was titled "Learning through the ups and downs of AI development: unexpected truths on the road to AGI".
After decades of steady development and occasional setbacks, AI is at a critical inflection point.
AI products have exploded into the mainstream market, and we have not yet reached the ceiling of scaling dividends, so the entire community is exploring the next step.
In this talk, drawing on more than 20 years in the field, Raia discussed how our assumptions about the development path toward AGI have changed over time.
At the same time, she also revealed the unexpected discoveries we made during this exploration.
From reinforcement learning to distributed architectures to neural networks, these ideas are already playing a potentially revolutionary role in science.
Raia believes that by learning from past experiences and lessons, important insights can be provided for the future research direction of AI.
Devi Parikh
On the other side, FAIR chief scientist Devi Parikh told the audience the story of her life.
As the talk's title suggests, what Parikh shared went beyond the usual fare.
At the ICLR conference, when explaining why the technical environment is what it is now, everyone will focus on the development of the Internet, big data and computing power.
However, few people pay attention to those small, but important personal stories.
In fact, everyone’s story can be gathered into an important force to promote technological progress.
In this way, we can learn from each other and inspire each other. This makes us more tenacious and efficient in pursuing our goals.
Moritz Hardt
Moritz Hardt, director at Germany's MPI-SWS, gave a talk titled "The Emerging Science of Benchmarks".
Benchmarks have clearly become a core pillar of machine learning.
Although the field has achieved a great deal under this research paradigm since the 1980s, our deep understanding of benchmarks remains limited.
In this talk, Hardt explored the fundamental principles of benchmarking as an emerging science through a series of selected empirical studies and theoretical analyses.
He specifically discussed the impact of annotation errors on data quality, external validation of model rankings, and the prospects for multi-task benchmarking.
At the same time, Hardt also presented a number of case studies.
These challenge our conventional wisdom and highlight the importance and benefits of developing scientific benchmarks.
GLM Team
From China, the GLM large-model technical team of Zhipu AI delivered a compelling keynote titled "ChatGLM's Road to AGI".
Notably, this is also the first time a large-model keynote from China has been presented at a top international conference.
The talk first reviews the past few decades of AI development from a Chinese perspective.
It then uses ChatGLM as an example to explain the understanding and insights the team gained in practice.
2024 AGI Preview: GLM-4.5, GLM-OS, GLM-zero
At ICLR, the GLM large model team introduced the three major technical trends of GLM for AGI.
What is the inevitable path to AGI?
Industry opinions are mixed: some say agents, some say multimodality, and some argue that Scaling Law is a necessary but not sufficient condition for AGI.
LeCun, for his part, insists LLMs are the wrong road to AGI and cannot deliver it.
In this regard, the team also put forward its own unique point of view.
First, they discussed GLM-4.5 and its successors, the follow-up upgrades to GLM-4.
These successors will build on SuperIntelligence and SuperAlignment technologies while making major advances in native multimodality and AI safety.
The GLM large model team believes that text is the most critical foundation on the road to AGI.
The next step is to train on text, images, video, audio, and other modalities mixed together, yielding a truly "native multimodal model".
At the same time, in order to solve more complex problems, they also introduced the concept of GLM-OS, a general computing system centered on large models.
This coincides with the idea of a large-model operating system previously proposed by Karpathy.
At the ICLR site, the GLM large model team introduced the implementation of GLM-OS in detail:
Building on the existing All-Tools capabilities, plus memory and self-reflection, GLM-OS is expected to successfully imitate the human PDCA mechanism, the Plan-Do-Check-Act cycle.
Specifically: first make a plan, then try it out to generate feedback, adjust the plan, and act again in pursuit of better results.
Relying on this PDCA cycle, an LLM can give itself feedback and evolve on its own, just as humans do.
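As an illustration only, the PDCA loop described above might be sketched like this in Python. The `llm` callable, the prompt strings, and the "score|feedback" critique format are all assumptions of this sketch, not GLM-OS interfaces.

```python
# Illustrative PDCA (Plan-Do-Check-Act) loop around a generic llm() callable.
# Nothing here is a real GLM-OS API; prompts and reply formats are made up.

def llm_check(llm, task, result):
    """Ask the model to critique its own output; return (score, feedback).
    Assumes the model replies in a 'score|feedback' format."""
    critique = llm(f"Score 0-1 and critique this answer to '{task}': {result}")
    score_str, feedback = critique.split("|", 1)
    return float(score_str), feedback

def pdca_loop(task, llm, max_cycles=3, target_score=0.9):
    plan = llm(f"Plan how to solve: {task}")                            # Plan
    result = None
    for _ in range(max_cycles):
        result = llm(f"Execute this plan: {plan}")                      # Do
        score, feedback = llm_check(llm, task, result)                  # Check
        if score >= target_score:
            break                              # good enough; stop iterating
        plan = llm(f"Revise the plan using this feedback: {feedback}")  # Act
    return result
```

The self-feedback lives entirely in the Check and Act steps: the model scores its own output and rewrites its own plan, with no human in the loop.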
In addition, the GLM team revealed that since 2019 it has been working on a technology called GLM-zero, which aims to study humans' "unconscious" learning mechanisms.
"Even while people sleep, the brain keeps learning unconsciously."
The GLM team says such "unconscious" learning mechanisms are an important part of human cognitive ability, encompassing self-learning, self-reflection, and self-criticism.
The human brain contains two systems, "feedback" and "decision-making", which correspond respectively to the LLM and memory.
Research on GLM-zero will therefore further expand human understanding of consciousness, knowledge, and learning behavior.
Although still at a very early research stage, GLM-zero can be seen as a road that must be traveled on the way to AGI.
This is also the first time that the GLM large model team has disclosed this technology trend to the outside world.
Domestic top technical team
At the end of 2020, the GLM large model technical team developed the GLM pre-training architecture.
In 2021, they trained the ten-billion-parameter model GLM-10B and, in the same year, successfully trained a converged trillion-parameter sparse model using an MoE architecture.
In 2022, they jointly developed and open-sourced GLM-130B, a Chinese-English bilingual pre-trained model at the hundred-billion-parameter scale.
In the past year, the team has completed an upgrade of the large base model almost every 3-4 months, and it has now been updated to the GLM-4 version.
Moreover, as the first domestic LLM company to enter the market, Zhipu AI set an ambitious goal for 2023: to benchmark against OpenAI across the board.
The GLM large model technical team has built a complete large model product matrix based on the AGI vision.
Beyond the GLM series, these include the text-to-image model CogView, the code model CodeGeeX, the multimodal understanding model CogVLM, and later the GLM-4V multimodal large model, All-Tools capabilities, and the AI assistant Zhipu Qingyan.
At the same time, the researchers of the GLM large model technology team have a very high influence in the industry.
For example, Fei-Fei Li, hugely popular in the community, teaches Stanford's CS25 course, which each session invites researchers at the frontier of Transformer work to share their latest breakthroughs.
It has been confirmed that researchers from Zhipu AI are among the CS25 course's invited guests.
CogVLM
CogVLM, the open-source visual language model developed by the team, drew industry attention as soon as it was released.
A paper Stability AI published in March showed that, owing to its excellent performance, Stable Diffusion 3 used CogVLM directly for image captioning.
Paper address: https://arxiv.org/abs/2403.03206
CogAgent
Building on this, CogAgent, an open-source visual language model improved from CogVLM, focuses on understanding graphical user interfaces (GUIs).
The CogAgent paper has been accepted to CVPR 2024, the top international academic conference in computer vision.
CVPR is known for its strict standards; this year's paper acceptance rate is said to be only about 2.8%.
Paper address: https://arxiv.org/abs/2312.08914
ChatGLM-Math
To solve mathematical problems with LLMs, the GLM team proposed the "Self-Critique" iterative training method.
Through a self-feedback mechanism, it helps the LLM improve both its language and its mathematical abilities.
Paper address: https://arxiv.org/abs/2404.02893
The method includes two key steps:
First, train a "Math-Critique" model, derived from the LLM itself, to evaluate the model's answers to mathematical questions and provide feedback signals.
Second, use this new model to supervise the LLM's own generations via rejection-sampling fine-tuning and direct preference optimization (DPO).
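The two steps above can be sketched roughly as follows. This is a hedged illustration: `generate` and `critique` are invented stand-ins for the LLM sampler and the Math-Critique scorer, and the thresholds are arbitrary; none of this is the team's actual code.

```python
# Illustrative sketch of Self-Critique-style training-data construction.
# 'generate' samples an answer from the LLM; 'critique' stands in for the
# Math-Critique model's scalar score. Names and thresholds are invented.

def rejection_sample(question, generate, critique, n_samples=8, threshold=0.7):
    """Keep only answers the critique model scores above the threshold;
    the kept (question, answer) pairs become fine-tuning data."""
    kept = []
    for _ in range(n_samples):
        answer = generate(question)
        if critique(question, answer) >= threshold:
            kept.append((question, answer))
    return kept

def dpo_pairs(question, answers, critique):
    """For DPO, pair the best-scored answer (preferred) with the worst
    (dispreferred) according to the critique model."""
    ranked = sorted(answers, key=lambda a: critique(question, a))
    return [(ranked[-1], ranked[0])]
```

The key design point is that the same critique signal drives both stages: it filters samples for supervised fine-tuning and ranks answers into preference pairs for DPO.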
The GLM team also designed the MATHUSEREVAL benchmark to assess the new model's mathematical ability. The results are as follows:
Clearly, the new method significantly improves the LLM's mathematical problem-solving while continuing to improve its language skills. Notably, in some cases it outperforms models with twice as many parameters.
GLM-4 ranks in the world's first tier
In the OpenCompass 2.0 benchmark, the strength of Zhipu AI's new-generation base model is not to be underestimated.
In the overall ranking, GLM-4 places third worldwide and first in the country.
In the "SuperBench Large Model Comprehensive Capability Evaluation Report" released by the SuperBench team not long ago, GLM-4 also ranked among the first tier in the world.
In the most critical dimensions, semantic understanding and agent capability, GLM-4 ranks first in the country, surpassing all competitors.
In the just-concluded "first year of large models," a lively war of the models raged for a full year.
If 2024 is to be the first year of AGI, the world's large-model teams still have a long way to go.



