2B parameters, performance exceeding Mistral-7B: Mianbi Intelligence open-sources its multimodal on-device model
Even a 1,000-yuan budget phone can run it locally.
While large models keep growing ever larger, researchers have also been making real progress on optimizing and deploying them at smaller scales.
On February 1, Mianbi Intelligence and the Tsinghua NLP Lab officially released their flagship on-device large model, MiniCPM, in Beijing. Dubbed a "pocket rocket" for its performance, the new model can be deployed directly on end devices and, the company says, offers the strongest multimodal capabilities of any model at its scale, promising users faster and more efficient intelligent applications.
The newly launched MiniCPM 2B has only 2 billion parameters and was trained on 1T tokens of curated data. That puts it roughly in the size class of models from the BERT era of 2018, yet Mianbi Intelligence's aggressive performance optimization and cost control allow the model to punch far above its weight.
Li Dahai, co-founder and CEO of Mianbi Intelligence, compared the new model against Mistral-7B, a well-known open-source large model: MiniCPM 2B surpasses it on multiple mainstream benchmarks.
MiniCPM also holds a clear advantage over Phi-2, the "small model" Microsoft released recently.
Li Dahai noted that the new model can leapfrog its weight class, delivering capabilities on par with 13B, 30B, or even 40B models. On MT-Bench, the benchmark that most closely tracks real user experience, MiniCPM scored 7 (for comparison, GPT-4-Turbo scored 9).
Mianbi Intelligence also demonstrated MiniCPM live. Despite its small parameter count, the model exhibits the capabilities expected of a large model, such as text translation and role play, carries broad knowledge, and can even handle difficult code-explanation tasks.
Because it can be deployed on-device, MiniCPM can also provide timely help in emergencies.
Phone manufacturers have recently been rolling out on-device large models of their own: compressing a large language model into a smaller footprint lets it reach more scenarios and deliver more intelligence even where compute and memory are limited. Mianbi Intelligence's approach is lighter still, and can run on lower-spec or older phones.
According to Mianbi Intelligence, the on-device MiniCPM has been Int4-quantized, compressing it by 75% so that it occupies only 2GB of memory with almost no loss in performance. It has already been run successfully on a range of common phone models.
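For readers curious what Int4 quantization involves, here is a minimal, self-contained PyTorch sketch of symmetric group-wise 4-bit quantization. It illustrates the general technique only, not Mianbi Intelligence's actual pipeline; the group size and the int8 container used here are illustrative assumptions.

```python
import torch

def quantize_int4_groupwise(weights: torch.Tensor, group_size: int = 128):
    """Symmetric group-wise int4 quantization (illustrative sketch only;
    not Mianbi Intelligence's actual scheme). Maps values to [-8, 7]."""
    w = weights.float().reshape(-1, group_size)
    # One scale per group: the group's max absolute value maps to 7.
    scales = w.abs().amax(dim=1, keepdim=True) / 7.0
    q = torch.clamp(torch.round(w / scales), -8, 7).to(torch.int8)
    return q, scales

def dequantize(q: torch.Tensor, scales: torch.Tensor) -> torch.Tensor:
    return q.float() * scales

if __name__ == "__main__":
    w = torch.randn(4096, 4096)                  # one weight matrix
    q, s = quantize_int4_groupwise(w)
    w_hat = dequantize(q, s).reshape(w.shape)
    print(f"mean abs error: {(w - w_hat).abs().mean().item():.5f}")
```

Real deployments pack two 4-bit weights per byte instead of two bytes per fp16 weight, a 4x size reduction, which is where the 75% compression figure cited above comes from.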
Because it supports CPU inference on phones, MiniCPM can cut usage costs dramatically. Mianbi Intelligence did the math: a phone with a Snapdragon 855 running MiniCPM can process 1.7 million tokens on one dollar's worth of electricity, roughly 1% of the cost of running Mistral-Medium in the cloud.
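Taking the cited figures at face value, the claim is easy to sanity-check. The snippet below only rearranges the article's own numbers; the 1% ratio is the company's claim, not an independent measurement.

```python
# Back-of-envelope check of the cost claim (figures as cited by Mianbi
# Intelligence; not independently measured).
tokens_per_dollar_on_device = 1_700_000
cost_per_million_on_device = 1_000_000 / tokens_per_dollar_on_device
# If on-device inference costs ~1% of the cloud price, the implied
# cloud cost per million tokens is 100x higher.
implied_cloud_cost_per_million = cost_per_million_on_device / 0.01
print(f"on-device:     ${cost_per_million_on_device:.2f} per 1M tokens")
print(f"implied cloud: ${implied_cloud_cost_per_million:.2f} per 1M tokens")
```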
Beyond the on-device model, Mianbi Intelligence also showed its exploration of multimodal large models, open-sourcing the 12B-parameter OmniLMM. At the launch event, the team reproduced the rock-paper-scissors demo from Gemini's debut: asked in English, "What game am I playing?", the model answered "rock, paper, scissors."
At the same time, OmniLMM can also recognize human gestures and tell you what to play if you want to win.
OmniLMM can also understand and reason over information spread across multiple images: landmark buildings, TV station logos, the activities people are engaged in, and so on.
It seems that truly multimodal large models, and the new forms of applications they enable, are not far off.
The performance of Mianbi Intelligence's models stems from the company's long-term technical accumulation. Since 2021, the team has built an efficient technology stack around three directions: infrastructure, algorithms, and data methodology. Its self-developed BMTrain training framework has been central to that stack.
At the algorithm level, Mianbi Intelligence has built a "model sandbox" system that elevates large model training from alchemy to experimental science, systematically searching for optimal hyperparameter and scale configurations, such as the optimal batch size and hyperparameter settings that carry over across model sizes.
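A toy example of what that "experimental science" workflow can look like: fit a scaling law on cheap small-scale sandbox runs, then extrapolate to the target size. The data points and functional form below are illustrative assumptions, not Mianbi Intelligence's actual sandbox methodology.

```python
import numpy as np
from scipy.optimize import curve_fit

# Hypothetical sandbox workflow: fit a power-law scaling curve on small
# proxy runs, then extrapolate. The losses below are synthetic.
def power_law(n_params, a, b, c):
    return a * n_params ** (-b) + c

sizes = np.array([1e7, 3e7, 1e8, 3e8, 1e9])    # proxy model sizes
losses = np.array([3.9, 3.5, 3.1, 2.8, 2.6])   # synthetic eval losses

(a, b, c), _ = curve_fit(
    power_law, sizes, losses,
    p0=[10.0, 0.1, 1.0],
    bounds=([0.0, 0.0, 0.0], [np.inf, 1.0, np.inf]),
)
print(f"predicted loss at 2.4B params: {power_law(2.4e9, a, b, c):.2f}")
```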
Mianbi Intelligence has also accumulated a large volume of high-quality data. Following yesterday's release, it open-sourced the new model series (including MiniCPM-SFT/DPO, MiniCPM-V, and MiniCPM-SFT/DPO-int4), along with the data recipes for MiniCPM's two training stages, for the industry's reference.
Open source address (including technical report):
MiniCPM GitHub: https://github.com/OpenBMB/MiniCPM
OmniLMM GitHub: https://github.com/OpenBMB/OmniLMM
Mianbi Intelligence grew out of the Tsinghua NLP Lab, one of the earliest teams in China to work on large models; in 2018 the lab released ERNIE, the world's first knowledge-guided pre-trained model. The company began commercial operations in August 2022, closed two funding rounds last year, and its application Luca was approved in the second batch of large model registrations from the Cyberspace Administration of China.
Mianbi Intelligence has now built a research team of more than 100 people, 80% of them from Tsinghua and Peking University, with an average age of 28.
The company is pursuing a dual-engine strategy of large models and agents, aiming to build smaller, faster, and lower-cost solutions.
This year, Mianbi Intelligence will also accelerate its pace of iteration. "We will keep releasing new versions of MiniCPM after the Spring Festival, with further performance improvements; we wanted to let everyone enjoy the holiday first," said Liu Zhiyuan.