


A 2B-parameter model outperforms Mistral-7B: Wall-Facing Intelligence open-sources its multimodal on-device models
It can even run locally on budget "thousand-yuan" phones.
While large models keep growing in size, people have recently also made progress in optimizing and deploying them.
On February 1st, Wall-Facing Intelligence and the Tsinghua NLP Laboratory officially released their flagship on-device (end-side) large model, "Wall-Facing MiniCPM", in Beijing. Billed as a "little steel cannon" of performance, this new generation of models can be deployed directly on terminal devices and is said to have the strongest multimodal capabilities in its size class, giving users a faster and more efficient smart application experience.
The newly launched MiniCPM 2B has only 2 billion parameters and was trained on 1T tokens of selected data. The article places it in the same parameter class as the BERT model released in 2018, but Wall-Facing Intelligence has pushed performance optimization and cost control to the extreme, allowing the model to punch well above its weight.
Li Dahai, co-founder and CEO of Wall-Facing Intelligence, compared the new model with Mistral-7B, a well-known open source large model. MiniCPM 2B outperformed Mistral-7B on multiple mainstream evaluation leaderboards.

MiniCPM also holds a clear advantage over Phi-2, the "small model" recently proposed by Microsoft.
Li Dahai pointed out that the new model can perform above its size class, matching the capabilities of 13B, 30B, or even 40B models. On MT-Bench, the evaluation closest to real user experience, MiniCPM scored 7 points (for comparison, GPT-4-Turbo scored 9 points).
At the event, Wall-Facing Intelligence also demonstrated MiniCPM in practical use. Although its parameter count is small, the model has the capabilities expected of a large model, such as text translation and role playing, and it has rich knowledge. It can even handle difficult code interpretation tasks.
Because it can be deployed on the device itself, MiniCPM can also provide timely help in emergencies.

Recently, various mobile phone manufacturers have proposed on-device large models. Once a large language model is compressed to a smaller size, it can be brought to more scenarios and deliver a higher degree of intelligence even when computing power and memory are limited. By comparison, the new technology from Wall-Facing Intelligence is lighter still and can run on lower-spec or older mobile phones.
According to Wall-Facing Intelligence, the on-device MiniCPM model has been quantized to Int4, which compresses its size by 75% so that it occupies only 2 GB of memory. There is almost no loss in performance, and the model already runs end to end on a range of common mobile phone models.
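The 75% figure is consistent with going from 16-bit to 4-bit weights. The sketch below is only a back-of-the-envelope check: the 16-bit baseline is an assumption (the article does not state the original precision), and the function names are hypothetical. It counts raw weights only, so it does not try to reproduce the 2 GB memory figure, which may include other runtime overhead that the article does not break down.

```python
def weight_size_gb(num_params: float, bits_per_weight: int) -> float:
    """Approximate size of the raw weights in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


def size_reduction(bits_before: int, bits_after: int) -> float:
    """Fraction of storage saved when moving between two weight precisions."""
    return 1 - bits_after / bits_before


NUM_PARAMS = 2e9     # "2 billion parameters", as stated in the article
BASELINE_BITS = 16   # ASSUMPTION: 16-bit baseline; not stated in the article
INT4_BITS = 4        # Int4 quantization, as stated in the article

baseline_gb = weight_size_gb(NUM_PARAMS, BASELINE_BITS)
int4_gb = weight_size_gb(NUM_PARAMS, INT4_BITS)
reduction = size_reduction(BASELINE_BITS, INT4_BITS)

print(f"16-bit weights: {baseline_gb:.1f} GB")
print(f"Int4 weights:   {int4_gb:.1f} GB")
print(f"Reduction:      {reduction:.0%}")
```

Under these assumptions the reduction depends only on the ratio of bit widths, not on the parameter count, which is why a 16-bit to 4-bit conversion always gives 75%.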
Because it supports inference on mobile CPUs, MiniCPM can greatly reduce usage costs. Wall-Facing Intelligence did the math for us: a phone equipped with a Snapdragon 855 running MiniCPM can process 1.7 million tokens for one dollar of electricity. That is only 1% of the cost of running Mistral-Medium in the cloud.
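To make the cost claim easier to compare with typical per-token pricing, the figures can be converted to dollars per million tokens. This is a rough sketch using only the two numbers quoted in the article; the variable and function names are hypothetical, and the cloud figure is merely what the "1%" ratio implies, not a quoted Mistral price.

```python
TOKENS_PER_DOLLAR = 1_700_000  # from the article: 1.7 million tokens per dollar
CLOUD_COST_RATIO = 0.01        # from the article: on-device cost is 1% of cloud


def cost_per_million_tokens(tokens_per_dollar: float) -> float:
    """Convert a tokens-per-dollar rate into dollars per one million tokens."""
    return 1_000_000 / tokens_per_dollar


on_device_cost = cost_per_million_tokens(TOKENS_PER_DOLLAR)

# Implied by the article's own 1% ratio; NOT an official price list value.
implied_cloud_cost = on_device_cost / CLOUD_COST_RATIO

print(f"On-device:       ${on_device_cost:.2f} per million tokens")
print(f"Implied cloud:   ${implied_cloud_cost:.2f} per million tokens")
```

With these inputs the on-device cost works out to roughly $0.59 per million tokens.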
In addition to on-device models, Wall-Facing Intelligence also showed its work on multimodal large models and open sourced the 12B-parameter OmniLMM. At the press conference, the company reproduced the rock-paper-scissors demo shown when Gemini was released. Asked in English, "What game am I playing?", the model answered: rock, paper, scissors.
OmniLMM can also recognize human hand gestures and tell you what to play if you want to win.
It can further understand and reason about information in many kinds of pictures, such as landmark buildings, TV station logos, and activities organized by people.
It seems we are not far from truly multimodal large models and the new forms of application they enable.
The strong performance of Wall-Facing Intelligence's large models stems from the company's long-term technology accumulation. Since 2021, Wall-Facing Intelligence has built an efficient technology stack focused on three directions: infrastructure (Infra), algorithms, and data methodology. Among these, its self-developed BMTrain efficient training framework is crucial.
At the algorithm level, Wall-Facing Intelligence has also built up a model sandbox system that lifts large-model training from alchemy to experimental science. The sandbox is used to keep searching for optimal hyperparameter and scaling choices, such as the optimal batch size and a hyperparameter configuration shared across models of all sizes.
Wall-Facing Intelligence has also accumulated a large amount of high-quality data. After yesterday's release, the company open sourced its new generation of large models (including MiniCPM-SFT / DPO, MiniCPM-V, and MiniCPM-SFT / DPO-int4), as well as the data recipes for the two stages of MiniCPM training, for industry reference.
Open source address (including technical report):
MiniCPM GitHub: https://github.com/OpenBMB/MiniCPM
OmniLMM GitHub: https://github.com/OpenBMB/OmniLMM
Wall-Facing Intelligence originated from the Tsinghua NLP Laboratory and is one of the earliest teams in China to carry out large model research. In 2018, it released ERNIE, the world's first knowledge-guided pre-trained model. Wall-Facing Intelligence began operating as a company in August 2022 and completed two rounds of financing last year. Its application "Mianbi Luca" was also included in the second batch of large model registrations approved by the Cyberspace Administration of China.
Wall-Facing Intelligence now has a research team of more than 100 people, 80% of whom come from Tsinghua University and Peking University, with an average age of 28.
Wall-Facing Intelligence is pursuing a dual-engine strategy of large models plus Agents, hoping to build smaller, faster, and lower-cost solutions.
This year, Wall-Facing Intelligence will also accelerate the iteration of new technologies. "We will continue to release new versions of MiniCPM after the Spring Festival, and the performance will be further improved. We want to give everyone a break during the Spring Festival," Liu Zhiyuan said.
