Recently, everyone in the AI field has been thinking about one question: what is the best way to deploy large multi-modal models in industry? The era of general artificial intelligence is approaching. Large-model AI technology has become an important pillar of digital-economy infrastructure and a core "engine" of intelligent industrial transformation, and industrial applications of large AI models now face unprecedented development opportunities.

At the CNCC 2023 "Super Intelligence Fusion AI Large Model Application Implementation Development Forum" held on October 28, Sophon Engine released "Yuancheng Xiang Chatimg3.0", presenting the latest progress and deployment exploration of this general-purpose multi-modal generative model.

## Upgrade and iteration of core technology of Chatimg3.0

Yuancheng Xiang Chatimg3.0 is a large multi-modal model with ultra-fine-grained recognition and fewer hallucinations. It also supports multi-image understanding, object localization, OCR, and other functions. Chatimg3.0 gives hardware devices a "brain", enabling more natural and fluent human-machine communication and laying a solid foundation for multi-modal large models to empower industrial applications.

Compared with Chatimg2.0, Chatimg3.0 has been upgraded mainly in two stages: the first stage, pre-training (multi-task training on description, detection, OCR, and other tasks), and the second stage, instruction fine-tuning (a high-quality, manually screened instruction set).

To better evaluate the capabilities of multi-modal large models, Sophon Engine constructed a new multi-modal dialogue test set and evaluated model capabilities in five aspects: description, reasoning, detection, Q&A, and business. Chatimg3.0 caught up with GPT-4V in Q&A and business capabilities, showing the development potential of domestic large models.
The following is the specific performance of Chatimg3.0 compared to GPT-4V in the test:
[Figure: left, Chatimg3.0; right, GPT-4V.]

## Reasoning

[Figure: left, Chatimg3.0; right, GPT-4V.]

## Detection

[Figure: left, Chatimg3.0; right, GPT-4V.]

## Q&A

[Figure: left, Chatimg3.0; right, GPT-4V.]
## Exploration and application in key areas

Currently, Sophon Engine has applied "Yuancheng Xiang Chatimg3.0" to fields such as global prevention and control and drone inspection. By integrating with front-end sensing equipment such as drones and electronic probes, it has upgraded traditional inspection and security work, enabling AI defect identification, anomaly detection, behavior analysis, key-target monitoring, autonomous inspection, risk prediction, and other functions, and advancing AI engineering innovation.
As the first multi-modal large model R&D team in China, Sophon Engine not only has inherent advantages in talent and technology; angel investments from several well-known investment institutions and IT industry leaders have also made the company "even more powerful". With the support of iSoftStone, ChinaSoft Technology, and other well-known enterprises, the "Sophon·Tianqiong" and "Sophon·Skyscanner" systems, built on "Yuancheng Xiang Chatimg3.0", attracted industry attention as soon as they were launched. The products have quickly been applied to urban governance, smart power, pipeline inspection, park management, agriculture, finance, and other industry scenarios, and pilot deployments have gradually begun.
Going forward, to accelerate the deployment of large models in industry and promote the sustainable development of the digital economy, Sophon Engine will continue to strengthen model training and capability upgrades, gather top talent and advantageous resources from across the industry, and spare no effort to support industrial upgrading with large models. Its core model, "Yuancheng Xiang Chatimg", will continue to advance in directions such as AI agents and embodied intelligence, and is expected to draw more industry attention.