The first in the country! SenseTime releases 'Ririxin 5o', real-time multi-modal streaming interaction benchmarking GPT-4o
July 5, 2024, Shanghai - SenseTime, a strategic partner of the 2024 World Artificial Intelligence Conference and High-Level Conference on Artificial Intelligence Global Governance (WAIC 2024), held the "Love Without Boundaries·Xiang Xinli" Artificial Intelligence Forum and released "Ririxin 5o", the first what-you-see-is-what-you-get (WYSIWYG) model in China, with an interactive experience benchmarked against GPT-4o. By integrating cross-modal information across sound, text, image and video, "Ririxin 5o" introduces a new mode of AI interaction: real-time streaming multimodal interaction.
This interaction mode was demonstrated live at the event. A staff member simply said hello to "Ririxin 5o", and it read the words on the lanyard of the staff member's badge, inferred that the setting was the venue of the World Artificial Intelligence Conference, and remarked that one can "study well" in this place.
Then the staff member brought out a cute puppy doll. "Ririxin 5o" accurately described the puppy's appearance, expression and most notable accessory: a very cute white hat printed with the SenseTime logo. A harder task followed: when a book was opened to an arbitrary page, "Ririxin 5o" introduced the content on its own. This is not simple OCR text recognition; the model recognizes both pictures and text and gives an easy-to-understand summary. All of this is completed in an instant, achieving truly real-time interaction. The staff member also showed off some "drawing skills" on the spot, sketching a simple little bunny, and "Ririxin 5o" said it was cute. The staff member then drew a smiling face, and the model calmly picked up on the smile in the expression. When the staff member made the mouth larger and added a tongue, "Ririxin 5o" immediately said that this expression looked much happier.
It can listen, see and find topics to talk about, much like a real person chatting. This interaction mode is especially suitable for applications such as real-time dialogue and speech recognition. It has strong multi-task adaptability: it can handle multiple tasks naturally within the same model and adaptively adjust its behavior and output according to context. The interactive experience comparable to GPT-4o is made possible by the comprehensive improvement in the capabilities of the "Ririxin 5.5" base model.
"Ririxin 5.0", released in April this year, was the first domestic large model to benchmark against GPT-4 Turbo. In just over two months, the new "Ririxin 5.5" system has received many upgrades. Its comprehensive performance is on average 30% higher than that of "Ririxin 5.0", with significantly enhanced mathematical reasoning, English ability and instruction following. Its interactive effect and multiple core indicators reach the GPT-4o benchmark.
"Ririxin 5.5" adopts a hybrid device-cloud collaborative expert architecture to maximize cloud-edge-device collaboration and reduce inference costs. The model was trained on more than 10TB tokens of high-quality training data, including a large amount of synthetic chain-of-thought data, to improve its reasoning ability.
To allow more enterprise users to access the capabilities of the "Ririxin" large model system with a low barrier to entry, SenseTime recently launched the "Large Model 0 Yuan Go" plan.