Intelligent data annotation solution: a crowdsourcing platform that welcomes the era of large models
On May 26, NetEase's Fuxi Youling crowdsourcing platform made its debut at the China International Big Data Industry Expo. Developed in-house by NetEase Fuxi, it is an online task platform built on human-computer collaboration, and it is currently the only crowdsourcing platform on the market that supports real-time human-computer interactive annotation. Its goal is to ease labor shortages across industries and to offer society more convenient and engaging online employment opportunities. Enterprise customers can quickly model and publish tasks on the platform, while gig workers can pick up tasks freely, without restrictions on time or location. In this way, the platform gives enterprises and individuals a more efficient and flexible working model.
Artificial intelligence is rapidly changing the way people live and work. With the rapid development of large language models and multi-modal large models, data annotation has entered a new period of vigorous growth, and large volumes of data keep emerging in every field. Both the demand side and the supply side, however, face a major challenge: finding an efficient way to provide high-quality, low-cost data support. This affects not only the accuracy and practicality of AI technology but also the development prospects of the entire industry. The data annotation industry therefore needs continuous innovation and improvement to meet the needs of AI technology and to develop sustainably.
To keep pace with the big data era, many AI companies have begun to establish training and management systems for data trainers, while continuing to innovate technically and improve data quality. As labor costs rise, however, more and more organizations are looking for more efficient and economical ways to annotate data. The NetEase Fuxi Youling crowdsourcing platform was created in response, built on the idea of human-in-the-loop (HITL).
At this Data Expo, the Fuxi Youling crowdsourcing platform demonstrated its distinctive capabilities and advantages: it combines human intelligence and decision-making with the computing power of machine learning to achieve high-quality data annotation. Through a detailed, rigorous annotation process and a scientific scoring system, the platform maintains the accuracy and reliability of the data. Fuxi Youling has also adopted a series of cutting-edge technical measures aimed at reducing costs, shortening the annotation cycle, and ensuring data quality.
Data closed loop
After annotators complete their work, the platform supports feeding the annotated data back into model training in real time. The task issuer can evaluate the model before and after training, compare the two to see how the annotation results improve it, and have the model updated automatically. The updated model can then assist subsequent annotation tasks, further improving the quality and efficiency of data annotation.
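The loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the toy token-based classifier, the function names (`train`, `predict`, `accuracy`, `closed_loop_step`), and the sample data are all hypothetical and are not taken from the Fuxi Youling platform. The point is the control flow: retrain on the new annotations, compare accuracy before and after on a held-out set, and keep the new model only if it improves.

```python
from collections import Counter

def train(examples):
    """Toy 'model' (an assumption for illustration): most common label per token."""
    counts = {}
    for text, label in examples:
        for token in text.split():
            counts.setdefault(token, Counter())[label] += 1
    return {tok: c.most_common(1)[0][0] for tok, c in counts.items()}

def predict(model, text, default="neutral"):
    """Let known tokens vote on the label; fall back to a default label."""
    votes = Counter(model[t] for t in text.split() if t in model)
    return votes.most_common(1)[0][0] if votes else default

def accuracy(model, holdout):
    """Fraction of held-out examples the model labels correctly."""
    return sum(predict(model, t) == y for t, y in holdout) / len(holdout)

def closed_loop_step(current_model, training_set, new_annotations, holdout):
    """Retrain with fresh annotations and adopt the candidate only if it improves."""
    candidate_data = training_set + new_annotations
    candidate = train(candidate_data)
    before = accuracy(current_model, holdout)
    after = accuracy(candidate, holdout)
    if after > before:
        return candidate, candidate_data, before, after
    return current_model, training_set, before, after

# Hypothetical data: one seed example, one newly annotated example.
seed = [("great fun", "positive")]
new_annotations = [("boring grind", "negative")]
holdout = [("great quest", "positive"), ("boring quest", "negative")]

model = train(seed)
model, seed, before, after = closed_loop_step(model, seed, new_annotations, holdout)
print(f"accuracy before={before:.2f}, after={after:.2f}")
```

In a real deployment the held-out set would have to stay separate from the annotation stream, otherwise the before/after comparison would overstate the improvement.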
Full data inspection
The platform supports automatic quality inspection of all task data, and the task issuer can flexibly configure the inspection process. The platform conducts quality inspection using each user's historical task level and user portrait. Models are also brought into the inspection, so that AI and people take part in quality control together and tasks are ultimately delivered with high accuracy.
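As a rough sketch of how human and model signals might be combined in such an inspection, the following Python function routes each annotated item either to acceptance or to manual review. The rule, the thresholds (`min_level`, `conf_threshold`), and all names are assumptions made for illustration; the article does not describe the platform's actual inspection logic.

```python
def inspect(annotated_label, user_level, model_label, model_conf,
            min_level=3, conf_threshold=0.9):
    """Route one annotated item (hypothetical rule, not the platform's own).

    - A confident model that disagrees with the annotator triggers review.
    - Annotators below a configurable level are always reviewed.
    - Everything else is accepted automatically.
    """
    if model_label != annotated_label and model_conf >= conf_threshold:
        return "manual_review"
    if user_level < min_level:
        return "manual_review"
    return "accept"

# Experienced annotator, model agrees: accepted.
print(inspect("angry", user_level=5, model_label="angry", model_conf=0.95))
# Experienced annotator, confident model disagrees: sent to review.
print(inspect("angry", user_level=5, model_label="calm", model_conf=0.95))
# New annotator: reviewed even though the model agrees.
print(inspect("angry", user_level=1, model_label="angry", model_conf=0.95))
```

Exposing the thresholds as parameters mirrors the idea that the task issuer configures the inspection process, although the actual configuration options are not given in the article.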
User Portraits
The platform has a complete user-portrait and task-matching mechanism. Based on each user's past task performance, combined with the user's personal label data, it matches users to the diverse requirements of different task types and assigns each task to the people best suited to it, so as to meet the quality, efficiency, and cost requirements of data annotation tasks.
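A minimal sketch of this kind of matching is shown below, assuming a user portrait consists of a set of personal tags plus a historical accuracy per task type. The `Annotator` class, the `match_annotators` function, and the sample pool are hypothetical; the real platform's portrait fields and ranking method are not described in the article.

```python
from dataclasses import dataclass, field

@dataclass
class Annotator:
    """Hypothetical user portrait: personal tags plus accuracy per task type."""
    name: str
    tags: set
    accuracy_by_type: dict = field(default_factory=dict)

def match_annotators(annotators, task_type, required_tags, k=2):
    """Keep annotators who hold every required tag, then rank by past accuracy."""
    eligible = [a for a in annotators if required_tags <= a.tags]
    ranked = sorted(
        eligible,
        key=lambda a: a.accuracy_by_type.get(task_type, 0.0),
        reverse=True,
    )
    return [a.name for a in ranked[:k]]

pool = [
    Annotator("u1", {"mandarin", "gamer"}, {"emotion": 0.92, "image": 0.70}),
    Annotator("u2", {"mandarin"}, {"emotion": 0.97}),
    Annotator("u3", {"mandarin", "gamer"}, {"emotion": 0.85}),
    Annotator("u4", {"english", "gamer"}, {"emotion": 0.99}),
]

chosen = match_annotators(pool, "emotion", {"mandarin", "gamer"})
print(chosen)  # ['u1', 'u3']
```

Here `u2` and `u4` have the highest accuracy but are filtered out because each lacks one of the required tags, which illustrates why tag data and past performance are used together.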
Swarm Intelligence
Using user portraits, the platform selects a diverse set of annotators and introduces redundant annotation, in which several people label the same item. Algorithmic methods such as interval estimation and truth inference then combine their judgments into the final label, ensuring the objectivity and accuracy of the result.
According to the person in charge of the platform, the platform currently focuses on cognitive work, which comes from the need of AIGC and other AI technologies to collect and label multi-modal data such as text, images, and speech. As communication technologies such as 5G become widespread, the platform will take on more decision-making tasks, such as remote control. Using digital twin technology, offline work will be digitized and moved online, allowing users to complete tasks in a gamified digital twin environment and enjoy their work.
The NetEase Fuxi Youling platform combines AI technology with manual annotation to ensure the quality and accuracy of data annotation and to improve annotation efficiency. It not only provides reliable and efficient data services for enterprises, but also contributes to the vigorous development of AI technology.
During the exhibition, Dr. Wu Runze of NetEase Fuxi Lab also gave a talk titled "Application Practice of NetEase Fuxi Data Crowdsourcing Empowering Large Models."
Dr. Wu said that NetEase Fuxi has been deeply involved in large model technology since 2019, taking text pre-training and multi-modal pre-training as its main entry points and relying on the data crowdsourcing platform to provide a closed loop of high-quality data feedback. It has overcome key technical challenges such as unified representation construction, distributed object storage, and large-scale vector engines. The work was selected as a "Pioneer Project" of Zhejiang Province, received official recognition and funding, and has successfully incubated two major products in the game vertical: the Danqingyue art platform and intelligent game NPCs.
The Fuxi Youling crowdsourcing platform is already used in multiple products and scenarios within NetEase Group. In the open world of the "Nishuihan" mobile game, smart NPCs with delicate emotions, sensitive reactions, realistic movements, and rich expressions are deeply loved by players. These NPCs require massive amounts of high-quality human feedback data.
NetEase Fuxi Youling Crowdsourcing provides a range of data services for the game's intelligent NPC model, including voice collection, text annotation, emotion judgment, and image annotation, and ultimately supports the creation of intelligent game NPCs that are expressive across text, voice, facial expressions, and other dimensions. This reflects NetEase's accumulated expertise in game engines and AI, deeply integrated to solve the closed-loop problem linking large-scale computing power, data, and pre-trained models.
To date, the NetEase Fuxi Youling crowdsourcing platform has processed hundreds of millions of data items. While ensuring the performance of game AI, it can collect feedback from game players more efficiently and further improve AI performance, so that the technology can be applied in more diverse scenarios. In a spirit of openness, cooperation, and mutual benefit, NetEase Fuxi will invite partners from upstream and downstream of the industry chain to jointly create a new era of AI digitalization.
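One simple way to combine redundant annotations with interval estimation is sketched below. It is an illustrative stand-in, not the platform's actual algorithm: each annotator's vote is weighted by the lower bound of a Wilson score interval on that annotator's historical accuracy, so annotators with short track records count for less. The function names and sample data are hypothetical.

```python
import math
from collections import defaultdict

def wilson_interval(correct, total, z=1.96):
    """Wilson score interval for an accuracy estimate (z=1.96 is roughly 95%)."""
    if total == 0:
        return (0.0, 1.0)
    p = correct / total
    denom = 1 + z * z / total
    centre = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return (max(0.0, centre - half), min(1.0, centre + half))

def infer_label(votes, history):
    """Weighted vote: each label accumulates the lower accuracy bound of its voters."""
    scores = defaultdict(float)
    for annotator, label in votes.items():
        correct, total = history.get(annotator, (0, 0))
        lower, _ = wilson_interval(correct, total)
        scores[label] += lower
    return max(scores, key=scores.get)

# Hypothetical redundant annotations of one item by three annotators.
votes = {"ann_a": "happy", "ann_b": "sad", "ann_c": "sad"}
# Hypothetical track records: (correct answers, total answers).
history = {"ann_a": (98, 100), "ann_b": (6, 10), "ann_c": (5, 10)}

result = infer_label(votes, history)
print(result)  # happy
```

A plain majority vote would return "sad" here; the weighted vote returns "happy" because one annotator with a long, accurate history outweighs two annotators with short, mediocre ones. More elaborate truth-inference methods estimate annotator reliability and item labels jointly, which this sketch does not attempt.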