Apple's headset is coming: is it the savior of AI virtual humans, or a passing meteor?
Author|Su Xiaoru
With AIGC so popular, has the metaverse been forgotten?
With the release of Apple's new Vision Pro headset, priced at around 24,000 yuan, AR and VR, which had been dormant for some time, have returned to the center of attention.
And who is the first to prepare for AR and VR, and the first to get excited again? Naturally, the companies building virtual humans and digital humans. Virtual human technology is tightly bound to entertainment: under the earlier metaverse concept and now in the AIGC era, virtual humans have many points of contact with whatever is hot.
Combined with the AIGC concept, virtual humans have been the industry's biggest theme in the first half of this year. At the beginning of this month, a digital clone of the internet celebrity "Hanzo Forest" went live; users can subscribe monthly or yearly for voice, calls, and other services. Because AI replaces the human operator, a digital human can work 24/7, greatly increasing output.
Live streams driven by a real performer's movements, the so-called "person in the middle," are one of the "traditional schools" of virtual humans. As artificial intelligence has advanced, avatars driven entirely by programs or AI, capable of adaptive question answering, have become the industry's other major direction.
There are currently two technical routes to AI virtual humans: privately deploying a large model of one's own, or doing targeted secondary development on top of existing open-source models.
Table 1: Classification of the three virtual human companies interviewed in this article. Table by Entertainment Capitalism
In this article, Entertainment Capitalism takes you inside three typical virtual human technology developers to explore how AI has changed their product development and business models.
AIGC virtual content platform Yunbo uses "Xiao K Live Ji," a tool for live streamers, as its carrier, exploring and building a deep live streaming ecosystem in a gamified way;
Zhongke Shenzhi, which focuses on end-to-end generative AI virtual human technology, centers on "automatic broadcasting" of goods for merchants, and has just released its own large model to significantly improve virtual humans' question-and-answer interaction;
Cross-modal intelligent software service company Maijike not only provides fully automatic virtual anchors for enterprises, but also trains proprietary models on a company's internal data, building the company's internal knowledge base and search engine.
Many streamers and viewers of Bilibili live streams will be familiar with Xiao K Live Ji. It uses an RGB camera to collect 2D motion data from video and uses algorithms to generate 3D motion data, forming a technical moat. Users can use Xiao K Live Ji directly to create their own virtual streamer.
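The article does not disclose Yunbo's actual algorithm, but the geometric core of turning 2D camera data into 3D motion can be illustrated with a standard pinhole back-projection: given a joint's pixel position and an estimated depth, recover its 3D position along the camera ray. Everything below (function names, camera intrinsics, the idea of per-joint depth estimates) is an illustrative assumption, not Xiao K's implementation; production systems learn the 2D-to-3D lifting with neural networks.

```python
# Toy sketch: back-project 2D keypoints to 3D using a pinhole camera model
# and per-joint depth estimates. Real products learn this lifting; the
# intrinsics (fx, fy, cx, cy) here are placeholder values.

def lift_2d_to_3d(keypoints_2d, depths, fx=600.0, fy=600.0, cx=320.0, cy=240.0):
    """keypoints_2d: list of (u, v) pixel coords; depths: per-joint Z in meters.

    Returns a list of (x, y, z) camera-space points.
    """
    points_3d = []
    for (u, v), z in zip(keypoints_2d, depths):
        x = (u - cx) * z / fx   # back-project along the camera ray
        y = (v - cy) * z / fy
        points_3d.append((x, y, z))
    return points_3d
```

A point at the image center maps straight down the optical axis; points away from the center fan out proportionally to their depth, which is why depth estimation is the hard part a learned model must solve.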
"It took us three years to build the underlying algorithms and assemble a private training dataset, and the product has achieved initial results."
In 2017, Mei Song resigned from his job as a game producer at Linekong Interactive, entered the artificial intelligence industry, and founded Yunbo Technology, the developer of Xiao K Live Ji. This dual background in gaming and AI makes Yunbo's business model distinctive.
AI illustration by Entertainment Capitalism
"Our business covers both ToB and ToC. Products include Xiao K AI motion capture, AI drawing, a virtual human engine, and more. We also run our own MCN guild and streamer base. In addition, the company serves three core scenarios, live streaming, games, and e-commerce, providing virtual human and virtual content services for enterprises."
Zhongke Shenzhi, which focuses on end-to-end generative AI virtual human technology, officially released its own large model, "Shenzhi Digital Intelligence Jiang Shang," on May 10.
"After OpenAI disclosed its training approach, the barrier to entry for large models dropped. Whether an enterprise builds its own large model mainly depends on its commercial potential," said Cheng Weizhong, founder and CEO of Zhongke Shenzhi. "At the beginning of 2020, we started doing transformer-based cross-modal training, and we also have experience in data cleaning."
"We had two main reasons to build a large model. First, future multimodal training will be based on large language models, and underlying algorithms in vertical fields need to rely on them. Second, many customers have asked for private deployment, which requires lower computing power; if we connected to someone else's large model, we could not get the source code or the datasets."
Beyond virtual live streaming and digital employees, Zhongke Shenzhi's business also includes B-side private deployment for finance, healthcare, government, and enterprise customers. Its virtual human real-time interaction and response system "Yun Xiaoqi," its metaverse e-commerce and virtual human live streaming tool "Treasure Box Auto Broadcast & Virtual Assist Broadcast," and its rapid AI animation generation system "Automatic Animation" have all launched.
Currently, Zhongke Shenzhi's "auto-broadcast" customers account for more than 70% of the mid-tier brand merchants using AI virtual humans on Tmall, Taobao, JD.com, and other platforms. The company recently launched a virtual assistant product that lets real people and virtual humans appear on camera at the same time.
"Our goal is an end-to-end virtual human workflow: after the user inputs text, it is automatically modeled into a 3D character, and actions and expressions are generated in real time to produce the content output." Cheng Weizhong said the company aims to provide enterprises with a full-link solution.
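The "end-to-end" flow Cheng Weizhong describes, text in, animated avatar out, can be pictured as a chain of stages. The skeleton below is purely illustrative: every function is a stand-in stub (the names, the sentence splitting, and the toy gesture rule are all assumptions), where a real system would plug in generative models at each stage.

```python
# Illustrative pipeline skeleton for a text-to-avatar flow.
# Each stage is a stub standing in for a generative model.

def parse_script(text):
    """Split input text into sentences the avatar will speak one by one."""
    return [s.strip() for s in text.split(".") if s.strip()]

def plan_actions(sentences):
    """Map each sentence to a (sentence, gesture) pair; 'wave'/'nod' are stand-ins."""
    return [(s, "wave" if "hello" in s.lower() else "nod") for s in sentences]

def render_frames(actions):
    """Stand-in renderer: one textual 'frame' description per action."""
    return [f"say {s!r} while doing {g}" for s, g in actions]

def text_to_avatar(text):
    """Full chain: script parsing -> action planning -> rendering."""
    return render_frames(plan_actions(parse_script(text)))
```

The value of structuring it this way is that each stage can be swapped out independently, e.g. replacing the gesture rule with a learned motion model without touching the renderer.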
One label Maijike Technology gives itself is "cross-modal." The company focuses on intelligent digital assets and intelligent generation. Its main products are intelligent content production, intelligent virtual live streaming, and personalized one-to-one intelligent interaction, plus the new product it recently released at the Zhongguancun Forum, the "Digital Intelligence Space Station." Its highlights are being three-dimensional, real-time, and intelligent.
Maijike Technology entered the AIGC field as early as 2016, said founder and chairman Fu Yingna.
"We do not manually annotate the data corpus; instead we build data with unstructured-data techniques. For example, we can hierarchically process an enterprise's different documents, build a cross-modal search engine for it, and make enterprise data generate content intelligently and interact easily."
Many companies do private deployment for enterprises, but it is hard for small and medium-sized enterprises to build large models. Fu Yingna believes their moat lies in combinatorial innovation on top of open-source large models. "In fact, algorithms and models can be combined in parallel or in series. Maijike's underlying technology is a hierarchical algorithm. This kind of combination needs little computing power and can be built cheaply and efficiently; it iterates on small-sample data and ultimately forms intelligence that can evolve."
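The structural idea behind an enterprise search engine built from a company's own documents can be sketched very simply: index the documents, then rank them against a query. The sketch below uses plain term-frequency/inverse-document-frequency scoring on whitespace tokens; Maijike's actual system is cross-modal and embedding-based, so the function names, the scoring formula, and the toy tokenization here are all illustrative assumptions, not their implementation.

```python
# Minimal sketch of retrieval over enterprise documents:
# build term counts per document, then score by TF-IDF overlap with a query.
import math
from collections import Counter

def build_index(docs):
    """docs: {doc_id: text}. Returns per-doc term counts, doc frequencies, doc count."""
    index = {doc_id: Counter(text.lower().split()) for doc_id, text in docs.items()}
    df = Counter()
    for counts in index.values():
        df.update(counts.keys())          # each doc contributes 1 per distinct term
    return index, df, len(docs)

def search(query, index, df, n_docs, top_k=3):
    """Rank documents by a simple TF-IDF dot product with the query terms."""
    q_terms = query.lower().split()
    scores = {}
    for doc_id, counts in index.items():
        score = 0.0
        for t in q_terms:
            if counts[t]:
                # smoothed IDF: rare terms count for more
                score += counts[t] * math.log((1 + n_docs) / (1 + df[t]))
        if score > 0:
            scores[doc_id] = score
    return sorted(scores, key=scores.get, reverse=True)[:top_k]
```

A real deployment would replace the token overlap with dense embeddings so that text, images, and other modalities land in one searchable vector space, which is what "cross-modal" implies.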
Virtual human live streams fall into three main types: entertainment streams, game streams, and e-commerce streams. Yunbo's Mei Song believes, "From a value perspective, virtual humans are better suited to the first two. The core of the e-commerce scenario is the goods themselves. If something is cheap, someone will buy it; if it is expensive, it is hard for anyone to sell. Virtual human live streaming cannot solve the goods problem."
"Virtual live streams can fix the inability to interact with fans instantly. Fans can tip virtual streamers and change their outfits at any time, and virtual live content is more interactive and richer," Mei Song said. "But high-quality live content definitely requires real people."
Open Bilibili and you can see Xiao K's "tap water" (word-of-mouth) users everywhere. Mei Song revealed that Xiao K Live Ji's retention rate in the 3D virtual streamer market has reached 90%, with more than 400,000 streamers served across the whole network and more than 5,000 streamers broadcasting daily. Among motion-capture live streaming tools, Xiao K is the only one that is completely free on the C-side.
"I don't expect to make money from Xiao K Live Ji itself. I hope to use the product to attract more streamers, people who bring their own fans and traffic. Later we can build interactive games on top of spatial scenes and monetize that content traffic, similar to the joint-operation model in games."
Mei Song believes virtual live interaction will pass through three stages. The 1.0 era is today's live streams with real people: gifts, special effects, bullet chats. In 2.0, fans can influence the virtual content of the live room or create characters through bullet-chat gifts, but they have no direct control. 3.0 is the metaverse space, where fans and streamers interact freely in the stream; fans hold virtual joysticks and have full independent control, for example holding concerts or PK battles. In the second half of this year, Yunbo will launch the metaverse-like "Xiao K Space Station" to implement this third generation of interactive live scenes.
Fu Yingna of Maijike Technology notes that platforms have different policies on virtual live streams. "Douyin won't encourage it, but Bilibili probably will; after all, their users are different. In the future a platform could have a separate virtual live streaming section, since there is an audience."
On the other hand, "In the long run, virtual humans that replace real people are effectively freeloading on the platform and harming its interests, and they will definitely be restricted." Cheng Weizhong of Zhongke Shenzhi said the earlier wave of digital human e-commerce streams took off because Douyin wanted to promote local life services.
"In the future, virtual human live streaming has to be a win-win with the platform. Virtual human technology suppliers should think clearly about what the platform, users, and merchants each need. Virtual streamers should have reasoning and analysis capabilities real people do not have; for example, instantly analyzing whether the mood of the bullet chat is positive. That is the value of artificial intelligence."
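The bullet-chat example Cheng Weizhong gives, judging the room's mood on the fly, can be sketched with a tiny sentiment lexicon. A production system would use a trained classifier or a large model; the word lists, function name, and threshold below are purely illustrative assumptions.

```python
# Hedged sketch: scoring live-chat ("barrage") mood with a toy sentiment
# lexicon. The word lists here are illustrative, not a real vocabulary.

POSITIVE = {"love", "great", "awesome", "cool", "nice", "haha"}
NEGATIVE = {"boring", "bad", "terrible", "awful", "scam"}

def chat_mood(messages, threshold=0.0):
    """Return ('positive'|'negative'|'neutral', score) over recent messages."""
    score = 0
    for msg in messages:
        words = set(msg.lower().split())
        score += len(words & POSITIVE) - len(words & NEGATIVE)
    if score > threshold:
        return "positive", score
    if score < -threshold:
        return "negative", score
    return "neutral", score
```

In a live setting this would run over a sliding window of recent messages, letting the virtual streamer adjust its tone or script when the mood turns.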
Speaking of large models, Cheng Weizhong said: "Training such AI requires a huge corpus, and those are all OpenAI's existing application scenarios. What matters more is whether, once the technology crosses a threshold, it can open up new application scenarios. In the same way, the large model we are building seeks a threshold breakthrough for virtual humans, and we expect progress in virtual human scene interaction within a year."
Beyond the ChatGPT wave, Cheng Weizhong has also been watching Apple's headset. "Whether Apple's AR glasses can go the distance depends not only on whether they integrate AR and VR technically, but more importantly on the business model: whether Apple can share revenue with developers and attract them to build the ecosystem together."
"As AR glasses develop, the way content is expressed will change, which brings bigger opportunities to companies doing 3D content generation and virtual humans. It is a good opening for entrepreneurs."
On Apple's headset, Mei Song believes the best early scenarios for VR and the metaverse are games. "Why haven't VR's penetration rate and user numbers grown? The core reason is that there is no good VR-native content."
Yunbo invested 30 million yuan to create the 3D assets in Xiao K Live Ji. "Part of it is the cost; part of it is self-developed technology such as the Xiao K video engine. Six years of work have formed our own moat."
"In the second half of the year we will start developing AIGC multimodal conversion products such as AI music and AI voice; Xiao K Live Ji will launch a single-camera full-body motion capture version; the products will also be integrated into the game animation pipeline to help users generate usable animation data; and e-commerce and game versions of Xiao K's AI drawing will launch as well."
These plans also require large model support, so Yunbo will train models with large amounts of data; the company already has an annotation team of several dozen people.
Fu Yingna also used the word "upgrade" to describe what happens once AR/VR is widespread. "In the first-generation internet, every company had its own website. In the second generation, every company had its own app. The third-generation internet will be a four-dimensional space-time experience, upgraded to intelligent generation of 3D content that folds time and space together for real-time intelligent interaction. 3D content can be output in lightweight H5 format and appear simultaneously on web pages, clients, and elsewhere for real-time interaction."
Maijike Technology's latest "Digital Intelligence Space Station" builds a "digital business card" for an enterprise in virtual space, presented three-dimensionally, visually, and intelligently, transcending the limits of physical time and space and providing users with one-to-one, personalized intelligent services. This content operation can be delivered via private deployment or SaaS. It is the key to a new generation of content productivity and immersive experience, and an important tool and platform for the digital economy era.
As AIGC technology keeps reshaping how the industry thinks, it is also revolutionizing virtual human technology. For practitioners, the AI craze triggered by ChatGPT, plus the new AR/VR hot spot, has produced double the excitement.
With so much new to see and so much to do, how to use AI to generate endless content products and open the door to a new world is a problem that technology providers and B-side customers will have to solve together.