Home  >  Article  >  Technology peripherals  >  Can the hot generative AI bring smart speakers back to life?

Can the hot generative AI bring smart speakers back to life?

王林
王林forward
2023-06-06 08:05:39984browse

Products such as smart speakers, which have been almost forgotten by a large number of consumers, have long been no longer the focus of most consumers after experiencing the "craziness" of 2017 and 2018. Just when everyone thought that smart speakers would be a flash in the pan, the emergence of ChatGPT seemed to give smart speakers a second chance, and also gave this declining industry a new opportunity. So, can smart speakers and the now popular generative AI create sparks?

Can the hot generative AI bring smart speakers back to life?

For the smart speaker industry, generative AI may be like the rain after a long drought. According to relevant market survey data, in the first quarter of 2023, due to the combined effects of factors such as severe product homogeneity and declining consumer demand, the online monitored retail sales of domestic smart speakers was 1.57 million units, another drop of 40.6%, while Throughout 2022, domestic omni-channel sales of smart speakers were 26.31 million units, a year-on-year decrease of 28%.

Why have smart speakers, which once had high hopes from major giants and were even regarded as possible entrances to smart homes, slipped into the abyss in recent years? There is actually only one reason, and that is that smart speakers are really not smart enough.

In 2017, when the concept of smart speakers was hot, there was a discussion in the industry about whether the focus of smart speakers should be "intelligence" or "sound quality". In the end, a series of products that focused on sound quality, such as Tencent Listening and Apple HomePod, used their tragic failures to prove that the selling point of smart speakers can only be intelligence.

Can the hot generative AI bring smart speakers back to life?

Unfortunately, however, the intelligence level of most smart speakers can only be described as "stretched". However, major manufacturers have limited attention to artificial intelligence and artificial intelligence such as ASR (speech recognition), NLP natural semantic processing, and far-field sound pickup. The progress of acoustic technology is indeed a bit too optimistic. In fact, the smart speaker is very simple from a technical perspective. Its working mode is to collect the user's voice, then send the audio to the server, then calculate and produce the results, and finally send the results to the smart speaker to turn into specific behaviors. For example, open an application or reply to a user's question.

Yes, the smart speakers themselves have nothing to do with artificial intelligence. The real bodies of Xiaoai, Xiaodu, and Tmall Elf are hidden on the corresponding servers. All this also leads to the fact that the key to determining the experience of smart speakers is far-field sound pickup technology, which is the ability to accurately capture the user’s voice commands in complex acoustic environments. After all, the user cannot say “tell a joke” and the smart speaker listens Let’s call it “playing a song”.

Can the hot generative AI bring smart speakers back to life?

The solution for smart speakers is to use a large-scale microphone array to collect sound, but there is one pain point that has not been solved, and that is voice wake-up (keyword spotting). When you use smart speakers, you need to use wake-up words such as "Hi, Siri", "Xiao Ai Classmate", and "Xiaodu Xiaodu" to let the smart speaker know that you are talking to it, which means that smart speakers The speaker lacks the ability to actively serve. More importantly, due to technical limitations, smart speakers have long been able to understand only simple instructions, such as "turn the volume up/down", "play so-and-so's song by so-and-so", and more complex instructions. Sentence recognition is often difficult.

The significance of generative AI such as ChatGPT and Wen Xinyiyan to smart speakers is that the former can help smart speakers understand more complex sentences and provide more natural communication. I believe friends who have used Microsoft Bing Chat, Baidu Wenxinyiyan or ChatGPT should know that when talking to this type of generative AI, there is no need to use an opening statement such as "Hi, ChatGPT", you can start by directly typing the content. dialogue process.

Can the hot generative AI bring smart speakers back to life?

Because generative AI is based on large-scale language model (LLM, Large Language Model), it adds manual annotation data and reinforcement learning technology from human feedback, and is supplemented by knowledge graph technology, which is a Writing knowledge into multi-relationship diagrams of structured triples (including entities, concepts and relationships) allows AI to understand the meaning of human instructions and ultimately select content from a huge information database to answer.

The biggest change in products like ChatGPT compared to Siri and Xiaoai is the ability to have multiple rounds of conversations. Compared with Siri, which is almost like a "fish memory", ChatGPT can always talk to users. Coupled with a clearer perception of emotions, users feel that they are really talking to a living person. For a consumer product, users obviously don’t care how advanced the technical principles behind it are, but rather whether it can solve problems or meet needs.

Can the hot generative AI bring smart speakers back to life?

The charm of generative AI lies in its high upper limit of capabilities. A typical example is Microsoft Copilot. At the same time, it can also meet the social needs of users to a certain extent. Now there are creators overseas using ChatGPT. , launched a "virtual companion" modeled after himself and gained more than 1,000 users. In general, combining generative AI with smart speakers can almost make up for the shortcomings of the latter, giving it a level of intelligence that can be used in the consumer market.

In fact, some smart speaker manufacturers have already taken action. For example, in February this year, when Baidu was warming up Wen Xin Yi Yan, Xiaodu had already announced that it would integrate Wen Xin Yi Yan to create the AI ​​model "Xiaodu Lingji" for smart device scenarios; in April, Tmall Genie accessed "Niaodu" The "AI mouth replacement" created by the "bird divides the bird" model also announced its access to Alibaba's Tongyi Qianwen.

Can the hot generative AI bring smart speakers back to life?

But it needs to be pointed out that generative AI is not a "panacea." For now, all generative AI faces an inevitable problem, which is the scarcity of computing resources. The recent news that the generative AI ceiling GPT-4 has become dumber has attracted a lot of attention. Compared with the state when it was first released, it has become a consensus among users that the quality of GPT-4's text code has declined in all aspects.

Yes, in fact, not only GPT-4, but also public-facing products such as ChatGPT and Wenxinyiyan have experienced similar situations. The increase in the number of users has led to a decline in the performance of large models.

The core problem facing the field of generative AI now is that computing resources are tight and unable to cope with the influx of users. In order to ensure user experience, such products can only reduce the performance of large models and reduce the time to generate content. To "reduce the load" on the server. In comparison, the existing market size of smart speakers is undoubtedly larger, so after accessing generative AI, it is almost inevitable to encounter similar problems.

Can the hot generative AI bring smart speakers back to life?

What is likely to happen in the future is that the intelligence level of smart speakers will show a parabola. The initial user experience will improve by leaps and bounds, but as the number of users continues to increase, the intelligent performance may "degrade" Back to the level of a few years ago.

The above is the detailed content of Can the hot generative AI bring smart speakers back to life?. For more information, please follow other related articles on the PHP Chinese website!

Statement:
This article is reproduced at:sohu.com. If there is any infringement, please contact admin@php.cn delete