
Smooth, lag-free calls in elevators and basements: the Tencent-led next-generation real-time speech coding industry standard AVS3P10 will be released soon

王林 | Original | 2024-06-27 17:45:06 | 1396 views

According to a June 27 report from this site, Tencent announced today that AVS3P10, the new-generation real-time speech coding industry standard led by the company, has been finalized and will be officially released soon. The AVS audio group adopted Tencent's proposal for the AVS3P10 standard. The proposal is based on the Penguins AI voice engine, the first neural network speech codec developed in-house by Tencent Conference, and is intended to improve call quality in weak network environments.


Tencent said that this is the world's first speech coding standard to introduce artificial intelligence and achieve high quality at low bit rates, and that its performance has reached a world-class level. With only 1/3 of the encoding bit rate, it can achieve the same clear sound quality as existing mainstream standards. "Even if the network slows to 2G levels, you can still have a smooth meeting."

This standard was initiated, promoted and maintained by Tencent, with joint contributions from multiple members of the AVS audio group. "In the future, the bandwidth requirements for real-time audio scenarios such as online meetings and voice calls will be greatly reduced. Even in environments with poor networks such as elevators, basements, and tunnels, clear and smooth voice calls can be achieved."


According to reports, under limited bandwidth conditions, speech coding technology that compresses the original data and removes redundant information is the key to delivering high-quality sound to the receiver. However, with existing mainstream audio codec standards such as EVS and OPUS, voice quality drops significantly once the bit rate falls below 10kbps, which affects the user experience.

To deal with this challenge, Tencent Conference Tianlai Lab and Tencent AI Lab independently developed Tencent’s first neural network speech codec - Penguins.

Specifically, according to Tencent, Penguins integrates AI with traditional technologies and, by introducing big data, breaks through the performance limits that Shannon's theory sets for traditional coding, providing a new performance upper bound at a controllable increase in computing power. This offers a new technical foundation and methodology for the next generation of communication systems, especially the source encoder part. Penguins uses AI speech signal modeling to extract and encode the core feature parameters, then uses a deep learning network to predict and reconstruct the fine structure of the speech, and finally generates a realistic audio waveform.

Multiple tests show that the AVS3P10 standard submitted by Tencent achieves high-quality voice communication at 6kbps. It can deliver clear calls even on "2G" networks, and the subjective quality is very close to the original reference signal, comparable to the quality of the internationally mainstream OPUS standard at 20kbps. At the same time, at equal subjective quality, the encoding efficiency is 200-300% higher than that of traditional encoding at medium and high bit rates.
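The figures above can be checked with some simple arithmetic. The following is a minimal sketch, not part of the standard: the function names are hypothetical and the 20 ms frame length is an assumption chosen only for illustration. It computes the payload size per frame at each bit rate and the ratio between the two rates quoted in the article.

```python
# Illustrative arithmetic only. The 20 ms frame duration is an assumption,
# not a value taken from the AVS3P10 standard.

def payload_bytes_per_frame(bitrate_bps: float, frame_ms: float = 20.0) -> float:
    """Bytes of encoded speech produced per frame at a given bit rate."""
    return bitrate_bps * frame_ms / 8000.0  # bits/s * ms / (1000 ms/s * 8 bits/byte)


def bitrate_ratio(reference_bps: float, codec_bps: float) -> float:
    """How many times larger the reference bit rate is than the codec bit rate."""
    return reference_bps / codec_bps


if __name__ == "__main__":
    avs3p10_bps = 6_000   # 6kbps, as quoted in the article
    opus_bps = 20_000     # 20kbps OPUS comparison point, as quoted in the article

    print(payload_bytes_per_frame(avs3p10_bps))  # bytes per assumed 20 ms frame
    print(payload_bytes_per_frame(opus_bps))
    ratio = bitrate_ratio(opus_bps, avs3p10_bps)
    print(f"ratio: {ratio:.2f}x, increase: {(ratio - 1) * 100:.0f}%")
```

Under these assumptions, 6kbps is about 0.3 of 20kbps, in line with the "1/3 of the encoding bit rate" claim, and a ratio of about 3.33x corresponds to an increase of roughly 233%, which falls inside the quoted 200-300% range.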

Since 2021, the Penguins audio encoder has been deployed at scale in Tencent Conference's driving mode and weak network mode, QQ voice calls and other scenarios.

In March 2023, the Tencent team proposed the AVS3P10 real-time speech coding standard in the AVS audio group and took part in drafting it. Subsequently, Tencent submitted a candidate technology based on Penguins, which was adopted after cross-validation by the AVS audio group. In June 2024, the AVS3P10 real-time speech coding standard officially completed standardization and entered the public announcement stage.

Note from this site: The AVS working group was established in China in June 2002. After more than ten years of work by teams totaling thousands of people, AVS came into being as a standard for which China holds independent intellectual property rights. AVS3 is the world's first video coding standard launched for 8K and 5G industrial applications.

AVS has started the standard formulation of AVS4 and calls on AVS member units to continue to support the development of AVS’s next-generation standards. Various manufacturers will join forces to jointly implement technical standards and promote global deployment.
