China makes important progress on the AVS3 real-time speech coding standard, and Tencent's solution is selected


王林

Dec 15, 2023 am 10:57 AM


According to the New Generation Artificial Intelligence Alliance, the AVS3P10 real-time speech coding standard has recently made important progress. The news was reported on December 14.

On December 14, 2023, the 87th AVS Working Group meeting opened in Chengdu. At the meeting, WD 1.0 of "Intelligent Media Coding Part 10: Real-time Speech" (hereinafter AVS3P10) was reviewed by the plenary, and the technical solution submitted by Tencent was selected as the RM0 baseline for AVS3P10 real-time speech coding.


Real-time voice communication (RTC, Real-time Communication) is widely used in collaborative office work, interactive entertainment, social networking, and other fields. These diverse application scenarios pose a variety of technical challenges, among which speech coding that combines high quality, low latency, low bandwidth, and strong robustness is a key component.

At bit rates of 16-20 kbps, traditional speech codecs such as those defined in the AVS and ITU-T standards can deliver high-quality wideband speech; at 30-35 kbps, they can deliver high-quality super-wideband and even fullband speech. When the bit rate drops further (for example, below 10 kbps), however, the reconstruction quality of traditional speech codecs degrades significantly, which hurts the user experience.
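To make these bit rates concrete, the short sketch below converts a bit rate into the number of bits available per coded frame. The 20 ms frame length is an assumption chosen only for illustration; the article does not state the frame size used by AVS3P10 or by the codecs it is compared against.

```python
def bits_per_frame(bitrate_bps: int, frame_ms: int = 20) -> float:
    """Bits available for one frame at a given bit rate.

    frame_ms defaults to 20 ms, an assumed (hypothetical) frame length.
    """
    return bitrate_bps * frame_ms / 1000


# Bit rates mentioned in the text, expressed in bits per second.
for bitrate_bps in (16_000, 10_000):
    print(bitrate_bps, "bps ->", bits_per_frame(bitrate_bps), "bits per frame")
```

Under this assumption, a codec running below 10 kbps has fewer than 200 bits to describe each 20 ms of speech, which gives a sense of why quality becomes hard to maintain at low rates.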

To meet these demands, at the 84th AVS meeting in March this year, Tencent proposed that the AVS audio group launch a low-bit-rate, high-quality speech coding project for real-time voice communication scenarios. After a requirements analysis, AVS formally launched the AVS3P10 real-time speech coding project at the 85th AVS meeting and issued a call for technology through the AVS audio group. The project is led and maintained by Xiao Wei of Tencent Conference Tianlai Lab.

At the 86th AVS meeting, the audio group reviewed proposal M7886, "AVS3P10 Speech Coding Reference Model Candidate Technical Solution", submitted by Tencent Conference Tianlai Lab.

The review found that the solution has the following four features:

It deeply integrates classic signal processing with AI techniques such as deep neural networks, making it an AI codec;
It supports low-bit-rate, high-quality coding, real-time encoding and decoding, and multi-rate coding;
It is built on a sub-band, multi-mode coding architecture: features of the low-frequency signal are extracted with a deep neural network, features of the high-frequency signal are extracted with a bandwidth-extension scheme, and scalar quantization combined with entropy coding compresses the features;
It has an open neural network coding architecture, so the encoding neural network can be modified and optimized while the bitstream remains forward compatible.
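The "scalar quantization plus entropy coding" step named in the third feature can be illustrated with a minimal sketch. This is not the AVS3P10 algorithm: it uses a plain uniform quantizer and reports the empirical entropy of the resulting indices as a stand-in for what an ideal entropy coder could achieve. The feature values and the step size are made up for the example.

```python
import math
from collections import Counter


def quantize(features, step):
    """Uniform scalar quantization: map each value to an integer index."""
    return [round(x / step) for x in features]


def dequantize(indices, step):
    """Reconstruct approximate feature values from the indices."""
    return [i * step for i in indices]


def entropy_bits(indices):
    """Empirical entropy in bits per symbol, i.e. the average code length
    an ideal entropy coder would need for these symbol statistics."""
    counts = Counter(indices)
    n = len(indices)
    return -sum(c / n * math.log2(c / n) for c in counts.values())


# Hypothetical feature vector and step size, for illustration only.
features = [0.12, -0.48, 0.51, 0.07, -0.02, 0.49, 0.10, -0.51]
step = 0.25

indices = quantize(features, step)
reconstructed = dequantize(indices, step)

print("indices:", indices)
print("bits per symbol:", entropy_bits(indices))
print("max error:", max(abs(a - b) for a, b in zip(features, reconstructed)))
```

A coarser step yields fewer distinct indices and a lower entropy, at the cost of a larger reconstruction error (bounded by half the step size); this is the basic rate/quality trade-off that any quantize-then-entropy-code scheme has to balance.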

On November 1 this year, Tencent Conference Tianlai Lab submitted the executable file of the AVS3P10 RM0 candidate solution.

The China Electronics Technology Standardization Institute and Huawei carried out the subjective testing and cross-validation, respectively. The testing aimed to be comprehensive and was based on the ITU-T P.800 DCR subjective quality evaluation method. It covered clean speech, speech with packet loss, mixed speech, and other scenarios at different bandwidths, and for the first time introduced 3A-processed test scenarios into source codec testing, in order to evaluate the new generation of AI codec technology under close-to-real conditions.

In these test scenarios, AVS3P10 RM0 showed a clear quality advantage. The subjective test results show that AVS3P10 RM0 achieved MOS scores above 4.0 in several major test scenarios, including wideband and super-wideband, at bit rates as low as 5.9 kbps. AVS3P10 RM0 uses deep neural network technology and has built-in resilience to packet loss, which effectively improves codec quality under poor network conditions.

AVS3P10 RM0 also showed significant advantages in the ITU-T P.863 objective quality evaluation. At all eight tested bit rates, its MOS value exceeded 4.0, reaching a maximum of 4.45. Its quality is comparable to that of traditional signal-processing codecs such as Opus and EVS at medium and high bit rates, reaching carrier-grade quality. Among AI codecs, AVS3P10 RM0 has a quality advantage of more than 0.6 MOS at similar bit rates. According to these test results, AVS3P10 RM0 represents the highest level among current AI codecs.

The New Generation Artificial Intelligence Alliance stated that AVS3P10 real-time speech coding, as a new-generation speech codec standard, is an important addition to the AVS family of standards.

Going forward, the AVS3P10 real-time speech coding project will proceed according to the established plan, and standardization is expected to be completed in mid-2024.


Statement

This article is reproduced from IT之家. If there is any infringement, please contact admin@php.cn to have it removed.
