
Let's talk about AI noise reduction technology in real-time communication


Apr 12, 2023 pm 01:07 PM

AI, deep learning


Part 01 Overview

In real-time audio and video communication, the microphone picks up a large amount of environmental noise along with the user's voice. Traditional noise reduction algorithms work reasonably well on stationary noise (such as fan noise, white noise, and the circuit noise floor) but perform poorly on non-stationary and transient noise (such as a noisy restaurant, a subway, or a home kitchen), which seriously degrades the user's call experience. To address hundreds of non-stationary noise types in complex scenarios such as homes and offices, the ecological empowerment team of the Department of Integrated Communications Systems independently developed an AI audio noise reduction technology based on the GRU model. Through algorithm and engineering optimization, the noise reduction model was compressed from 2.4 MB to 82 KB and running memory was reduced by about 65%; computational complexity was cut from about 186 Mflops to 42 Mflops, a 77% improvement in running efficiency. On the existing test data set (in the experimental environment), the model effectively separates human voice from noise and raises the call voice quality MOS (mean opinion score) to 4.25.

This article describes how our team performs real-time noise suppression based on deep learning and deploys it on mobile terminals and the Jiaqin app. It is organized as follows: first, the classification of noise and how to choose algorithms for each class; next, how to design the algorithm and train the AI model with deep learning; finally, the results and key application scenarios of the current AI noise reduction.

Part 02 Noise classification and noise reduction algorithm selection

In real-time audio and video applications, devices operate in complex acoustic environments. When the microphone captures the voice signal, it also captures a large amount of noise, which poses a major challenge to real-time audio and video quality. There are many types of noise. By its statistical properties, noise can be divided into two categories:

Stationary noise: the statistical characteristics of the noise do not change over a relatively long period of time. Examples include white noise, electric fans, air conditioners, and car interior noise.


Non-stationary noise: the statistical characteristics of the noise change over time. Examples include noisy restaurants, subway stations, offices, and home kitchens.
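The distinction between the two categories can be illustrated with a small sketch that measures how much the per-frame energy of a signal varies. This is only an illustration, not the team's method: the signals are synthetic, and the frame size of 480 samples (10 ms at 48 kHz) and the function names are assumptions made for the example.

```python
import random
import statistics

def frame_energies(samples, frame_size):
    """Mean power of each non-overlapping frame."""
    return [
        sum(s * s for s in samples[i:i + frame_size]) / frame_size
        for i in range(0, len(samples) - frame_size + 1, frame_size)
    ]

def energy_variation(samples, frame_size=480):
    """Coefficient of variation (std / mean) of the frame energies."""
    energies = frame_energies(samples, frame_size)
    return statistics.pstdev(energies) / statistics.mean(energies)

rng = random.Random(0)
frame_size, num_frames = 480, 50  # assumed: 10 ms frames at 48 kHz

# Stationary: white noise whose variance never changes.
stationary = [rng.gauss(0.0, 0.3) for _ in range(frame_size * num_frames)]

# Non-stationary: a quiet background with a loud burst in every fifth frame.
non_stationary = []
for f in range(num_frames):
    sigma = 1.0 if f % 5 == 0 else 0.05
    non_stationary.extend(rng.gauss(0.0, sigma) for _ in range(frame_size))

cv_stationary = energy_variation(stationary)
cv_non_stationary = energy_variation(non_stationary)
print(f"stationary: {cv_stationary:.3f}, non-stationary: {cv_non_stationary:.3f}")
```

A low variation means the energy statistics stay roughly constant from frame to frame, which is what makes stationary noise predictable; the bursty signal shows a much higher value.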


In real-time audio and video applications, calls are susceptible to many kinds of noise interference that degrade the experience, so real-time audio noise reduction has become an important feature. Stationary noise, such as the hum of an air conditioner or the noise floor of recording equipment, does not change significantly over time, so it can be estimated, predicted, and removed by simple subtraction. Common methods include spectral subtraction, Wiener filtering, and the wavelet transform. Non-stationary noise, such as cars whizzing by on the road, plates clattering in a restaurant, or pots and pans banging in a home kitchen, appears randomly and unexpectedly and cannot be estimated or predicted in advance. Traditional algorithms struggle to estimate and eliminate non-stationary noise, which is why we use deep learning algorithms.
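As an illustration of the "estimate, then subtract" idea, here is a minimal sketch of magnitude spectral subtraction on a single frame. It is written in plain Python with a naive DFT so that it runs without dependencies; production code would use an FFT library, windowing, and overlap-add. The function names, the 64-sample frame, and the synthetic test signal are all assumptions made for the example.

```python
import cmath
import math
import random

def dft(x):
    """Naive discrete Fourier transform (O(n^2), fine for a short demo frame)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(spec):
    """Inverse DFT, returning the real part of each sample."""
    n = len(spec)
    return [sum(spec[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)).real / n
            for t in range(n)]

def estimate_noise_magnitude(noise_frames):
    """Average magnitude spectrum over frames assumed to contain only noise."""
    spectra = [dft(frame) for frame in noise_frames]
    return [sum(abs(s[k]) for s in spectra) / len(spectra)
            for k in range(len(spectra[0]))]

def spectral_subtract(frame, noise_mag):
    """Subtract the noise magnitude per bin, floor at zero, keep the noisy phase."""
    cleaned = []
    for bin_value, noise in zip(dft(frame), noise_mag):
        mag = abs(bin_value)
        scale = max(mag - noise, 0.0) / mag if mag > 0.0 else 0.0
        cleaned.append(bin_value * scale)
    return idft(cleaned)

def mse(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

rng = random.Random(1)
n = 64
clean = [math.sin(2 * math.pi * 4 * t / n) for t in range(n)]
noise_frames = [[rng.gauss(0.0, 0.3) for _ in range(n)] for _ in range(20)]
noisy = [c + rng.gauss(0.0, 0.3) for c in clean]

noise_mag = estimate_noise_magnitude(noise_frames)
denoised = spectral_subtract(noisy, noise_mag)
err_before = mse(noisy, clean)
err_after = mse(denoised, clean)
print(f"MSE before: {err_before:.4f}, after: {err_after:.4f}")
```

The approach depends on the noise spectrum measured earlier still being valid for the current frame. That holds for stationary noise and fails for sudden, unpredictable sounds, which is the limitation described above.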

Part 03 Deep Learning Noise Reduction Algorithm Design


To improve the audio SDK's noise reduction across a wide range of noise scenes and to make up for the shortcomings of traditional noise reduction algorithms, we developed an RNN-based AI noise reduction module that combines traditional noise reduction techniques with deep learning. Because the module targets home and office usage, a large number of indoor noise types were added to the noise data set, such as keyboard typing in the office, the friction of desks and office supplies being dragged, chairs being dragged, kitchen noise at home, and banging on the floor.

At the same time, so that speech can be processed in real time on mobile terminals, the AI audio noise reduction algorithm keeps computational overhead and library size to a very low order of magnitude. In terms of computational overhead, taking 48 kHz as an example, the RNN processing of each speech frame requires only about 17.5 Mflops, the FFT and IFFT require about 7.5 Mflops, and feature extraction requires about 12 Mflops, for a stated total of about 42 Mflops. This computational complexity is roughly equivalent to that of the Opus codec at 48 kHz. On a mid-range mobile phone model from one brand, measurements show that the RNN noise reduction module uses about 4% of the CPU. In terms of library size, compiling the audio engine with RNN noise reduction enabled increases the size of the audio engine library by only about 108 kB.

Part 04 Network model and processing process

The module uses an RNN model because, compared with other learning models (such as CNNs), an RNN carries temporal information and can model time-series signals rather than treating input and output audio frames in isolation. The model uses gated recurrent units (GRU, as shown in Figure 1). Experiments show that GRU performs slightly better than LSTM on speech noise reduction tasks, and because GRU has fewer weight parameters, it also saves computing resources. Compared with a simple recurrent unit, a GRU has two extra gates. The reset gate controls whether the current state is used in computing the new state, while the update gate controls how much the state changes in response to the new input. The update gate allows the GRU to remember temporal information for a long time, which is why GRU performs better than simple recurrent units.
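The role of the two gates can be made concrete with a minimal sketch of a single GRU time step in plain Python. It follows one common formulation (the sign convention for the update gate differs between references), and the weight names, the parameter dictionary, and the `zero_params` helper are hypothetical; it is not the trained model described in this article.

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def zero_params(n_in, n_hidden):
    """Hypothetical helper: all-zero weights with the shapes a GRU step needs."""
    def zeros(rows, cols):
        return [[0.0] * cols for _ in range(rows)]
    params = {}
    for gate in ("z", "r", "h"):
        params["W" + gate] = zeros(n_hidden, n_in)      # input weights
        params["U" + gate] = zeros(n_hidden, n_hidden)  # recurrent weights
        params["b" + gate] = [0.0] * n_hidden           # biases
    return params

def gru_step(x, h, p):
    """One GRU step: returns the new state given input x and previous state h."""
    # Update gate: how much the state is allowed to change.
    z = [sigmoid(a + b + c)
         for a, b, c in zip(matvec(p["Wz"], x), matvec(p["Uz"], h), p["bz"])]
    # Reset gate: how much of the previous state feeds the candidate state.
    r = [sigmoid(a + b + c)
         for a, b, c in zip(matvec(p["Wr"], x), matvec(p["Ur"], h), p["br"])]
    gated_h = [ri * hi for ri, hi in zip(r, h)]
    candidate = [math.tanh(a + b + c)
                 for a, b, c in zip(matvec(p["Wh"], x), matvec(p["Uh"], gated_h), p["bh"])]
    # Blend the old state and the candidate according to the update gate.
    return [(1.0 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, candidate)]

params = zero_params(n_in=2, n_hidden=3)
params["bz"] = [-50.0] * 3  # force the update gate shut
state = [0.5, -0.2, 0.1]
for frame_features in ([1.0, 2.0], [0.3, -0.7], [0.0, 4.0]):
    state = gru_step(frame_features, state, params)
print(state)
```

With the update gate forced shut, the state passes through every step unchanged regardless of the input, which shows how a GRU can hold information across many frames.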


Figure 1. A simple recurrent unit (left) and a GRU (right)

The structure of the model is shown in Figure 2. The trained model is embedded in the audio and video communication SDK. The SDK reads the audio stream from the hardware device, splits it into frames, and sends them to the AI noise reduction preprocessing module. The preprocessing module computes the corresponding features and passes them to the trained model. The model computes the corresponding gain values, and these gains are applied to the signal to achieve noise reduction (as shown in Figure 3).
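The data flow described above (framing, feature computation, model, gain, signal adjustment) can be sketched as follows. This is a simplified illustration under several assumptions: the features are plain band energies, the gains are applied per frequency band, the model is an arbitrary callable standing in for the trained GRU network, and windowing and overlap-add are omitted. None of the names come from the actual SDK.

```python
import cmath
import math

def dft(x):
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(spec):
    n = len(spec)
    return [sum(spec[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)).real / n
            for t in range(n)]

def split_frames(samples, frame_size):
    """Cut the stream into non-overlapping frames (any incomplete tail is dropped)."""
    return [samples[i:i + frame_size]
            for i in range(0, len(samples) - frame_size + 1, frame_size)]

def band_of(k, n, num_bands):
    """Map DFT bin k to a band index; mirrored bins share a band so output stays real."""
    folded = min(k, n - k)
    return min(folded * num_bands // (n // 2 + 1), num_bands - 1)

def band_energies(spec, num_bands):
    """Assumed feature set: total energy per frequency band."""
    n = len(spec)
    energies = [0.0] * num_bands
    for k in range(n // 2 + 1):
        energies[band_of(k, n, num_bands)] += abs(spec[k]) ** 2
    return energies

def denoise_frame(frame, gain_model, num_bands=4):
    """Features -> model -> per-band gains -> scaled spectrum -> time-domain frame."""
    spec = dft(frame)
    gains = gain_model(band_energies(spec, num_bands))
    n = len(frame)
    shaped = [spec[k] * gains[band_of(k, n, num_bands)] for k in range(n)]
    return idft(shaped)

def toy_gain_model(features):
    """Placeholder for the trained network: attenuate the weaker bands."""
    loudest = max(features)
    return [1.0 if e >= 0.1 * loudest else 0.1 for e in features]

stream = [math.sin(2 * math.pi * 2 * t / 16) for t in range(48)]
output = []
for frame in split_frames(stream, 16):
    output.extend(denoise_frame(frame, toy_gain_model))
print(len(output))
```

Because the model only outputs a small set of gains rather than a full waveform, the network can stay small, which is consistent with the low computational budget described in Part 03.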


Figure 2. GRU-based RNN network model


Figure 3. Model training process (top) and real-time noise reduction process (bottom)

Part 05 AI noise reduction processing effect and implementation

Figure 4 compares the speech spectrograms before and after noise reduction for speech with keyboard tapping noise. The upper part is the noisy speech signal before noise reduction, with the keyboard tapping noise marked by red rectangles. The lower part is the speech signal after noise reduction. Most of the keyboard tapping is suppressed, while damage to the speech is kept at a low level.


Figure 4. Noisy speech (with keyboard tapping) before and after noise reduction

The current AI noise reduction model has been launched on mobile phones and Jiaqin, improving call noise reduction in the mobile and Jiaqin apps. It suppresses noise effectively in more than 100 noise scenarios in homes, offices, and similar environments while keeping speech distortion low. In the next stage, we will continue to reduce the computational complexity of the AI noise reduction model so that it can be deployed on low-power IoT devices.


This article is reproduced from 51CTO.COM.
