Machine learning research in acoustics could unlock a multimodal metaverse
Researchers at MIT and the MIT-IBM Watson AI Lab have created a machine learning model that predicts what a listener will hear at different locations within a 3D space.
The researchers first used this model to work out how any sound in a room propagates through the space, which let them build a 3D picture of the room in much the same way that people use sound to understand their environment.
In a paper co-authored by Yilun Du, a graduate student in MIT's Department of Electrical Engineering and Computer Science (EECS), the researchers show how techniques similar to visual 3D modeling can be applied to acoustics.
But they had to contend with the ways sound propagation differs from light. Because of obstacles, the shape of the room, and the characteristics of the sound itself, listeners at different positions can form very different impressions of the same sound, which makes the results hard to predict.
To solve this problem, the researchers built acoustic properties into their model. First, all other things being equal, swapping the positions of the sound source and the listener does not change what the listener hears (a property known as acoustic reciprocity). Sound is also strongly shaped by local conditions, such as obstacles between the listener and the source.
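Reciprocity is straightforward to bake into a neural network. The sketch below is a minimal illustration of one possible way to do it, not the authors' implementation: evaluate a small PyTorch model on both orderings of the source and listener positions and average the results, so the prediction is symmetric by construction. The `ReciprocalAcousticNet` class and all of its dimensions are hypothetical.

```python
# Minimal sketch (not the authors' code): enforce acoustic reciprocity by
# averaging the network output over both orderings of (source, listener),
# so f(src, lis) == f(lis, src) by construction.
import torch
import torch.nn as nn

class ReciprocalAcousticNet(nn.Module):
    def __init__(self, hidden=128, ir_len=256):
        super().__init__()
        # Hypothetical MLP mapping a (source, listener) position pair
        # to a short impulse-response vector.
        self.mlp = nn.Sequential(
            nn.Linear(6, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, ir_len),
        )

    def forward(self, src_xyz, lis_xyz):
        a = self.mlp(torch.cat([src_xyz, lis_xyz], dim=-1))
        b = self.mlp(torch.cat([lis_xyz, src_xyz], dim=-1))
        return 0.5 * (a + b)  # symmetric in (src, lis)

net = ReciprocalAcousticNet()
src = torch.tensor([[1.0, 0.5, 1.2]])
lis = torch.tensor([[3.0, 2.0, 1.2]])
# Swapping the two positions yields the same prediction.
assert torch.allclose(net(src, lis), net(lis, src))
```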
Du said: "So far, most researchers have only focused on visual modeling. But as humans, we have multiple modes of perception. Not only vision is important, but sound is also important. I think this The work opens up an exciting research direction to better use sound to simulate the world."
Using this method, the resulting neural acoustic field (NAF) model randomly samples points on a grid to learn the features of specific locations. For example, being close to a door can greatly affect what the listener hears from the other side of the room.
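To give a sense of how grid-based local features might look in code, here is a hedged sketch under our own assumptions rather than the paper's exact architecture: a learnable 2D grid of feature vectors covers the room's floor plan, and the features interpolated at a query position are what allow local conditions, such as a nearby doorway, to shape the prediction. `LocalFeatureGrid` and its parameters are illustrative only.

```python
# Sketch (assumptions, not the paper's architecture): a learnable grid of
# local features covering the room, bilinearly interpolated at query positions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class LocalFeatureGrid(nn.Module):
    def __init__(self, feat_dim=16, grid_res=32, room_size=(5.0, 4.0)):
        super().__init__()
        # One feature vector per grid cell, laid out over the room floor plan.
        self.grid = nn.Parameter(torch.randn(1, feat_dim, grid_res, grid_res) * 0.01)
        self.room_size = room_size

    def forward(self, xy):
        # xy: (N, 2) positions in metres; normalise to [-1, 1] for grid_sample.
        w, h = self.room_size
        norm = torch.stack([xy[:, 0] / w, xy[:, 1] / h], dim=-1) * 2.0 - 1.0
        grid = norm.view(1, -1, 1, 2)                      # (1, N, 1, 2)
        feats = F.grid_sample(self.grid, grid, align_corners=True)
        return feats.squeeze(-1).squeeze(0).t()            # (N, feat_dim)

grid = LocalFeatureGrid()
listener_xy = torch.tensor([[0.4, 3.2], [4.6, 0.9]])  # e.g. near a doorway vs. a far corner
local_feats = grid(listener_xy)                        # per-position local features
print(local_feats.shape)                               # torch.Size([2, 16])
```

These interpolated features would then be fed, together with the source and listener positions, into the network that predicts the impulse response.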
The model is able to predict what a listener is likely to hear from a specific acoustic stimulus based on the listener's relative position in the room.
The paper states: "By modeling acoustic propagation in a scene as a linear time-invariant system, NAF learns to continuously map the positions of the emitter and listener to neural impulse response functions, which can then be applied to any sound. We demonstrate that the continuity of NAF allows us to render spatial sound to listeners at any location and predict the propagation of sound in new locations."
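The linear time-invariant framing is what makes a learned impulse response reusable for "any sound". The snippet below illustrates the idea under stated assumptions: the impulse response here is a made-up decaying-echo placeholder rather than a NAF prediction, and rendering a dry signal at the listener's position then reduces to a single convolution.

```python
# Illustration of the LTI idea: once an impulse response for a source/listener
# pair is known (here a synthetic placeholder, not a NAF output), any dry
# sound can be rendered at that position by convolution.
import numpy as np
from scipy.signal import fftconvolve

sample_rate = 16_000
t = np.arange(0, 0.5, 1.0 / sample_rate)

# Placeholder impulse response: direct sound plus a few decaying reflections.
impulse_response = np.zeros_like(t)
for delay_s, gain in [(0.0, 1.0), (0.02, 0.5), (0.05, 0.25), (0.11, 0.1)]:
    impulse_response[int(delay_s * sample_rate)] = gain

# Any "dry" source signal -- here a short 440 Hz tone burst.
dry = np.sin(2 * np.pi * 440 * t[: sample_rate // 10])

# Because the system is linear and time-invariant, the sound heard at the
# listener's position is the convolution of the dry signal with the response.
wet = fftconvolve(dry, impulse_response)
print(dry.shape, wet.shape)
```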
Chuang Gan, a principal research staff member at the MIT-IBM Watson AI Lab who also worked on the project, said: "This new technology may bring new opportunities for creating multi-modal immersive experiences in metaverse applications."
We know that not all Reg Readers will be excited about this use case.