Hearing Only What You Want to Hear

July 5, 2024

Technology

'SoundBeam' is an AI technology from NTT designed to isolate the specific sounds that a user wishes to listen to. This deep learning solution enables what can be described as "computational selective hearing." Just as humans can concentrate on a particular sound within a noisy environment, SoundBeam aims to replicate that ability electronically.

NTT's work extends earlier research that primarily focused on recognizing and highlighting specific speech patterns based on the distinct characteristics of a speaker's voice. However, SoundBeam takes it a step further and gives users the power to select any sound they want to hear, based on their preferences or surroundings. For instance, it can be set up to recognize the blaring of a car horn when someone is crossing the street, while the same sound can be muted when the person is at home, ensuring both comfort and safety.

Every day, we find ourselves amidst a cacophony of overlapping sounds. Depending on our circumstances and intent, the same sound can be perceived either as vital information or as intrusive noise. This human capability to home in on specific auditory stimuli while filtering out the rest is termed selective hearing.

The primary aim of the SoundBeam project is to create a technological solution that can extract sounds from a particular category, as defined by the user, even when these sounds are embedded within a complex auditory scene. The end goal is a system capable of computational selective hearing that can identify and spotlight any sound, regardless of its nature. The technology could also let users choose which sounds to emphasize or mute, based on their personal preferences or the context they are in.

Building upon the foundation set by NTT's SpeakerBeam, which is designed for selectively processing speech, SoundBeam is capable of handling a broader spectrum of sounds. At the heart of SoundBeam is a neural network, a type of advanced computational model. This neural network is trained on a diverse set of auditory data: simulated mixtures of various sounds, the labels of the desired sound categories, and the sound signals of interest that the network should extract. The model can be directed to a different sound category simply by changing the specified sound class, which showcases its versatility.
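
The three training ingredients described above (a simulated mixture, a label for the desired category, and the signal of interest) can be pictured with a small sketch. This is only an illustration under stated assumptions, not NTT's implementation: the category names, the `one_hot` and `make_training_example` helpers, and the use of random noise in place of real recordings are all hypothetical, and no neural network is trained here.

```python
import random

# Hypothetical sound categories, chosen only for illustration.
SOUND_CLASSES = ["car_horn", "dog_bark", "speech"]


def one_hot(class_name):
    """Encode a sound category as a one-hot label vector."""
    return [1.0 if name == class_name else 0.0 for name in SOUND_CLASSES]


def make_training_example(sources, target_class):
    """Build one (mixture, label, target) triple.

    sources maps each class name to a waveform (a list of float samples,
    all of the same length). The mixture is the sample-wise sum of every
    source; the target is the unmixed waveform of the chosen class.
    """
    mixture = [sum(samples) for samples in zip(*sources.values())]
    return mixture, one_hot(target_class), sources[target_class]


# Random noise stands in for real recordings in this sketch (an assumption).
rng = random.Random(0)
sources = {
    name: [rng.gauss(0.0, 1.0) for _ in range(1600)] for name in SOUND_CLASSES
}

# The same mixture paired with two different class labels yields two
# different extraction targets.
horn_example = make_training_example(sources, "car_horn")
bark_example = make_training_example(sources, "dog_bark")
```

In this sketch, `horn_example` and `bark_example` share an identical mixture and differ only in the label and the target waveform, which mirrors the idea that one model can be pointed at a different sound category by changing the specified class.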

There are multiple practical uses for SoundBeam. In professional audio editing, SoundBeam can be an invaluable tool for post-production, allowing editors to emphasize or reduce specific sounds in a mix. The technology can also be integrated into listening devices, such as headphones or hearing aids, letting users control the sounds they hear in their environment and customize their auditory experience.

NTT—Innovating the Future of Sound

If you have any questions on the content of this article, please contact:
NTT Service Innovation Laboratory Group
Public Relations
nttrd-pr@ml.ntt.com

Daniel O'Connor

Daniel O'Connor joined the NTT Group in 1999 when he began work as the Public Relations Manager of NTT Europe. While in London, he liaised with the local press, created the company's intranet site, wrote technical copy for industry magazines and managed exhibition stands from initial design to finished displays.

Later seconded to the headquarters of NTT Communications in Tokyo, he contributed to the company winning its first global telecoms awards and to the digitalisation of internal company information exchange.

Since 2015, Daniel has created content for the Group's Global Leadership Institute and the One NTT Network, and he is currently working with NTT R&D teams to grow public understanding of the cutting-edge research undertaken by the NTT Group.
