Facebook Voice-Separation Has Disrupted The Voice Industry

Facebook voice-separation marks a milestone in the history of voice technology. In an industry where everyone is racing against everyone else, social media giant Facebook has taken a huge leap in voice tech: an AI that can distinguish between up to five different voices at once. Yes, you and a bunch of your friends can all speak into one microphone, and Facebook will know who’s who.

The researchers described the Facebook voice-separation AI model in a paper titled “Voice Separation with an Unknown Number of Multiple Speakers.” Facebook is expected to present the work at the International Conference on Machine Learning this year.

How does the Facebook voice-separation AI model work?

The scientists trained the AI to differentiate between voices. The AI’s recurrent neural networks act as a form of memory, analyzing the audio over time to determine how many voices are present at once.

The Facebook voice-separation AI model uses an encoder network that maps the raw audio waveform to a latent representation. A voice separation network then splits this latent representation into one output channel per speaker.
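The paper’s actual architecture is more elaborate, but a minimal sketch of this encoder/separator/decoder pipeline could look like the following PyTorch module. This is an illustration under assumed design choices (a 1-D convolutional encoder, a bidirectional LSTM separator, sigmoid masks); the class and parameter names are hypothetical, not Facebook’s code:

```python
import torch
import torch.nn as nn

class VoiceSeparator(nn.Module):
    """Sketch: conv encoder -> bi-LSTM separator -> masked conv decoder."""
    def __init__(self, n_speakers=2, latent_dim=128):
        super().__init__()
        self.n_speakers = n_speakers
        # Encoder: raw waveform -> latent representation (frames of features)
        self.encoder = nn.Conv1d(1, latent_dim, kernel_size=16, stride=8)
        # Separator: recurrent layers provide the "memory" over the audio
        self.rnn = nn.LSTM(latent_dim, latent_dim, num_layers=2,
                           batch_first=True, bidirectional=True)
        # Predict one soft mask per speaker over the latent representation
        self.masks = nn.Linear(2 * latent_dim, n_speakers * latent_dim)
        # Decoder: masked latent representation -> per-speaker waveform
        self.decoder = nn.ConvTranspose1d(latent_dim, 1, kernel_size=16, stride=8)

    def forward(self, mix):                      # mix: (batch, samples)
        z = self.encoder(mix.unsqueeze(1))       # (batch, latent, frames)
        h, _ = self.rnn(z.transpose(1, 2))       # (batch, frames, 2*latent)
        m = self.masks(h)                        # (batch, frames, spk*latent)
        m = m.view(m.size(0), m.size(1), self.n_speakers, -1)
        m = torch.sigmoid(m).permute(2, 0, 3, 1) # (spk, batch, latent, frames)
        # Apply each speaker's mask and decode back to a waveform
        return [self.decoder(mk * z).squeeze(1) for mk in m]
```

Calling `VoiceSeparator(n_speakers=2)(torch.randn(1, 16000))` would return a list of two waveform tensors, one per separated speaker.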

However, the system needs to know the number of speakers beforehand. A subsystem is therefore programmed to automatically detect the number of speakers and select the appropriate speech model.

Practice Makes Perfect

The models underwent extensive training on recordings in which two to five speakers talked into open microphones at once, teaching the AI to distinguish between all of them.

The researchers trained separate models to work with two, three, or five different speakers, feeding each model the mixed input so it could learn to recognize up to five different voices.
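The article doesn’t spell out the training objective, but separation models of this kind are commonly trained with a permutation-invariant, scale-invariant SNR loss, since the order of the output channels is arbitrary. Here is a hedged sketch of that standard technique, not necessarily the paper’s exact loss; the function names are made up for illustration:

```python
import itertools
import torch

def si_snr(est, ref, eps=1e-8):
    """Scale-invariant SNR between estimate and reference, shape (batch, samples)."""
    est = est - est.mean(dim=-1, keepdim=True)
    ref = ref - ref.mean(dim=-1, keepdim=True)
    # Project the estimate onto the reference signal
    proj = (est * ref).sum(-1, keepdim=True) * ref / (ref.pow(2).sum(-1, keepdim=True) + eps)
    noise = est - proj
    return 10 * torch.log10(proj.pow(2).sum(-1) / (noise.pow(2).sum(-1) + eps) + eps)

def pit_loss(estimates, references):
    """Try every speaker ordering and keep the best: the model's channel
    order is arbitrary, so the loss must be permutation-invariant."""
    losses = []
    for perm in itertools.permutations(range(len(references))):
        snrs = [si_snr(estimates[i], references[j]) for i, j in enumerate(perm)]
        losses.append(-torch.stack(snrs).mean())  # maximize SI-SNR
    return torch.stack(losses).min()
```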

They repeated this process with a model trained to detect the number of active speakers, checking whether each output channel was active or not. The process stopped either when all of a model’s channels were active or when the model with the fewest target speakers was reached.
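Taken together, the selection procedure described above might be sketched like this, working down from the largest model and using a simple energy threshold to decide whether a channel is active. The threshold value and helper names are assumptions for illustration, not the paper’s exact method:

```python
import torch

def active_channels(outputs, thresh_db=-40.0):
    """Count output channels whose average energy exceeds a silence
    threshold. The -40 dB cutoff is an assumed heuristic."""
    count = 0
    for o in outputs:
        energy_db = 10 * torch.log10(o.pow(2).mean() + 1e-8)
        if energy_db.item() > thresh_db:
            count += 1
    return count

def separate(mix, models):
    """models: dict mapping speaker count -> trained separator, e.g.
    {2: ..., 3: ..., 5: ...}. Try the largest model first and fall back
    to smaller ones until every output channel actually carries speech."""
    outputs = None
    for n in sorted(models, reverse=True):
        outputs = models[n](mix)
        if active_channels(outputs) == n:
            break            # all channels active: speaker count found
    return outputs           # otherwise the smallest model's output is kept
```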

Real-World Applications and Futuristic Hopes

The researchers believe the Facebook voice-separation AI model could help people who use hearing aids, as it could improve audio quality for them. However, they say the model still needs further refinement and tuning before it reaches the performance required for real-world use.

The researchers are trying to develop the Facebook voice-separation model so that speech comes through cleaner and louder. Although hearing aids do help people hear someone speaking, it can get tough for them in a noisy environment.

Isolating one voice and canceling out all other exterior noise would be extremely helpful for people with hearing aids, especially at a party or when the weather is windy.

Setting a Trend in the Rat Race

This update could also serve as a major upgrade for other voice assistants. Once voice-separation technology is refined, smart speakers could easily identify when a wake word was used, and by whom.

Getting confused by more than one voice has always been a major problem for smart voice assistants. Other technology giants, such as Google, are also developing similar technology to resolve this issue.

Google is working on a technology called a “de-noiser,” aimed at filtering out irrelevant sounds during a Google Meet call. Startups like AudioIntellegence have attracted investment from such tech giants to develop AI that can separate a human voice from everything else. Amazon recently qualified the TalkTo noise-suppression technology for Alexa built-in devices.

A Voice-Centric World

Facebook researchers are now trying to put their model to practical use, helping it adapt to real-world situations. The world is running faster than ever, with voice leading the race.
