SoundSpace – Facebook to Teach AI How to See and Hear

Facebook is releasing a tool that uses artificial intelligence to locate and reach the source of a sound. Known as ‘SoundSpace’, it is a foundational tool that teaches embodied AI agents and robots to determine where a sound is coming from and navigate accordingly.

The company also recently announced milestones describing how its upcoming AI will be able to plan routes, recognize its physical surroundings, understand what is happening around it, and build memories in 3D space.

What is Embodied AI?

The concept of embodied AI stems from embodied cognition, the theory that intelligence in humans and other organisms is shaped by how the body interacts with its surroundings. Facebook is trying to apply this idea to make AI more humanlike and interactive.

It wants to use this approach to improve the performance of chatbots, AI robots, and smart speakers. With it, an AI could tell whether a door is locked, or in which corner of the house a smartphone is ringing.

Facebook wrote in its blog post, “By pursuing these related research agendas and sharing our work with the wider AI community, we hope to accelerate progress in building embodied AI systems and AI assistants that can help people accomplish a wide range of complex tasks in the physical world.”

What is SoundSpace?

To perceive something properly, you need to see it. Being able to hear it, however, is equally important. If we can see as well as hear, we perceive an object better.

SoundSpace is essentially a set of simulated 3D environments that imitate rooms and similar spaces. Designed to be compatible with AI Habitat, it provides software sensors that let you insert simulated sounds from sources in real-world environments.

Robots need specific, detailed input to carry out tasks in the desired manner, and SoundSpace helps with just that. Its sensors and models let an agent build up a model of a room as it explores.

The agent then interprets the audio input to identify where the sound is coming from. Facebook wrote in a blog post, “To our knowledge, this is the first attempt to train deep reinforcement learning agents that both see and hear to map novel environments and localize sound-emitting targets.

“With this approach, we achieved faster training and higher accuracy in navigation than with single modality counterparts. Unlike traditional navigation systems that tackle point-goal navigation, our agent doesn’t require a pointer to the goal location.

“This means an agent can now act upon ‘go find the ringing phone’ rather than ‘go to the phone that is 25 feet southwest of your current position.’ It can discover the goal position on its own using multimodal sensing.”


What is AudioGoal?

In addition, SoundSpace introduces a challenge called “AudioGoal”, which requires an agent to move through an environment to find the source of a sound.

This ability is essential for teaching AI to identify and locate a sound source and navigate toward it. Facebook also argues that it is a much simpler and faster way of training an AI.

Facebook wrote in a blog post, “This AudioGoal agent doesn’t require a pointer to the goal location, which means an agent can now act upon ‘go find the ringing phone’ rather than ‘go to the phone that is 25 feet southwest of your current position.’

“It can discover the goal position on its own using multimodal sensing. Finally, our learned audio encoding provides similar or even better spatial cues than GPS displacements.

“This suggests how audio could provide immunity to GPS noise, which is common in indoor environments.”
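
To make the “go find the ringing phone” idea concrete, here is a minimal toy sketch in Python. It is not Facebook’s method: the real AudioGoal agent is trained with deep reinforcement learning on audio and visual input, while this sketch uses a hand-written greedy rule on an empty grid. The names loudness and follow_sound, the grid size, and the inverse-square loudness model are all assumptions made up for illustration. The point is only that the agent is never told where the phone is; it simply keeps stepping toward wherever the sound is louder.

```python
def loudness(pos, source):
    """Toy loudness heard at `pos` (hypothetical model): falls off with
    the squared distance to the sound source."""
    dx, dy = pos[0] - source[0], pos[1] - source[1]
    return 1.0 / (1.0 + dx * dx + dy * dy)


def follow_sound(start, source, size=10, max_steps=100):
    """Greedy toy agent on an empty size x size grid.

    The agent only ever *listens* by calling loudness(); it is never
    handed the goal coordinates as a point-goal agent would be.
    """
    pos = start
    path = [pos]
    for _ in range(max_steps):
        # Candidate moves: one step right, left, up or down, staying on the grid.
        moves = [(pos[0] + dx, pos[1] + dy)
                 for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1))]
        moves = [m for m in moves if 0 <= m[0] < size and 0 <= m[1] < size]
        best = max(moves, key=lambda m: loudness(m, source))
        if loudness(best, source) <= loudness(pos, source):
            break  # no neighbouring cell is louder, so the agent stops here
        pos = best
        path.append(pos)
    return path


if __name__ == "__main__":
    route = follow_sound(start=(0, 0), source=(7, 4))
    print("reached", route[-1], "in", len(route) - 1, "steps")
```

In a real home, walls and echoes would mislead such a simple rule, which is part of why a learned agent that combines hearing with vision is needed.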

Rapid Developments

Facebook isn’t the only company working on such concepts. Amazon is also developing an Alexa-powered robot, while Samsung is preparing to launch Ballie along with its own home navigation system.

Voice technology has also made giant strides recently. Google now allows Netflix streaming on the Nest Hub, while Microsoft has just revamped its Teams app for better meetings during the ongoing pandemic.

Voice technology is now a household presence, and there’s no denying it. With all these rapid developments, we are excited to see how SoundSpace turns out!
