MIT Creates A.I. That Identifies Objects Via Verbal Descriptions

Credit: Christine Daniloff / MIT

Computer scientists at MIT have developed a system that can identify objects within an image based on a spoken description. The system highlights the parts of the image it finds relevant to the description, in real time.

It learns words from recorded speech clips and objects from raw images, then associates them with one another.

The team modified a pre-existing image-processing neural network so that it splits each image into a grid of cells. A second, audio network cuts the spoken description into snippets of one or two seconds. Once an image is paired with its correct caption, the training process scores the A.I. system on how well it matches the two. If it sounds a lot like teaching a child what objects are by pointing at and naming them, you’re not too far off.
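To make the matching step concrete, here is a minimal sketch of how image cells and audio snippets could be compared and scored. It is not the MIT team's code: the dot-product similarity, the "best cell per snippet" pooling, the margin loss, and every function name and toy vector below are assumptions chosen for illustration, since the article does not say how the score is computed.

```python
# Illustrative sketch only. Dot-product matching, max-then-average pooling
# and the margin loss are assumptions, not details given in the article.

def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def similarity_grid(image_cells, audio_snippets):
    """grid[i][j] = similarity between image cell i and audio snippet j."""
    return [[dot(cell, snip) for snip in audio_snippets] for cell in image_cells]

def best_cell_per_snippet(image_cells, audio_snippets):
    """For each snippet, the index of the cell it matches best (the 'highlight')."""
    grid = similarity_grid(image_cells, audio_snippets)
    return [max(range(len(image_cells)), key=lambda i: grid[i][j])
            for j in range(len(audio_snippets))]

def pair_score(image_cells, audio_snippets):
    """One score per image/caption pair: best cell per snippet, averaged."""
    grid = similarity_grid(image_cells, audio_snippets)
    best = [max(row[j] for row in grid) for j in range(len(audio_snippets))]
    return sum(best) / len(best)

def margin_loss(matched, mismatched, margin=1.0):
    """Zero once the right caption beats the wrong one by at least `margin`."""
    return max(0.0, margin + mismatched - matched)

# Toy embeddings (hypothetical): a 2x2 grid flattened to four cells,
# and two captions of two snippets each.
cells = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0], [0.0, 0.0]]
right_caption = [[1.0, 0.0], [0.0, 1.0]]
wrong_caption = [[-1.0, 0.0], [0.0, -1.0]]

matched = pair_score(cells, right_caption)
mismatched = pair_score(cells, wrong_caption)
print(best_cell_per_snippet(cells, right_caption))
print(matched, mismatched, margin_loss(matched, mismatched))
```

In a real system the cell and snippet vectors would come from the two neural networks, and the loss would be used to update them; here the vectors are fixed by hand so that the scoring logic is easy to follow.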

“We wanted to do speech recognition in a way that’s more natural, leveraging additional signals and information that humans have the benefit of using, but that machine learning algorithms don’t typically have access to. We got the idea of training a model in a manner similar to walking a child through the world and narrating what you’re seeing,”

David Harwath, researcher in the Computer Science and Artificial Intelligence Laboratory (CSAIL) and the Spoken Language Systems Group

Florian Metze, an associate research professor at the Language Technologies Institute at Carnegie Mellon University, says of the A.I.:

“It is exciting to see that neural methods are now also able to associate image elements with audio segments, without requiring text as an intermediary. This is not human-like learning; it’s based entirely on correlations, without any feedback, but it might help us understand how shared representations might be formed from audio and visual cues.”

The A.I. can be used in numerous ways, but the MIT researchers have set their sights on improving translation.
