Cooking good food takes not only time, skill, and a pinch of practice but also a good sense of taste. Creating and re-creating dishes is probably one of the most interesting skills we humans have come up with.
With that in mind, could a robot ever learn to cook like a human?
Researchers at MIT think so, and they have even published a new study titled “How to make a pizza: Learning a compositional layer-based GAN model”, which explores how a machine learning model can be trained to ‘look’ at a dish and then come up with a step-by-step guide to re-creating it.
The project has been dubbed PizzaGAN (GAN stands for Generative Adversarial Network) and, fittingly, it uses pizza to demonstrate the network’s abilities.
Because pizza is made by layering different types of ingredients, the researchers taught the machine how to dissect images of pizza in order to identify the different ingredients.
The researchers created a synthetic dataset of 5,500 pizza images built from clip art. Next, they spent some quality time combing through the #pizza hashtag on Instagram for real-life examples, ending up with 9,213 images.
These images were then fed to the machine, and the researchers trained it to add or remove each individual ingredient and then generate a synthesized image of the result.
They then created a different model that detects the toppings visible on a pizza and predicts the order in which they were placed on the crust during preparation by estimating their depth.
The MIT researchers got their best results on the synthetic dataset, but they say the overall results were fairly accurate.
The idea behind the project is that, after some tweaks, the networks could simply scan a photo of a dish and then deliver an accurate recipe with step-by-step instructions based only on what they can see.
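To get a feel for what adding or removing an ingredient layer means at the image level, here is a minimal sketch in Python. It is not the PizzaGAN model: the real system learns its layers and masks with GANs, while this toy example assumes the topping image and its mask are already given. The function names `add_layer` and `remove_layer` are made up for illustration.

```python
import numpy as np

def add_layer(image, layer, mask):
    """Composite `layer` over `image` wherever `mask` is 1 (alpha blending)."""
    alpha = mask[..., None]  # broadcast the (H, W) mask over the RGB channels
    return alpha * layer + (1.0 - alpha) * image

def remove_layer(image, background, mask):
    """Undo a layer by filling the masked region with `background` pixels.

    Assumption: we already know what lies underneath the topping. A learned
    model would have to generate that hidden content itself.
    """
    alpha = mask[..., None]
    return (1.0 - alpha) * image + alpha * background

# Tiny 2x2 "pizza": a plain base and one topping covering the top-left pixel.
base = np.full((2, 2, 3), 0.8)
topping = np.full((2, 2, 3), 0.2)
mask = np.array([[1.0, 0.0],
                 [0.0, 0.0]])

with_topping = add_layer(base, topping, mask)
restored = remove_layer(with_topping, base, mask)
```

Stacking several calls to `add_layer` builds up a pizza one topping at a time, which is the layered view of an image that the article describes.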
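Once each detected topping has some kind of depth estimate, turning that into a layering order is just a sort. The sketch below is only an illustration under that assumption: the depth scores are invented numbers, the convention that a larger value means a deeper layer is hypothetical, and `layering_order` is not a function from the study.

```python
def layering_order(depths):
    """Return topping names from the bottom layer to the top one.

    `depths` maps a topping name to an estimated depth score, where a
    larger value means the topping lies deeper (hypothetical convention).
    """
    return sorted(depths, key=depths.get, reverse=True)

# Invented depth scores for three detected toppings.
estimated_depths = {"olives": 0.1, "pepperoni": 0.4, "cheese": 0.9}

# The deepest topping was put on first, so it becomes step 1.
steps = [
    f"Step {i}: add {name}"
    for i, name in enumerate(layering_order(estimated_depths), start=1)
]
```

Here the deepest topping (cheese) comes out as step 1 and the one on top (olives) as the last step.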
The team behind PizzaGAN added that the networks could be used in completely different contexts as well.
“Though we have evaluated our model only in the context of pizza, we believe that a similar approach is promising for other types of foods that are naturally layered such as burgers, sandwiches, and salads,” the MIT team said. “It will be interesting to see how our model performs on domains such as digital fashion shopping assistants, where a key operation is the virtual combination of different layers of clothes.”
At its core, the research shows us that we can make AI tell the difference between what it might initially see as just a pile of random ingredients and a delicious pizza, with all that a pizza actually entails.