With ever-growing concerns about facial recognition systems that might be biased and end up targeting specific groups of people, IBM has decided to do something about it: the company plans to release a dataset of over a million facial images that anyone can use to help train their facial recognition systems.
The dataset is set to be as diverse as possible, containing images distributed equally across all ethnicities, ages, and genders, as well as different skin colors and tones. This diverse mix of faces should help an AI system learn the many ways in which human faces differ and thus help it overcome bias.
A while ago, a study by MIT Media Lab researcher Joy Buolamwini showed that facial recognition systems did not perform as well for her as they did for her fairer-skinned friends. She published the “Gender Shades” paper, in which she evaluated gender classification systems from IBM, Microsoft, and the Chinese company Megvii (Face++).
The results showed that all three systems performed worse for people of color, and worst of all for women with darker skin, whose gender they misclassified up to 34% of the time.
Buolamwini examined two of the most commonly used datasets and found that both contained a disproportionately high number of lighter-skinned subjects, which in turn had led to an imbalance in AI performance. In response, she decided to build her own dataset and gave both Microsoft and IBM access to it in order to improve facial recognition technology, which has led to the improved dataset that will soon be available.
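To make the link between dataset imbalance and uneven performance concrete, here is a minimal sketch of the kind of check an engineer might run. It is not the method used in the “Gender Shades” paper or by IBM; the function names, group labels, and toy data are all hypothetical and exist only for illustration.

```python
from collections import Counter


def group_shares(groups):
    """Return the fraction of the dataset that belongs to each group."""
    counts = Counter(groups)
    total = sum(counts.values())
    return {group: n / total for group, n in counts.items()}


def error_rate_by_group(records):
    """Return the misclassification rate per group.

    `records` is an iterable of (group, true_label, predicted_label) tuples.
    """
    totals, errors = Counter(), Counter()
    for group, truth, predicted in records:
        totals[group] += 1
        if truth != predicted:
            errors[group] += 1
    return {group: errors[group] / totals[group] for group in totals}


# Hypothetical toy data: a training set skewed toward one group ...
toy_groups = ["lighter"] * 8 + ["darker"] * 2

# ... and hypothetical predictions from a classifier evaluated on both groups.
toy_records = [
    ("lighter", "F", "F"),
    ("lighter", "M", "M"),
    ("lighter", "F", "F"),
    ("lighter", "M", "M"),
    ("darker", "F", "M"),  # a misclassification
    ("darker", "M", "M"),
]

print(group_shares(toy_groups))
print(error_rate_by_group(toy_records))
```

Reporting the error rate separately for each group, rather than as a single overall number, is what exposes a gap that an aggregate accuracy figure would hide.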
In its blog post, IBM notes:
“AI holds significant power to improve the way we live and work, but only if AI systems are developed and trained responsibly, and produce outcomes we trust. Making sure that the system is trained on balanced data, and rid of biases is critical to achieving such trust.”
IBM also states that discrimination of any kind goes against its values. With that in mind, hopefully the researchers and engineers working on facial recognition technology will learn to keep all of their users in mind, while also watching for biases in the systems they help create that might not be apparent at first glance.