A new report from the Financial Times states that Microsoft’s database of 10 million face photos, which had been collected without consent, has been taken offline.
The images were published in 2016 as a dataset called MS Celeb and were said to show people who were considered celebrities, which is why they were released under a Creative Commons license that allows re-use under certain conditions.
However, the dataset also contained images of private individuals such as security journalists, activists, academics, authors and more. And while the contents have been deleted, that doesn’t mean that MS Celeb isn’t still floating around online. If anything, it’s a bit like Schrödinger’s cat: it’s not there and yet it is, as its contents have been saved and are still being shared.
You can even find it on GitHub, and all the information that came attached to it, such as labeling lists with the names of the photo subjects, can also be accessed quite easily.
“You can’t make a data set disappear. Once you post it, and people download it, it exists on hard drives all over the world,” researcher Adam Harvey, who spearheads the Megapixels project, told the Financial Times.
The images were used to train both military and commercial facial recognition software. According to Microsoft, the website “was intended for academic purposes” and had been “run by an employee that is no longer with Microsoft and has since been removed.”
Even so, the dataset has already been used by IBM, Panasonic, Alibaba, Nvidia, Hitachi, SenseTime and even Megvii, which has been linked to the Chinese state’s efforts to use facial recognition to track and oppress ethnic minorities.
“Despite the recent termination of the msceleb.org website, the dataset still exists in several repositories on GitHub, the hard drives of countless researchers, and will likely continue to be used in research projects around the world,” Harvey went on to say. “It’s fairly clear that Microsoft has lost control of their MS Celeb dataset and biometric data of nearly 100,000 individuals.”