A photo from IBM's Diversity in Faces dataset. Source: IBM

According to reports, IBM trained its facial recognition systems on millions of online images unbeknownst to the people appearing in those images.

The images, which were originally collected not by the tech giant but by former Flickr owner Yahoo, were compiled to build an image repository for research purposes. Called YFCC100M, the database holds roughly 99 million photos from Flickr, including images of people who had consented to have their pictures taken but who did not know their images would eventually be used to train facial recognition algorithms.

The images were made available under Creative Commons licenses, and their use was approved by IBM’s legal team. The company is reportedly using them to help build an unbiased facial recognition dataset called Diversity in Faces, amid recent reports that the technology works accurately only on the faces of white people. IBM hopes that drawing on the diverse images in the database will counter the bias that can make artificial intelligence algorithms unfair.

Because IBM keeps dataset details private, it will be difficult for people to determine whether their photos were used to train the facial recognition algorithms. The company does say, however, that only academic researchers have access to the images.

"We take the privacy of individuals very seriously and have taken great care to comply with privacy principles, including limiting the Diversity in Faces dataset to publicly available image annotations and limiting the access of the dataset to verified researchers. Individuals can opt-out of this dataset," spokesman Saswato Das said in a statement. "IBM has been committed to building responsible, fair and trusted technologies for more than a century and believes it is critical to strive for fairness and accuracy in facial recognition."

At the heart of most controversies surrounding facial recognition is the concern that the technology could be used to unfairly target specific groups, including immigrants, people of color and religious minorities. To illustrate that point, the American Civil Liberties Union (ACLU) compared images of members of the U.S. Congress against a database of public mug shots using Amazon’s facial recognition tool, Rekognition. The result, according to the ACLU, was that 28 members of Congress were falsely identified as criminal suspects, and nearly 40% of those false matches were people of color. Such reports prompted a coalition of advocacy groups and activists to write open letters to tech giants Google, Amazon and Microsoft, imploring them not to sell their respective facial recognition technologies to government authorities.

To contact the author of this article, email mdonlon@globalspec.com