Leading facial recognition algorithms are right 99 percent of the time — for white men, that is. The software performs worse the darker your skin is, with the gender of darker-skinned women being misidentified as much as 35 percent of the time. While the New York Times recently reported on this, the problem isn’t new.
In 2015, Google had to apologize after its facial recognition software tagged African-American males as gorillas. Three years later, Google has still not fixed the problem. Search for “gorilla” in Google Photos and you’ll get “No results,” as reported by Wired.
And last fall, Camille Eddy, then a student at Boise State University and a machine learning intern at HP, detailed several failures of AI, including computer vision’s inability to detect her own black skin and other inherently biased aspects of the very tech she was learning to code.
“People of color would be invisible or misidentified,” Eddy said, describing her experience with computer vision in a keynote address at the Non-Profit Technology Conference (17NTC). “What happens when a segment of the population becomes invisible to the technology in use?” she asked. Eddy’s experience with robots and computer vision has implications for broader applications of AI.
More recently, Joy Buolamwini, a researcher at the M.I.T. Media Lab, found fault with image datasets used by the likes of Microsoft, IBM, and Face++. Buolamwini’s paper, Gender Shades, reported that these facial recognition systems misidentified gender in up to 35 percent of darker-skinned females and up to 12 percent of darker-skinned males, among other disparities.
The cause? In these instances, the machine learning algorithms Buolamwini tested are only as reliable as the data they’re fed. Another study cited by the New York Times describes a common facial-recognition data set as more than 75 percent male and more than 80 percent white.
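The disparity Buolamwini measured is exactly what a disaggregated evaluation surfaces: a system trained on skewed data can post a high overall accuracy while a minority subgroup fares far worse. A minimal sketch of such an audit (the function name, group labels, and data here are invented for illustration, not taken from Gender Shades):

```python
from collections import defaultdict

def accuracy_by_group(y_true, y_pred, groups):
    """Overall accuracy plus accuracy broken out by subgroup."""
    correct, total = defaultdict(int), defaultdict(int)
    for t, p, g in zip(y_true, y_pred, groups):
        total[g] += 1
        correct[g] += int(t == p)
    overall = sum(correct.values()) / sum(total.values())
    return overall, {g: correct[g] / total[g] for g in total}

# Toy data: 8 "lighter" samples, all classified correctly;
# 2 "darker" samples, one misclassified.
groups = ["lighter"] * 8 + ["darker"] * 2
y_true = [1] * 10
y_pred = [1] * 9 + [0]

overall, per_group = accuracy_by_group(y_true, y_pred, groups)
# overall is 0.9, yet per_group["darker"] is only 0.5
```

Reporting only the 90 percent overall figure would hide the 50 percent error rate on the underrepresented group, which is why per-group breakdowns were central to Buolamwini’s findings.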
Similar skewing of results can be found with facial recognition using driver license photos and mugshot databases containing a disproportionate number of African-Americans. According to a Georgetown Law School study, “Police face recognition will disproportionately affect African Americans.”
It’s time for the AI community to address these implicit biases of machine learning and help end any reliance on flawed algorithms. Advances in AI should benefit all populations, light- and dark-skinned alike.
Swapping in new, more racially representative data is an obvious place to start. But that’s just a first step. Buolamwini has called the failure of facial recognition “an invitation and a mission to create a world where inclusion matters and works for all of us.”
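One crude way to reduce that skew is to oversample underrepresented groups until the training set is balanced. The toy sketch below (names and data are hypothetical) only duplicates existing minority samples; the deeper fix Buolamwini and others call for is collecting genuinely representative data:

```python
import random
from collections import defaultdict

def rebalance(samples, group_of):
    """Oversample smaller groups (with replacement) until every
    group's count matches the largest group's count."""
    buckets = defaultdict(list)
    for s in samples:
        buckets[group_of(s)].append(s)
    target = max(len(b) for b in buckets.values())
    balanced = []
    for b in buckets.values():
        balanced.extend(b)
        # Draw extra samples with replacement from the smaller bucket.
        balanced.extend(random.choices(b, k=target - len(b)))
    return balanced

# Toy dataset skewed 8-to-2, mirroring the imbalance described above.
random.seed(0)  # for reproducibility
faces = [("lighter", i) for i in range(8)] + [("darker", i) for i in range(2)]
balanced = rebalance(faces, group_of=lambda s: s[0])
# balanced now holds 8 "lighter" and 8 "darker" samples
```

Duplicated samples add no new faces, so this is at best a stopgap; it illustrates why representative data collection, not just resampling, is the first step the article describes.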
For Eddy, addressing facial recognition’s failures means asking who is developing it and who is using it. If the answer to both is “predominantly white males,” then little will change, she warns. Eddy believes that increased diversity will help spot cultural bias early on, but cautions against hoping for an overnight cure: “We can’t fix it, but we can make it better.”
UPDATE: Tweets from the Fairness, Accountability and Transparency Conference on February 24, 2018 recount that IBM was able to replicate Buolamwini’s findings in Gender Shades, and issued a new, more accurate API.
.@jovialjoy sent a pre-print of their paper to the companies after #FAT2018 acceptance. Face++ didn’t respond. MS sent a response. IBM had the best response — replicated the paper internally and released a new API yesterday. Wow. New API classified darker females at 96.5%
— Alex Hanna, Critical Data Witch (@alexhanna) February 24, 2018
I have CHILLS. @jovialjoy got @IBM to do an internal review, leading them to significantly improve the performance of their face recognition algorithm on POC (particularly women). THIS is why we’re here! #FAT2018 #GenderShades
— Colin Hill (@ceh0) February 24, 2018