The First Step Toward Fixing Bias In AI Systems
This article highlights the importance of addressing the systemic factors that contribute to bias in AI models and calls for proactive measures to optimize for fairness.
“Life” – she stopped here for dramatic effect and then spoke slowly – “is very unfair.”
I wondered how an ordinary day at preschool had led my 4-year-old daughter to independently reach the conclusion I had reached long ago. So I asked.
“It’s Sebastian.”
Sebastian was one of my daughter’s friends and – to the best of my knowledge – neither the victim nor perpetrator of a great injustice. So I asked her why.
My daughter went on to explain that they practiced writing their own names in preschool that day. Her name is Ori, which is not only three letters long, but also has an "O" and an "I," which are basically just a circle and a straight line. It’s really just the letter R you have to worry about.
Sebastian, on the other hand, has a much longer name. During lunchtime, with the help of the teacher, the kids figured out that his name is three times longer than Ori’s. This took a while to calculate since not all the kids could consistently count to nine yet.
They discussed potential solutions, like name changes or forbidding long names, but didn’t reach a conclusion. Lunch ended in a brooding mood.
This memory comes to me often around discussions on bias in AI. The problem is very real. Google Photos labeled black people as gorillas. Speech recognition can have a gender bias and an accent bias. Amazon scrapped an AI recruiting tool that showed bias against women. One organization reported that the software used to predict future criminals is biased against black people. The list goes on.
A deeper problem is that this discussion is often framed as bad data science or bad data management practices, or the problem is blamed on irresponsible data scientists. While calls for data scientists to learn more about ethics and civics are honorable (hey, everyone should do it), they ignore the fact that a trained model is a mathematical reflection of its training data. There are things a data science team should know -- this paper about fairness metrics and this paper about cognitive biases are great examples -- but I don't think it's up to data scientists to inject a fairness that the system they're modeling doesn't have. To put it simply:
Blaming an AI model because it learned bias is exactly like blaming your mirror because it makes you look fat.
The story about Amazon’s AI recruiting tool isn’t about an evil data science team that hates women. It’s about a model that learned not to invite women to interviews. That is a much bigger story, one that I believe suggests people (still) systematically hinder women’s employment chances. Similarly, finding out that a criminal defendant’s race is a big factor in their sentencing is a much bigger story than one about an AI proof of concept.
Step one: Accept what the mirror shows you. If you train a model to optimize the likelihood that candidates sent to interviews will pass them, and the model learns to avoid sending women for interviews, then it’s a good idea to reflect on the culture, processes and systems you’ve put in place to make it so.
You can’t change the model to solve this. Changing your mirror to make you look thin won’t work.
What you can do is tell your data science team what to optimize. For example, for your resume selection AI, you can optimize the likelihood that selected candidates will pass their interviews. You can optimize for selecting candidates that will have superstar performance reviews in their first year. You can also optimize for the fastest time to fill each open position -- under the condition that the same proportion of men and women will pass their interviews, or under the condition that the same number of men and women will.
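To make "tell your data science team what to optimize" concrete, here is a minimal Python sketch. Everything in it is hypothetical: the candidates.csv file, the column names (passed_interview, gender, skills_score) and the shortlist size of 100 are assumptions for illustration only. It shows how each business objective becomes a different label to predict, and how the "same proportion of men and women" condition becomes a metric you can actually check.

```python
# Minimal sketch: the business decision ("what to optimize, under which
# fairness condition") expressed as code. Dataset, columns and the
# shortlist size are hypothetical.
import pandas as pd
from sklearn.linear_model import LogisticRegression

df = pd.read_csv("candidates.csv")                     # hypothetical file
features = ["years_experience", "skills_score"]        # hypothetical features
X, y = df[features], df["passed_interview"]            # objective 1: pass the interview
# Alternative objectives the business could choose instead:
#   y = df["superstar_review_year_1"]                  # objective 2
#   y = df["days_to_fill_position"]                    # objective 3 (a regression target)

model = LogisticRegression().fit(X, y)
df["score"] = model.predict_proba(X)[:, 1]

# The fairness condition from the text: the same proportion of men and
# women should be sent to interviews. Check the selection rate per group
# for a given shortlist size.
shortlist = df.nlargest(100, "score")                  # hypothetical shortlist size
selection_rate = shortlist["gender"].value_counts() / df["gender"].value_counts()
print(selection_rate)   # if these rates diverge, the condition is not met
```

The point of the sketch is that the label and the constraint are business choices, not modeling details: swapping one commented-out target for another changes what the model learns to reward.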
What you cannot do is ask your data science team to optimize for one goal and then ask them why they don't also achieve other goals. Good data scientists know how to optimize for specific results really well -- including fairness. The work usually involves additional data collection, resampling and evaluation of training sets, and training predictive models to optimize for the right metrics. But data scientists should not make the product management decision of what to optimize -- just like a construction worker shouldn’t deviate from the architect’s plan even if they believe it would make the building more welcoming.
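On the resampling point: one common technique (among several) is to reweight the training set so that each group and outcome carries equal total weight before the model is fit. The snippet below is a hedged sketch of that idea using the same hypothetical file and columns as above; it is not a description of how any team mentioned in this article actually works.

```python
# Sketch of one common rebalancing step: reweight training rows so that each
# (group, outcome) combination carries equal total weight, then fit the model
# with those weights. Dataset and column names are hypothetical.
import pandas as pd
from sklearn.linear_model import LogisticRegression

df = pd.read_csv("candidates.csv")                     # hypothetical file
X, y = df[["years_experience", "skills_score"]], df["passed_interview"]

counts = df.groupby(["gender", "passed_interview"]).size()
weights = df.apply(
    lambda row: len(df) / (len(counts) * counts[(row["gender"], row["passed_interview"])]),
    axis=1,
)

model = LogisticRegression().fit(X, y, sample_weight=weights)
```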
Don’t wait for an “AI fairness platform” to solve it for you. Just decide what you’re building. AI can be your opportunity to change that culture: by telling your data science team to optimize for different metrics than your current processes do, you can correct deep-rooted biases at scale.
For example, we can train a model to proactively send more female resumes to interviews because women may be less likely to get hired due to inherent biases in the interview process (via BBC). We can train a model to recommend parole for more drug addicts if we take partial responsibility for the environment that made them criminals in the first place. We can train models to raise credit scores for racial minority groups and begin to correct the way that, many believe, access to credit has perpetuated racial injustice to date.
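As one illustration of how "proactively send more female resumes" could be implemented, a team might pick a separate score cutoff per group so that each group is shortlisted at the same rate. The snippet below is a hypothetical sketch of that idea with made-up toy data; it is not a description of any real system mentioned above.

```python
# Hypothetical sketch: per-group score cutoffs chosen so each group is
# shortlisted at (roughly) the same rate, whatever the raw score
# distributions look like. The data below is made up.
import pandas as pd

df = pd.DataFrame({
    "gender": ["F", "F", "M", "M", "M", "F"],           # toy, made-up rows
    "score":  [0.62, 0.48, 0.91, 0.55, 0.73, 0.80],
})

target_rate = 0.5   # assumed business decision: shortlist roughly the top half of each group
thresholds = df.groupby("gender")["score"].quantile(1 - target_rate)
df["shortlisted"] = df["score"] >= df["gender"].map(thresholds)
print(df)
```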
This requires you, as a business leader, to clearly communicate priorities. Hiring more women can be as important as hiring quickly. Giving addicts a second chance can take priority over minimizing crime. Fair access to credit can come before minimizing default rates. These are the hard choices -- and the real cost -- of getting rid of bias. It sure was easier when we treated it as a bug in the AI code, but it’s time to put that thinking behind us.
Until that happens, we’re a lot like a group of 4-year-olds at lunch. Our hearts are in the right place, but many of us can’t even measure the issue, so we may overestimate how hard it is to fix the tactical problem at hand and underestimate how difficult it will be to handle its root cause.