Explainable AI Is An Interaction Design Problem
In this article, we'll explore the notion of explainable AI (XAI) and its relationship to user interaction design, providing practical recommendations for AI projects.
Explainable AI has been a hot topic for a while now. Public outrage has rightfully grown over gender and racial bias in facial recognition systems marketed to law enforcement, as well as the lack of transparency in systems used for bail and sentencing decisions in the criminal justice system. Last year, Harvard Business Review summarized calls for "a complete ban on using 'non-explainable algorithms' in high-impact areas such as health." In 2018, France's minister of state for the digital sector said that if an algorithm cannot be explained or summarized in detail, it shouldn't be used.
The data science community has been hard at work providing technical solutions to this challenge – from the LIME algorithm and its open-source package to startups that provide explainable AI (XAI) platforms, including simMachines, FICO and Kyndi. However, my own experience building such systems and reviewing systems in use today suggests a different direction – because “explaining an AI algorithm” and “building user trust with perceived transparency” are two separate goals. It seems that successful explainable AI is a user-interaction design issue – that is, a product design question and not a technical one.
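For reference, here is a minimal sketch of what that technical approach looks like in practice, using the open-source lime package to explain a single prediction. The dataset, feature names and classifier below are illustrative assumptions, not anything from a real deployment:

```python
# Minimal sketch of post-hoc explanation with the open-source `lime` package.
# The dataset, feature names and classifier are made up for illustration.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from lime.lime_tabular import LimeTabularExplainer

rng = np.random.default_rng(0)
feature_names = ["age", "blood_pressure", "cholesterol", "bmi"]
X = rng.normal(size=(500, len(feature_names)))
y = (X[:, 0] + 0.5 * X[:, 2] + rng.normal(scale=0.5, size=500) > 0).astype(int)

model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

explainer = LimeTabularExplainer(
    X,
    feature_names=feature_names,
    class_names=["low risk", "high risk"],
    mode="classification",
)

# Explain one prediction: LIME fits a simple local surrogate around this point
# and reports which features pushed the prediction up or down.
explanation = explainer.explain_instance(X[0], model.predict_proba, num_features=4)
print(explanation.as_list())
```

Output like this answers a data scientist's question about the model; whether it answers the user's question is exactly the issue this article is about.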
How Recommender Systems Explain Themselves
One of the first widely used consumer examples of recommender systems was “people who bought this also bought that.” As early as 2002, a study of five music-recommendation tools showed that users’ average confidence was higher in systems they perceived as more transparent. Telling users that a song (or product) was recommended because they already liked another one made them more likely to try it – even though that had little to do with how the algorithm actually worked. Amazon acts similarly when providing “transparency” in recommendations. The goal is to make the user take action, not uncover the inner workings of its algorithms (which are not just complex, but also trade secrets).
Netflix has published a handful of academic papers about its recommendation systems, and they describe the same pattern: the ranking models that provide “because you watched” recommendations are driven by different algorithms than Netflix’s other models. Finding the “most similar watched show” to a recommended show is a separate machine-learning task from ranking the shows in the first place. The goal isn’t to explain, but rather to make the user more likely to watch the show because of perceived transparency.
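To make that separation concrete, here is a rough sketch – entirely hypothetical, not Netflix's actual implementation – of how a "because you watched" label can come from a similarity lookup that has nothing to do with the ranker that chose the recommendation in the first place:

```python
# Hypothetical sketch: the "explanation" is a separate nearest-neighbor lookup
# over item embeddings, independent of whatever ranker picked the recommendation.
import numpy as np

def because_you_watched(recommended_id, watched_ids, item_embeddings):
    """Pick the watched title most similar to the recommended one."""
    rec_vec = item_embeddings[recommended_id]

    def cosine(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

    return max(watched_ids, key=lambda w: cosine(item_embeddings[w], rec_vec))

# Toy usage with made-up titles and random embeddings.
rng = np.random.default_rng(7)
item_embeddings = {title: rng.normal(size=32)
                   for title in ["Show A", "Show B", "Show C", "Show D"]}
watched = ["Show B", "Show C"]
anchor = because_you_watched("Show A", watched, item_embeddings)
print(f"Because you watched {anchor}")  # the label shown to the user
```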
Examples In Healthcare and Law
My experience in healthcare has been strikingly similar. Hospital systems often stick with simple score-based models for patient risk prediction – sometimes knowingly trading accuracy for simplicity – so that clinicians can understand the predictions. This makes sense: a doctor simply won’t follow a computer’s recommendation to, say, double a drug’s dosage for their patient without fully understanding why.
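A “score-based model” here means something like the additive point scores clinicians already use at the bedside. The sketch below is a made-up example – the criteria, weights and thresholds are invented for illustration, not a real clinical score:

```python
# Hypothetical additive point score for patient risk: each criterion adds a
# fixed number of points, and the total maps to a risk band a clinician can
# verify by hand. Weights and thresholds are invented for illustration.
def risk_score(age, systolic_bp, has_diabetes, on_anticoagulants):
    points = 0
    points += 2 if age >= 75 else (1 if age >= 65 else 0)
    points += 1 if systolic_bp >= 140 else 0
    points += 1 if has_diabetes else 0
    points += 1 if on_anticoagulants else 0
    return points

score = risk_score(age=72, systolic_bp=150, has_diabetes=True, on_anticoagulants=False)
band = "high" if score >= 3 else "moderate" if score >= 2 else "low"
print(score, band)  # 3 high
```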
However, when building more complex (and more accurate) models, I’ve found that doctors don’t want the system to explain the actual model being used, either. They want other models built to answer the specific questions they care about: What other conditions (comorbidities) does the patient have? Which of their vitals or lab results changed most recently? This holds regardless of whether the model directly relies on those features. Answering these questions usually requires building new models that are different from the original one, sometimes trained by different data scientists altogether. The goal is not to explain a prediction, but to show whatever user interface is necessary to nudge users to take action.
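In code, that can be as simple as a separate “explanation panel” assembled from the questions clinicians actually ask, whether or not the risk model used those features. The record format and helper below are hypothetical, a sketch rather than any production system:

```python
# Hypothetical sketch: an "explanation panel" built independently of the risk
# model. It surfaces comorbidities and recently changed labs regardless of
# which features the risk model actually used. Record format is an assumption.
def explanation_panel(patient, num_labs=3):
    comorbidities = patient["diagnoses"]
    # Labs that changed, newest first.
    changed_labs = sorted(
        (lab for lab in patient["labs"] if lab["delta"] != 0),
        key=lambda lab: lab["measured_at"],
        reverse=True,
    )[:num_labs]
    return {
        "comorbidities": comorbidities,
        "recent_lab_changes": [
            f"{lab['name']}: {lab['delta']:+.1f} on {lab['measured_at']}"
            for lab in changed_labs
        ],
    }

patient = {
    "diagnoses": ["type 2 diabetes", "hypertension"],
    "labs": [
        {"name": "creatinine", "delta": 0.4, "measured_at": "2021-03-02"},
        {"name": "potassium", "delta": 0.0, "measured_at": "2021-03-02"},
        {"name": "hemoglobin", "delta": -1.2, "measured_at": "2021-02-27"},
    ],
}
print(explanation_panel(patient))
```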
This is already the standard we use for explaining decisions elsewhere in life. The pinnacle of (perceived) transparency – the legal system – is the most obvious example. A judge doesn’t write a verdict that discloses how their emotional, political and societal background influenced whom they believed. They certainly don’t disclose that they are seven times more likely to show mercy right after they’ve eaten. Instead, they explain to achieve an outcome: minimizing the likelihood of a successful appeal.
Explaining Your Own AI Models
What does all of this mean? It means you should consider the following recommendations for your own projects:
1. When designing your AI product or service, consider how the user interface and interaction design will best instill confidence and perceived transparency. How do you make users take the right action and feel good about it?
2. When training an AI model, build the most accurate models that you can, regardless of perceived complexity. You may then have to train other models to provide the right kind of explanation that your product requires (see the surrogate-model sketch after this list) — or train separate models for important segments of your target user base (by gender, race, geography, specialty, genre, etc.).
3. AI systems that have inherent biases that anger users, or that include models users do not trust, are first and foremost a failure of product management and interaction design. Once you understand what your users value — technical solutions to explain your AI already exist — apply them.
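As referenced in recommendation No. 2, here is a minimal sketch of one common pattern: keep the most accurate model for predictions and fit a simple surrogate on its outputs to drive the explanation shown in the product. The data and model choices are illustrative assumptions, not a prescription:

```python
# Sketch: an accurate model makes the predictions, while a simple surrogate
# trained to mimic it supplies human-readable "reasons" for the UI.
# Data and model choices are made up for illustration.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
feature_names = ["age", "num_prior_visits", "avg_blood_pressure", "bmi"]
X = rng.normal(size=(1000, len(feature_names)))
y = ((X[:, 0] > 0) ^ (X[:, 2] > 0.5)).astype(int)  # a non-linear target

# 1. The most accurate model you can build, used for the actual predictions.
accurate_model = GradientBoostingClassifier(random_state=0).fit(X, y)

# 2. A simple surrogate trained to mimic the accurate model's predictions;
#    its coefficients feed the explanation shown to users.
surrogate = LogisticRegression().fit(X, accurate_model.predict(X))
for name, coef in zip(feature_names, surrogate.coef_[0]):
    print(f"{name}: {coef:+.2f}")
```

The per-segment variant of recommendation No. 2 is the same idea applied separately to each important slice of your user base, with a dedicated surrogate or explanation model per segment.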
We are obviously in the early days of understanding how AI systems and people should interact. It's likely that the industry's views and my own will evolve over time (I hope they do!). What have you seen actually working in practice when it comes to explainable AI?