Are All Explainable Models Trustworthy?

Explainable AI, or Explainable Data Science, is one of the top buzzwords of Data Science at the moment. Models that are explainable are seen as the answer to many of the recently recognized problems with machine learning, such as bias or data leaks.

A frequently given reason to make models more explainable is that they will then be trusted more readily by users, and sometimes it appears people assume the ideas are almost synonymous. For example, the paper introducing the influential LIME method of explaining black-box models was titled ‘Why Should I Trust You?’, as if having an explanation for how a model came to its decision was a short direct step away from trusting it. Is this really the case, however?

An immediate problem with this equivalence is that trust is given at an emotional level whereas an explanation is a more technical artifact—the assumption behind explaining a model is that there are a certain number of pieces of information which can be provided to ensure the user understands what the model is doing. In contrast, to gain trust means crossing a number of emotional thresholds.

Hence, while it is true that an overly opaque model can be a huge obstacle to gaining a user’s trust, it isn’t the whole story, and there may even be occasions when an opaque model is trustworthy if some other conditions are met.

Firstly, let’s revisit some ways that a model explanation can be useful.

  1. Explainable models can be shown to subject matter experts, allowing them to identify the model’s flaws.
  2. An explainable model may sometimes help us to find a way to get a better result. For example, a model of survival time could offer some clues on how to improve survival time, although the clues could be somewhat indirect.
  3. It is more straightforward to troubleshoot an explainable model because its decision-making process is clearer.

Hence, there are reasons to make a model explainable that don’t immediately correspond to winning trust, although they may be a path to trust themselves. For example, incorporating a local subject matter expert’s opinion into a model may help them to trust it. At the same time, the goals listed above are clearly ends in themselves.

At the core, explainability and trust are different because trust is an emotional issue compared to the fact-based issue of explainability, which simply means that you can identify the effect that individual predictors have had on the model’s output. Because they correspond to different sets of problems, they require different approaches to solve them.
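To make "the effect that individual predictors have had on the model's output" concrete, here is a minimal sketch for the simplest case, a linear model. The coefficients, feature names, and patient values are all hypothetical and chosen only for illustration; non-linear models need approximation methods such as LIME rather than this exact arithmetic.

```python
# Hypothetical linear model: coefficients and feature names are
# illustrative assumptions, not taken from any real dataset.
COEFFICIENTS = {"age": 0.5, "dose": -2.0, "weight": 0.1}
INTERCEPT = 10.0


def predict(row):
    """Return the linear model's prediction for one observation."""
    return INTERCEPT + sum(COEFFICIENTS[name] * row[name] for name in COEFFICIENTS)


def contributions(row, baseline):
    """Attribute the change in prediction from a baseline row to each predictor.

    For a linear model, a predictor's contribution is its coefficient
    times how far its value sits from the baseline value.
    """
    return {
        name: COEFFICIENTS[name] * (row[name] - baseline[name])
        for name in COEFFICIENTS
    }


baseline = {"age": 50, "dose": 1.0, "weight": 70}
patient = {"age": 60, "dose": 3.0, "weight": 80}

effects = contributions(patient, baseline)

# List predictors from largest to smallest absolute effect.
for name, effect in sorted(effects.items(), key=lambda kv: -abs(kv[1])):
    print(f"{name:>6}: {effect:+.1f}")

# The contributions add up to the gap between the two predictions.
gap = predict(patient) - predict(baseline)
```

An explanation like this is a purely factual artifact: it says which inputs moved the prediction and by how much, and nothing about whether the user will believe it.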

A first step for someone to trust a model will often be to begin to trust the person or organization presenting the model. That, in turn, will mean trying to understand not what the model is doing, but what that person or organization is doing—what do they hope to achieve by implementing this model?

In the ‘Trust Equation’ developed by Maister, Green, and Galford and promoted in their book ‘The Trusted Advisor’, this corresponds to the self-orientation denominator, which can derail the other positive factors.
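The Trust Equation is usually written as trust = (credibility + reliability + intimacy) / self-orientation. The short sketch below shows why the denominator matters so much. The 1 to 10 rating scale and the example scores are assumptions made for illustration, not values from the book.

```python
def trust_score(credibility, reliability, intimacy, self_orientation):
    """Trust Equation: (credibility + reliability + intimacy) / self-orientation."""
    if self_orientation <= 0:
        raise ValueError("self_orientation must be positive")
    return (credibility + reliability + intimacy) / self_orientation


# Hypothetical 1-10 ratings for the same team, differing only in how
# self-interested the audience perceives them to be.
focused_on_user = trust_score(8, 7, 6, self_orientation=2)
focused_on_self = trust_score(8, 7, 6, self_orientation=7)

print(f"low self-orientation:  {focused_on_user:.1f}")
print(f"high self-orientation: {focused_on_self:.1f}")
```

With identical credibility, reliability, and intimacy, a higher perceived self-orientation cuts the score sharply, which is the sense in which it can derail the other factors.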

More important than that, understanding the input factors for a model can open up a second wave of doubts about the model if those input factors don’t conform to the model user’s view of the world. Sometimes these doubts are warranted, which is why the step of using subject matter experts to validate the model’s explanation, referenced in passing above, is important.

Once we get to present how the model’s input relates to its output, we begin to rub up against people’s cognitive biases and preconceived ideas. Rightly or wrongly, people often won’t readily trust your model if it contradicts their existing ideas. To overcome these objections, you need to research both what their ideas are and the kinds of cognitive biases that will stop them from accepting alternatives.

Buster Benson’s ‘Cognitive Bias Cheat Sheet’ provides a useful map of some of the most important cognitive biases. His article summarizes and explains the biases according to the problem each one is trying to address.

In general, models are summaries and simplifications of what is happening, so some information needs to be left out. A human with their own preconceived idea of what is happening will soon find that what they want to see in the model is left out.

Going the other way, if every possible variable—and interaction—is added to the model, it will quickly become too complicated for a human to understand, so a balance needs to be struck, and ultimately the users of the models need to take some things on faith. At least they need to convince themselves that the small details they think are missing or incorrect don’t compromise the high-level picture.

There are a few ingredients needed to overcome these trust issues.

Firstly, your organization and its representatives have to engender trust by embodying the behaviors that the Trust Equation identifies as trustworthy. Without this initial foundation of trust, users won’t engage with your model or your other efforts to begin with.

Next, your models need to be sufficiently open and explainable to allow users to engage with them. How open they need to be will depend on the users and their context. There will be contexts where users will care almost solely about the model’s accuracy. In other contexts, the users will be highly interested in the detailed aspects of how the model’s inputs relate to the outputs. This may go beyond the model itself, and move into a need for ancillary visualizations and data to help users decide for themselves what is going on.

Next, you need to build a model that doesn’t needlessly collide with users’ existing beliefs and cognitive biases. For example, if there is an understanding among subject matter experts (SMEs) that a particular variable has a monotonic effect on the target variable in a specific direction, that’s how it needs to be in the model, unless there is a good explanation backed by an SME whom the others look up to.
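One way to test a model against such an expectation is to sweep the variable in question while holding the other inputs fixed and confirm that the predictions only move in the agreed direction. The sketch below assumes a model exposed as a plain Python function over a dictionary of inputs; both toy models and the helper name `is_monotonic_in` are hypothetical. It is a spot check at one reference point, not a proof over the whole input space.

```python
def is_monotonic_in(model, base_row, feature, grid, increasing=True):
    """Check that predictions move in one direction as `feature` sweeps `grid`.

    All other inputs are held at their values in `base_row`.
    """
    predictions = [model({**base_row, feature: value}) for value in sorted(grid)]
    pairs = zip(predictions, predictions[1:])
    if increasing:
        return all(later >= earlier for earlier, later in pairs)
    return all(later <= earlier for earlier, later in pairs)


# Hypothetical fitted models standing in for real ones.
def well_behaved_model(row):
    # Output always falls as age rises.
    return 100.0 - 0.8 * row["age"] + 5.0 * row["dose"]


def surprising_model(row):
    # Output falls with age up to 60, then rises again: a U-shape that
    # contradicts an "always decreasing" expectation.
    return 100.0 + 0.05 * (row["age"] - 60) ** 2 + 5.0 * row["dose"]


base = {"age": 50, "dose": 1.0}
ages = range(30, 91, 10)

ok = is_monotonic_in(well_behaved_model, base, "age", ages, increasing=False)
flagged = is_monotonic_in(surprising_model, base, "age", ages, increasing=False)
print(ok, flagged)
```

A failed check is a prompt for a conversation with the experts rather than an automatic verdict: either the model needs a constraint, or there is a real effect that deserves a well-supported explanation.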

The key is that there isn’t a silver bullet or royal road that leads straight from an ‘explainable’ model to users trusting and using your model. Instead, it’s an inherently messy and iterative process.

Robert de Graaf

Robert de Graaf began his career as an engineer, but switched to data science after discovering the power of statistics to solve real-world problems. He is Senior Data Scientist at RightShip and a founding partner of OutputAI Labs. He is the author of many articles on data science, focusing on techniques which ensure you tackle the right problem, as well as the author of the book 'Managing Your Data Science Projects' (Apress).