The Accuracy vs Interpretability Trade-off Is a Lie

Why, if we look at the bigger picture, black-box models are not more accurate

Photo by Nathan Cima on Unsplash

When I started as a data scientist, I expected to use state-of-the-art models: XGBoost, neural networks. These models are complex and interesting, and surely they would drive improvements. Little did I know, they faced a hurdle: explaining them to other people.

Who’d have thought you need to understand the decisions your automated systems make?

To my joy, I stumbled down the rabbit hole of model-agnostic methods. With these, I could have the best of both worlds: I could train black-box models and then explain them using methods like SHAP, LIME, PDPs, ALEs and Friedman’s H-statistic. We no longer need to trade accuracy for interpretability!
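To make that workflow concrete, here is a minimal sketch of the idea, not code from this article. It assumes the xgboost and shap libraries are installed and uses scikit-learn’s California housing data purely as a stand-in dataset: fit a black-box model first, then explain it after the fact with SHAP.

```python
# Illustrative sketch: train a "black-box" model, then explain it post hoc.
# The dataset, model and hyperparameters are placeholders, not this article's.
import shap
import xgboost
from sklearn.datasets import fetch_california_housing
from sklearn.model_selection import train_test_split

# Load a tabular regression dataset and hold out a test set
X, y = fetch_california_housing(return_X_y=True, as_frame=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Train the black-box model
model = xgboost.XGBRegressor(n_estimators=200, max_depth=4)
model.fit(X_train, y_train)

# Explain it afterwards with SHAP (post-hoc explanation)
explainer = shap.Explainer(model)
shap_values = explainer(X_test)
shap.plots.beeswarm(shap_values)  # global view of feature contributions
```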

Not so fast. That thinking is flawed.

In our pursuit of the best performance, we often miss the point of machine learning: making accurate predictions on new, unseen data. Let’s discuss why complex models are not always the best way of achieving this, even if we can explain them using other methods.

What is the accuracy vs interpretability trade-off?