LIME is a machine learning tool that tackles one of the field's biggest issues: interpretability. You can think of interpretability as explaining how and why a model makes its predictions. In this age of the black box model, it may be worth opting for models that don't produce perfect predictions but can be understood.
Why opt for imperfect models? Decision trees and logistic regression will rarely beat XGBoost or a deep neural network on accuracy, but they are much simpler to understand. A decision tree visualizes its predictions as an actual tree of decisions, while a linear model produces coefficients that explain the relationship between each feature and the outcome.
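As a quick sketch of that interpretability, here is a logistic regression fit on synthetic data (the feature names and data are stand-ins, not the article's dataset); the fitted coefficients directly quantify how each feature pushes the prediction:

```python
# A minimal sketch on hypothetical data: a linear model's coefficients
# directly quantify each feature's relationship to the outcome.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=200, n_features=4, random_state=0)
model = LogisticRegression().fit(X, y)

# One coefficient per feature: the sign and magnitude show how each
# feature pushes the prediction toward one class or the other.
for name, coef in zip(["f0", "f1", "f2", "f3"], model.coef_[0]):
    print(f"{name}: {coef:+.3f}")
```

A complex model like XGBoost has no equivalent short list of numbers you can read off and explain.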
That ease of understanding helps significantly when explaining insights and model construction to business decision makers. Organizations aren't going to make major decisions based on models they don't understand. Yet the power of XGBoost and neural networks can provide insights that decision trees and regression models cannot. So how can we create simple explanations for complex models?
This is where LIME comes in.
LIME is a powerful tool that can change the way you conduct machine learning projects. It isn't, however, a general machine learning library like scikit-learn. LIME's purpose is to explain and interpret machine learning models such as neural networks and XGBoost.
In the section below, I've posted screenshots of my Jupyter Notebook, in which I demonstrate how to use LIME on a Spotify dataset. I built this model to predict whether or not I like a given song. If you have the time, I recommend doing a similar project based on songs you've tagged as ones you like or don't like. This will let you understand firsthand how powerful LIME is in these kinds of projects.
For a data science educator, the concept of "interpretability" is extremely important. Business leaders and students alike ask questions such as "why are we using these models?" or "why not just go with the most powerful models every time?"
The one-word answer to those questions is "interpretability." Simpler models are easier to interpret and explain. A decision tree literally illustrates how it makes a prediction, and the coefficients from a linear model quantify the relationship between the features and the target variable. Both provide direct views of the relationship between inputs and outputs.

You can't really do that with ensemble models or neural networks. Those models consist of hundreds of decision trees or tens of thousands of weights, far too many moving parts to examine and understand quickly enough to garner true insight. Yet they offer a power and precision that decision trees and regression models cannot match.

That is why LIME is such a powerful tool to have in your machine learning arsenal. It not only helps you get better at machine learning but also enables you as a data scientist to explain complex algorithms and models clearly and concisely to business decision makers and non-technical audiences alike. On your next project, give LIME a try.
LIME Demonstration in Jupyter Notebook with Spotify Dataset