LIME Can Make You Better at Machine Learning

LIME is a crucial machine learning tool that tackles one of the biggest issues in machine learning: interpretability. You can think of interpretability as explaining how and why a model makes its predictions. In this age of the super black box model, it may be worth opting for models that don't produce perfect predictions but can be understood.

Why opt for imperfect models? Decision trees and logistic regression won't ever beat XGBoost or a deep neural net in terms of accuracy, but they are much simpler to understand. Decision trees visualize predictions with an actual tree of decisions, while linear models produce coefficients that explain the relationship between features and the outcome.
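To make that concrete, here is a minimal sketch of reading an explanation straight off a linear model's coefficients, using scikit-learn's bundled breast-cancer dataset as a stand-in for any tabular problem:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Fit an interpretable linear model on a small tabular dataset
data = load_breast_cancer()
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(data.data, data.target)

# Each coefficient quantifies a feature's influence on the log-odds:
# positive values push toward class 1 (benign), negative toward class 0.
coefs = model.named_steps["logisticregression"].coef_[0]
top = sorted(zip(data.feature_names, coefs), key=lambda p: abs(p[1]), reverse=True)
for name, coef in top[:5]:
    print(f"{name}: {coef:+.3f}")
```

The entire "explanation" is just a ranked list of coefficients, which is exactly the kind of transparency a black box model can't offer on its own.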

The ease of understanding decision trees and logistic regressions can significantly help when it comes to explaining insights and model construction to business decision makers. Organizations aren't going to make huge decisions based on models that they don't entirely understand. But the power offered by XGBoost and neural nets can provide insights that decision trees and regression models cannot. So, how can we create simple explanations for complex models?

This is where LIME comes in.

LIME is an amazing machine learning tool that has the power to change the way you conduct machine learning projects. It isn't, however, a machine learning package or library like scikit-learn. LIME's purpose is to explain and interpret machine learning models such as neural networks, XGBoost, and others.

In the section below, I've posted screenshots of my Jupyter Notebook in which I demonstrate how to use LIME on this Spotify dataset. I created this model to train a machine learning algorithm to predict whether or not I like a song. If you have the time, I recommend doing a similar project based on songs that you've tagged as ones you like or don't like. This will allow you to understand firsthand how powerful LIME is in these types of projects.
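It also helps to see what LIME is doing under the hood: it perturbs one instance, queries the black box model on the perturbed samples, and fits a distance-weighted linear surrogate around that point. Here is a from-scratch sketch of that local-surrogate step, using synthetic data as a stand-in for a "like / don't like" song classifier:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import Ridge

# Synthetic stand-in for the Spotify "like / don't like" task
X, y = make_classification(n_samples=500, n_features=6, random_state=0)
black_box = GradientBoostingClassifier(random_state=0).fit(X, y)

# Perturb one instance and query the black box on the neighborhood
rng = np.random.default_rng(0)
instance = X[0]
perturbed = instance + rng.normal(scale=0.5, size=(1000, X.shape[1]))
probs = black_box.predict_proba(perturbed)[:, 1]

# Weight neighborhood samples by proximity to the original instance
dists = np.linalg.norm(perturbed - instance, axis=1)
weights = np.exp(-(dists ** 2) / 2.0)

# The weighted linear surrogate's coefficients are the local explanation
surrogate = Ridge(alpha=1.0).fit(perturbed, probs, sample_weight=weights)
for i, coef in enumerate(surrogate.coef_):
    print(f"feature_{i}: {coef:+.4f}")
```

The key design choice is that the surrogate only needs to be faithful near the one instance being explained, which is what lets a simple linear model "explain" an arbitrarily complex one.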

For a data science educator, the concept of "interpretability" is extremely important. Business leaders and students alike ask questions such as "why are we using these models?" or "why not just go with the super models every time?"

The one-word answer to those questions is "interpretability." It is easier to interpret and explain simpler models. A decision tree literally illustrates how it makes a prediction, and the coefficients from a linear model quantify the relationship between the features and the target variable. Both decision trees and linear models provide direct views of the relationships between inputs and outputs. You can't really do this with ensemble models and neural nets, which are comprised of hundreds of decision trees or tens of thousands of coefficients, respectively. They have too many moving parts to examine and understand in a short amount of time and garner any true insights. Yet these models offer a power and precision that cannot be found in decision trees or regression models.

That is why LIME is such a powerful tool to have in your machine learning arsenal. It not only helps you get better at machine learning, but also enables you as a data scientist to explain complex algorithms and models concisely and clearly to business decision makers and non-technical audiences alike. On your next project, give LIME a try.

LIME Demonstration in Jupyter Notebook with Spotify Dataset


George McIntire, ODSC


I'm a journalist turned data scientist/journalist hybrid. Looking for opportunities in data science and/or journalism. Impossibly curious and passionate about learning new things. Before completing the Metis Data Science Bootcamp, I worked as a freelance journalist in San Francisco for Vice, Salon, SF Weekly, San Francisco Magazine, and more. I've referred to myself as a 'Swiss-Army knife' journalist and have written about a variety of topics ranging from tech to music to politics. Before getting into journalism, I graduated from Occidental College with a Bachelor of Arts in Economics. I chose to do the Metis Data Science Bootcamp to pursue my goal of using data science in journalism, which inspired me to focus my final project on being able to better understand the problem of police-related violence in America. Here is the repo with my code and presentation for my final project:
