Mine Like Amazon with Market Basket Analysis

Pattern mining is an incredibly simple but powerful technique for discovering co-occurrences in large datasets. The most common approach to finding those patterns is Market Basket Analysis, which is frequently pointed to as the method Amazon leverages for its “users also purchased” feature.

Of course, that’s a dramatic oversimplification. Amazon certainly didn’t build a retail empire off of a single algorithm churning out recommendations; their feature almost certainly uses other statistical factors as well. However, Market Basket Analysis is still incredibly useful for practitioners, especially when it comes to evaluating qualitative data.

That is a huge value add for Market Basket Analysis (MBA). Most data scientists are comfortable sticking to numerical datasets, which is to be expected since the majority of problems we face regularly can be reduced to numerical solutions. Natural language processing is the closest to qualitative data we usually get, but even then we typically deploy machine learning anyhow. Methods like MBA can open a whole world of other possibilities for practitioners looking to expand their toolbox.

Without further ado, this is Market Basket Analysis and how to use it in the field.

Down to Brass Tacks

I’ve already mentioned that Market Basket Analysis is stupid simple. It really is: you’re effectively just looking at the likelihood of different elements occurring together. There’s more to it than that, but that’s the basis of this technique. We’re really just interested in learning how often things go together and how to predict when things will go together.

Imagine we had a collection of baskets with the following items:

  • Eggs, butter, bread
  • Eggs, bread, jam
  • Bread, butter, apples
  • Apples, eggs
  • Eggs

Our goal is to figure out which items predict the purchase of other items. Or, if we want to generalize from our example, we’re interested in what combinations of characteristics predict the presence of other characteristics.

Start by breaking out your baskets into a flat matrix, with one row per basket and one column per item.

               Eggs   Butter   Bread   Jam   Apples
Basket One      X       X        X      -      -
Basket Two      X       -        X      X      -
Basket Three    -       X        X      -      X
Basket Four     X       -        -      -      X
Basket Five     X       -        -      -      -
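If you want to build that matrix in code, here is a minimal sketch in plain Python. The names `baskets`, `items`, and `to_matrix` are made up for illustration, and the column order is simply chosen to match the table.

```python
# The five example baskets, each stored as a set of item names.
baskets = [
    {"eggs", "butter", "bread"},
    {"eggs", "bread", "jam"},
    {"bread", "butter", "apples"},
    {"apples", "eggs"},
    {"eggs"},
]

# Column order for the matrix (an assumption chosen to match the table).
items = ["eggs", "butter", "bread", "jam", "apples"]


def to_matrix(baskets, items):
    """Return one row per basket, True wherever the basket contains the item."""
    return [[item in basket for item in items] for basket in baskets]


matrix = to_matrix(baskets, items)

# Print the matrix with X for present and - for absent.
for row in matrix:
    print(" ".join("X" if cell else "-" for cell in row))
```

Storing each basket as a set keeps the membership check cheap, which matters once you move from five baskets to a few million.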

From this matrix, we can determine the likelihood of observing any pair of characteristics in the same example. Say we calculate it for butter and bread:

P(Butter, Bread) = (#Bread and Butter Baskets) / Total = 2/5

Pretty straightforward, right? The probability of co-occurrence for two items is just the fraction of examples that had both. In the context of MBA, we call this our support.
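As a rough sketch, support is only a few lines of Python. The `support` helper below is a hypothetical name, not part of any library, and it uses `Fraction` so the result stays an exact ratio.

```python
from fractions import Fraction

# The five example baskets, each stored as a set of item names.
baskets = [
    {"eggs", "butter", "bread"},
    {"eggs", "bread", "jam"},
    {"bread", "butter", "apples"},
    {"apples", "eggs"},
    {"eggs"},
]


def support(itemset, baskets):
    """Fraction of baskets that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for basket in baskets if itemset <= basket)
    return Fraction(hits, len(baskets))


print(support({"bread", "butter"}, baskets))  # 2/5
```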

That statistic is pretty useful in and of itself. But it would be even more helpful if we knew, out of the transactions that included bread, what fraction also included butter. In other words, we want to know how strongly the purchase of bread predicts the purchase of butter. This is called our confidence.

Conf(Bread → Butter) = P(Butter, Bread) / P(Bread) = (2/5) / (3/5) = 2/3

In other words, if we purchase bread, chances are that we will also purchase butter. We now have an association rule.
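Confidence can be sketched the same way: the support of both sides together, divided by the support of the left-hand side. The function names here are illustrative assumptions, not a library API.

```python
from fractions import Fraction

# The five example baskets, each stored as a set of item names.
baskets = [
    {"eggs", "butter", "bread"},
    {"eggs", "bread", "jam"},
    {"bread", "butter", "apples"},
    {"apples", "eggs"},
    {"eggs"},
]


def support(itemset, baskets):
    """Fraction of baskets that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for basket in baskets if itemset <= basket)
    return Fraction(hits, len(baskets))


def confidence(antecedent, consequent, baskets):
    """Of the baskets containing `antecedent`, the fraction that also contain `consequent`."""
    both = set(antecedent) | set(consequent)
    return support(both, baskets) / support(antecedent, baskets)


print(confidence({"bread"}, {"butter"}, baskets))  # 2/3
print(confidence({"butter"}, {"bread"}, baskets))  # 1
```

Note that the rule is directional. In this toy data, every basket with butter also has bread, but only two of the three bread baskets have butter.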

Let’s do one better. What if we can actually evaluate how much better our rules are at predicting the outcome than if we had just guessed based on the supports? This idea is called lift, and it can be thought of as how ‘valid’ an association rule is.

Lift(X → Y) = Confidence(X → Y) / P(Y) = P(X, Y) / (P(X) P(Y))

Lift is just the ratio of confidence to expected confidence: how often Y actually shows up in baskets that contain X, versus how often Y would show up if there were no association between the two. A lift above one indicates that our rule predicts the outcome better than that baseline, while a lift below one means the first item actually makes the second less likely.

What does that look like for our bread and butter?

Lift(Bread → Butter) = Confidence(Bread → Butter) / P(Butter) = (2/3) / (2/5) = 5/3 ≈ 1.67

Turns out that our association rule does a pretty good job at determining whether someone will buy butter, given that they’ve bought bread.
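To check the arithmetic, here is a small sketch that uses the standard definition of lift: the support of both sides together, divided by the product of their individual supports (equivalently, confidence divided by the support of the right-hand side). The helper names are assumptions made for this example.

```python
from fractions import Fraction

# The five example baskets, each stored as a set of item names.
baskets = [
    {"eggs", "butter", "bread"},
    {"eggs", "bread", "jam"},
    {"bread", "butter", "apples"},
    {"apples", "eggs"},
    {"eggs"},
]


def support(itemset, baskets):
    """Fraction of baskets that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for basket in baskets if itemset <= basket)
    return Fraction(hits, len(baskets))


def lift(antecedent, consequent, baskets):
    """Observed joint support relative to what independence would predict."""
    both = set(antecedent) | set(consequent)
    expected = support(antecedent, baskets) * support(consequent, baskets)
    return support(both, baskets) / expected


print(lift({"bread"}, {"butter"}, baskets))  # 5/3, above one
print(lift({"eggs"}, {"bread"}, baskets))    # 5/6, below one
```

Bread and butter come out above one, so they show up together more often than chance would suggest. Eggs and bread come out slightly below one, so in this tiny sample buying eggs makes bread a little less likely.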


Stupid Simple, Stupid Useful

Even though MBA is an extremely straightforward technique to use, its insights are powerful and can be applied to an incredible variety of data. Most notably, you can launch MBA against qualitative data that isn’t easily summed up with quantitative methods. For social scientists, psychologists, and demographers with the luxury of large datasets, this can be very powerful.

Market Basket Analysis can open new horizons for your analyses if you’re clever. Keep your eyes peeled for datasets that don’t neatly fit into our normal machine learning approaches and see if you can apply MBA to find some surprises.


Spencer Norris, ODSC

Spencer Norris is a data scientist and freelance journalist. He currently works as a contractor and publishes on his blog on Medium: https://medium.com/@spencernorris