Solving Merchant Attrition Using Machine Learning


Merchant attrition is a major problem not just in the financial industry but in any sector (substitute ‘merchants’ with ‘customers’ in a given business). The difficulty lies in identifying churn-probable merchants. According to a recent article from Womply Insights and Goldman Sachs, “Merchant acquirers lose an estimated $2 billion per year due to merchant attrition and spends more than $1 billion per year acquiring merchants to replace the ones they have lost.”

Many companies still have no predictive model for estimating when merchants are likely to leave. Instead, they take the traditional approach of relying on customer service to retain merchants. Everything is measured retrospectively, meaning retention plans are applied only after merchants have already left the organization.

Modern Approach

Data is everywhere, and companies collect user data every second. Using big data and newer technologies such as machine learning, we can address the merchant attrition problem before merchants leave. This can be framed as a supervised classification problem: we train a model on historical data so that it learns the patterns that preceded past churn, and the model then assigns each merchant a probability of leaving the organization. We can run this model periodically, such as monthly or quarterly, to identify churn-probable merchants and offer them retention plans.
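To make the idea of a supervised churn classifier concrete, here is a minimal sketch in plain Python. It fits a small logistic regression by gradient descent and then scores merchants with a churn probability. The feature names, merchant IDs, and training data are all hypothetical and invented for illustration; a real project would normally use a library implementation and far more data.

```python
import math

def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def train_logistic(X, y, lr=0.5, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n, n_features = len(X), len(X[0])
    w, b = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def churn_probability(w, b, x):
    """Probability that a merchant with features x will churn."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

# Hypothetical, already-scaled historical features per merchant:
# [drop in monthly volume, support tickets per month]
X_train = [[0.9, 0.8], [0.8, 0.6], [0.7, 0.9], [0.2, 0.1], [0.1, 0.3], [0.3, 0.2]]
y_train = [1, 1, 1, 0, 0, 0]  # 1 = merchant churned, 0 = merchant stayed

w, b = train_logistic(X_train, y_train)

# Score current merchants (hypothetical IDs) during a periodic run.
current = {"m-101": [0.85, 0.7], "m-102": [0.15, 0.2]}
scores = {merchant: churn_probability(w, b, x) for merchant, x in current.items()}
for merchant, p in scores.items():
    print(f"{merchant}: churn probability {p:.2f}")
```

The output of interest is the probability rather than a hard yes/no label, since that is what lets a retention team prioritize merchants.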

It’s not just about solving the problem; one of the most important things to keep in mind is to make sure that the problem does not occur again. This can happen only if we dive deep and understand the root cause of the problem. In a problem like merchant attrition, it is crucial to understand the underlying factors that are influencing the churn. Some machine learning algorithms, such as Random Forest, can rank the features that most influence the model's predictions. These features in turn point to the likely drivers of attrition in our problem. Now, let’s look at a recommended framework in detail to understand how this can be achieved using machine learning.
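As a rough illustration of how influential factors can be ranked, the sketch below uses permutation importance: shuffle one feature at a time and measure how much the model's accuracy drops. This is a model-agnostic technique swapped in for brevity, not the Random Forest importance mentioned above. The `predict` rule is a hypothetical stand-in for a trained model, and the feature names and data are invented.

```python
import random

def accuracy(predict, rows, labels):
    """Fraction of rows for which the prediction matches the label."""
    return sum(predict(r) == y for r, y in zip(rows, labels)) / len(rows)

def permutation_importance(predict, rows, labels, feature_names, repeats=10, seed=0):
    """Average accuracy drop when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = accuracy(predict, rows, labels)
    importance = {}
    for j, name in enumerate(feature_names):
        drops = []
        for _ in range(repeats):
            column = [r[j] for r in rows]
            rng.shuffle(column)
            shuffled = [r[:j] + [v] + r[j + 1:] for r, v in zip(rows, column)]
            drops.append(baseline - accuracy(predict, shuffled, labels))
        importance[name] = sum(drops) / repeats
    return importance

# Hypothetical stand-in for a trained model: flags a merchant as likely
# to churn when its transaction volume has dropped by more than 30%.
def predict(row):
    volume_drop, _years_active = row
    return 1 if volume_drop > 0.3 else 0

feature_names = ["volume_drop", "years_active"]
rows = [[0.05 * i, float(i % 4)] for i in range(20)]  # invented data
labels = [1 if r[0] > 0.3 else 0 for r in rows]

importance = permutation_importance(predict, rows, labels, feature_names)
ranked = sorted(importance.items(), key=lambda kv: kv[1], reverse=True)
for name, drop in ranked:
    print(f"{name}: mean accuracy drop {drop:.3f}")
```

A feature the model ignores shows no accuracy drop, while a feature it relies on shows a clear one. Note that a high importance indicates influence on the predictions, which is a starting point for root-cause analysis rather than proof of causation.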

Recommended Churn Analysis Framework

Machine learning can be an effective tool to solve the merchant attrition problem by identifying churn-probable merchants and listing the factors influencing the churn rate. The goal is to increase the retention of merchants and reduce the merchant attrition rate. Bitwise recommends using CRISP-DM (Cross Industry Standard Process for Data Mining) in combination with a sophisticated Bag of Models process, which uses the following phases.

Business Understanding:

  • The first stage in the CRISP-DM process is to understand the objective or problem statement that you are trying to solve.
  • Set objectives – describe the customer’s primary objective from a business perspective.
  • Business success criteria – describe the criteria for a successful outcome to the project from the business point of view. These should be specific and measurable; in our case, prediction of merchant churn probability with an agreed level of confidence.

Data Understanding:

  • Most companies have a Data Dictionary available. It is essential to have an understanding of all the fields collected for the project. 

Data Preparation:

  • Data cleaning – This is the most time-consuming step in the entire process. Each and every field from the data needs to be investigated and checked for inconsistencies. The following are some of the things to check for or perform on the data.
    • Missing Values
    • Outliers
    • Categorical to Numerical – one-hot encoding
    • Standardization or Normalization 
    • Feature Engineering
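The first four checks in the list above can be sketched with the Python standard library alone. This is a minimal illustration under assumed choices (median imputation, IQR-based clipping, z-score standardization); the right treatment depends on the field, and the column names and values here are hypothetical.

```python
from statistics import mean, median, pstdev, quantiles

def impute_median(values):
    """Replace missing entries (None) with the median of the observed values."""
    med = median(v for v in values if v is not None)
    return [med if v is None else v for v in values]

def clip_outliers_iqr(values, k=1.5):
    """Clip values that fall outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(values, n=4)
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return [min(max(v, low), high) for v in values]

def standardize(values):
    """Z-score: subtract the mean, divide by the population standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma if sigma else 0.0 for v in values]

def one_hot(values):
    """Encode each category as a 0/1 indicator vector (columns in sorted order)."""
    categories = sorted(set(values))
    return categories, [[1 if v == c else 0 for c in categories] for v in values]

# Hypothetical merchant fields: one numeric, one categorical.
monthly_volume = [1200.0, None, 900.0, 1100.0, 1000.0, 50000.0]
segment = ["retail", "restaurant", "retail", "online", "restaurant", "retail"]

filled = impute_median(monthly_volume)     # missing values
clipped = clip_outliers_iqr(filled)        # outliers
scaled = standardize(clipped)              # standardization
categories, encoded = one_hot(segment)     # categorical to numerical

print(categories)
print(encoded[0])
```

Clipping is only one way to handle outliers; an unusually large merchant may be a genuine data point worth keeping, which is why each field needs to be investigated individually.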


Modeling:

  • Model Building – Start the model building process by choosing appropriate algorithms for the business use case. Build multiple models with different algorithms and choose the model that performs best.
  • Hyperparameter tuning – Once the best model is chosen, optimize it by fine-tuning the available parameters to improve the model's performance.
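Hyperparameter tuning can be illustrated with a simple grid search: try each candidate value, score it on held-out validation data, and keep the best. The sketch below tunes `k` for a tiny k-nearest-neighbours classifier, chosen here only because it fits in a few lines; the article does not prescribe this algorithm, and the data is invented (including one deliberately mislabeled training point).

```python
def knn_predict(train_X, train_y, x, k):
    """Majority vote among the k nearest training points (squared Euclidean distance)."""
    nearest = sorted(
        zip(train_X, train_y),
        key=lambda pair: sum((a - b) ** 2 for a, b in zip(pair[0], x)),
    )[:k]
    votes = sum(label for _, label in nearest)
    return 1 if 2 * votes > k else 0

def validation_accuracy(train_X, train_y, val_X, val_y, k):
    """Accuracy on the validation set for a given k."""
    hits = sum(knn_predict(train_X, train_y, x, k) == y for x, y in zip(val_X, val_y))
    return hits / len(val_X)

def grid_search_k(train_X, train_y, val_X, val_y, grid):
    """Score every k in the grid and return the best one with all scores."""
    scores = {k: validation_accuracy(train_X, train_y, val_X, val_y, k) for k in grid}
    best_k = max(scores, key=scores.get)  # the first k with the top score wins ties
    return best_k, scores

# Invented, scaled features; the fourth training point carries a noisy label.
train_X = [[0.8, 0.8], [0.7, 0.8], [0.9, 0.7], [0.75, 0.75],
           [0.2, 0.2], [0.1, 0.3], [0.3, 0.1]]
train_y = [1, 1, 1, 0, 0, 0, 0]
val_X = [[0.76, 0.76], [0.85, 0.75], [0.2, 0.25], [0.15, 0.15]]
val_y = [1, 1, 0, 0]

best_k, scores = grid_search_k(train_X, train_y, val_X, val_y, grid=[1, 3, 5])
print(best_k, scores)
```

With `k=1` the noisy label misleads the model on one validation point, while larger values smooth it out. The same loop applies to any algorithm and parameter grid, ideally with cross-validation instead of a single validation split.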


Evaluation:

  • Assessment of data mining results – Summarize assessment results in terms of business success criteria, including a final statement regarding whether the project already meets the initial business objectives.


Deployment:

  • Deployment plan – Summarize your deployment strategy including the necessary steps and how to perform them.
  • Monitoring and maintenance plan – Summarize the monitoring and maintenance strategy, including the necessary steps and how to perform them.

Recommended Attributes 

It is extremely important to select the appropriate attributes for the model. Otherwise, you might end up using features that are not significant to the model, which increases its complexity and makes it computationally heavier without improving its predictions. Bitwise recommends the following attributes for the model building process for merchant attrition.

  • Merchant transactional data
  • Demographic data
  • Pricing information
  • Merchant account information

Guidelines for Success – Bag of Models Approach

When helping our clients better predict merchant churn, Bitwise uses a Bag of Models approach as outlined below.


  • Collect all required data and preprocess it; the steps involved are Data Cleaning and Feature Engineering.
  • Feed the preprocessed data to multiple machine learning algorithms. Each resulting model makes predictions based on its learned parameters.
  • Evaluate these predictions using evaluation metrics. For classification algorithms, these include Accuracy, Recall, Precision, and F1 score.
  • Based on these metrics, select the best-performing model as the Champion model and the second-best model as the Challenger.
  • Use the Champion model's predictions as the output.
  • Repeat the process on a monthly basis and generate new predictions. If at any point the Challenger model outperforms the Champion model on the metrics, use the Challenger model's predictions as the output.
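The evaluation and selection steps above can be sketched as follows. The metric formulas are the standard ones; ranking on F1 is an assumption made for this example, since the choice of deciding metric is up to the business. The model names and predictions are hypothetical.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for binary labels (1 = churned)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": (tp + tn) / len(pairs), "precision": precision,
            "recall": recall, "f1": f1}

def pick_champion_and_challenger(y_true, predictions_by_model, metric="f1"):
    """Rank models on one metric; best is Champion, second best is Challenger."""
    scores = {name: classification_metrics(y_true, preds)
              for name, preds in predictions_by_model.items()}
    ranking = sorted(scores, key=lambda name: scores[name][metric], reverse=True)
    return ranking[0], ranking[1], scores

def model_to_serve(champion, challenger, scores, metric="f1"):
    """On each periodic re-evaluation, switch only if the Challenger does better."""
    return challenger if scores[challenger][metric] > scores[champion][metric] else champion

# Hypothetical hold-out labels and per-model predictions.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
predictions_by_model = {
    "logistic_regression": [1, 1, 0, 0, 0, 0, 0, 1],
    "random_forest":       [1, 1, 1, 0, 0, 0, 1, 1],
    "gradient_boosting":   [1, 0, 0, 0, 0, 0, 0, 0],
}

champion, challenger, scores = pick_champion_and_challenger(y_true, predictions_by_model)
print("Champion:", champion, "Challenger:", challenger)
print("Serving:", model_to_serve(champion, challenger, scores))
```

In this invented data all three models have the same accuracy but different precision, recall, and F1, which is why churn models are usually not judged on accuracy alone when churners are a minority of merchants.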

Results Achieved

For one of our clients in the payments sector, we achieved the following results using the process outlined above.

  • Predicted the churn probability with 93.76% accuracy (using Logistic Regression model).
  • The client was able to offer targeted retention plans to churn-probable merchants, which helped lower the attrition rate by 6%.

Rather than trying to retain merchants after they have already made the decision to leave the organization, merchant acquirers are looking to use new technology to make better predictions and take proactive measures to keep the customer. Machine learning can help organizations make these predictions and deliver value to the business.

Machine learning provides the ability to predict the churn probability of each merchant, and to flag merchants whose probability exceeds a threshold that the business agrees upon. The retention team can use these individual probabilities to offer retention plans while the merchant is still with the organization. This ultimately helps lower the organization's attrition rate before merchants are lost.
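Turning per-merchant probabilities into a work list for the retention team can be as simple as the sketch below. The threshold of 0.6, the merchant IDs, and the scores are hypothetical; in practice the threshold would be agreed with the business based on retention capacity and the cost of a missed churner.

```python
def retention_targets(churn_scores, threshold=0.6):
    """Return (merchant, probability) pairs at or above the threshold, riskiest first."""
    flagged = [(m, p) for m, p in churn_scores.items() if p >= threshold]
    return sorted(flagged, key=lambda item: item[1], reverse=True)

# Hypothetical model output for four merchants.
churn_scores = {"m-101": 0.91, "m-102": 0.12, "m-103": 0.64, "m-104": 0.58}

targets = retention_targets(churn_scores, threshold=0.6)
for merchant, probability in targets:
    print(f"{merchant}: {probability:.0%} churn probability")
```

Lowering the threshold flags more merchants and catches more potential churners at the cost of more outreach; raising it does the opposite.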

Where to Start

There are many options to implement machine learning to solve the merchant attrition problem. Bitwise recommends working with a partner that offers industry experience and a proven approach to developing machine learning solutions.

Bitwise offers a variety of materials and case studies to provide a clear understanding of the best methods for using machine learning to deliver results. Visit www.bitwiseglobal.com to learn more.

About the authors:

Girish Kapur

Girish is Director of Enterprise Strategy & Analytics and Solutions Engineering at Bitwise and is responsible for shaping and growing the Analytics Data Engineering services business with a differentiated go-to-market strategy. He has successfully aligned enterprise strategy, cloud, big data, integration, and Bitwise's domain depth to enable a digital footprint that spans Advanced Analytics, Data Governance, AI, ML, and Intelligent Automation.



Bharat Prasad


Bharat is Director of Big Data and Cloud Computing Strategy and Solutions at Bitwise with extensive experience in architecting and developing scalable BI, Big Data, and Cloud Computing solutions. Bharat helps enterprises drive value by defining strategies that align with key business and technical objectives.




Nathan Nickels


As Head of Partnerships at Bitwise, Nathan is committed to helping IT and business leaders to bridge gaps between traditional BI and modern technologies, including big data and cloud tools. Nathan works with Bitwise partners to coordinate solutions that best meet customer requirements while minimizing costs and driving business value.


ODSC Community

The Open Data Science community is passionate and diverse, and we always welcome contributions from data science professionals! All of the articles under this profile are from our community, with individual authors mentioned in the text itself.