The Intuition Behind Uplift Modeling
Businesses pay close attention to the estimated return on investment (ROI) from the marketing efforts (e.g., promotions, communications) they employ. Insights into the impact of marketing on their customers are critical to optimizing ROI. A/B testing is a well-understood and commonly applied approach to measuring the impact of...
Improving Data Quality for Superior Results
When you’re a data scientist, you see a problem, and you build a model to solve it. If it’s not as accurate as you were hoping, you tweak the model. But what if it’s the quality of your data causing skewed or flawed results? Kaitlin Andryauskas of Wayfair wants...
Introduction to GPT-3
Natural Language Processing (NLP) has become the darling of the deep learning community in the past several years and is now an accelerating area of research. There have been significant gains over this time with many NLP tasks and benchmarks going through a two-step process: training with a number...
Could Your Machine Learning Model Survive the Crisis: Monitoring, Diagnosis, and Mitigation Part 1
As the world changes rapidly around us, it is often questionable whether what we learned from the past is still valid. Machine learning models that predict the future from past data points are probably under the most scrutiny from businesses in the current climate. Close monitoring...
Building a Production-Level Data Pipeline Using Kedro
Suppose you are a self-taught data scientist who does not have much experience in software development. One morning, your senior executive asks you to provide an ad hoc analysis (perks of the job), and when you do, she thanks you for delivering useful insights for her planning. Great! Three...
10 Compelling Machine Learning Ph.D. Dissertations for 2020
As a data scientist, an integral part of my work in the field revolves around keeping current with research coming out of academia. I frequently scour arXiv.org for late-breaking papers that show trends and reveal fertile areas of research. Other sources of valuable research developments are in the form...
Why Causation Matters in Data Science
Inferring causality is vital to deriving actionable insights in product data science, much as it is in more established fields like public policy. Without understanding the causal impact, we cannot make influential product changes that will alter outcomes or behaviors in line with product or policy goals. In my experience, because of an...
An Intro to Gradual Magnitude Pruning (GMP)
Welcome to Part 2 in Neural Magic’s five-part blog series on pruning in machine learning. In case you missed it, Part 1 gave a pruning overview, detailed the difference between structured and unstructured pruning, and described commonly used algorithms, including GMP. Few algorithms are better than GMP in overall results, and none beat...
How to Explain Your ML Models?
Explainability in machine learning (ML) and artificial intelligence (AI) is becoming increasingly important. With the increased demand for explanations and the number of new approaches out there, it can be difficult to know where to start. In this post, we will get hands-on experience in explaining an ML model...
Before Probability Distributions
I decided to write this introduction to probability distributions with one clear purpose in mind: to explain why we use them, and to apply real-life examples. When learning probability, I got tired of hearing about coin tosses, card games and numbered balls. Unless you only love to gamble (which is...