The State of Enterprise NLP in 2020
2020 has been a unique year for public health, professional life, the economy, and just about every other aspect of daily life. While some businesses are closing their doors and others are pivoting their business models, those that haven’t taken a hit are a rare breed. Despite this,... Read more
Understanding the Temporal Difference Learning and its Prediction
The temporal difference learning algorithm was introduced by Richard S. Sutton in 1988. The method became popular because it combines the advantages of dynamic programming and the Monte Carlo method. But what are those advantages? This article is an excerpt from the... Read more
Improving Online Experiment Capacity by 4X with Parallelization and Increased Sensitivity
Originally posted here by DoorDash. Data-driven companies measure real customer reactions to determine the efficacy of new product features, but the inability to run these experiments simultaneously and on mutually exclusive groups significantly slows down development. At DoorDash, we utilize data generated from user-based experiments to... Read more
Solving for Unobserved Data in a Regression Model Using a Simple Data Adjustment
Article originally posted here by DoorDash. Making accurate predictions when historical information isn’t fully observable is a central problem in delivery logistics. At DoorDash, we face this problem in the matching between delivery drivers on our platform, whom we call Dashers, and orders in real time. The... Read more
Getting Started with Dask and SQL
Lots of people talk about “democratizing” data science and machine learning. What could be more democratic — in the sense of widely accessible — than SQL, PyData, and scaling data science to larger datasets and models? Dask is rapidly becoming a go-to technology for scalable computing.... Read more
The Modern Data Scientist at Netflix: Modeling and Tools in Unstable Environments
Netflix is one of the most recognized names not only in the world but also in data science, and Dr. Becky Tucker knows all too well the power of data and what it can tell us. Whether it... Read more
Powerful, Open Source, and Completely Free? HPCC Systems is the Real Deal for Data Lakes
We invite you to learn more about the powerful, open-source HPCC Systems. Our comprehensive, dedicated data lake platform makes combining different types of data easier and faster than competing platforms — even data stored in massive, mixed schema data lakes — and it scales very quickly... Read more
The A – Z of Supervised Learning, Use Cases, and Disadvantages
Analyzing and classifying massive amounts of data is often tedious work for data scientists. It consumes most of their time and decreases their efficiency. Data scientists need to be smart, use cutting-edge technologies, take calculated risks, and find out... Read more
Deepnote – A Better Data Science Notebook
With the proliferation of data, notebooks have gained popularity in both academia and industry as intuitive tools enabling code writing and execution, visualization, and insight sharing – all within one interface. Notebooks are now the go-to tool for data scientists for exploratory programming but come with their... Read more
Introduction to Observability
When an error occurs within a system, how do we know what error occurred and how to fix it? It has to do with observability — the amount of data that the system itself gives us regarding the context of the error, where it occurred, and... Read more