Creating A Data-Driven Retail Expansion Framework
You’ve opened a business and it’s grown. You opened one or two more locations in places that you thought would be a good fit; maybe you’re Starbucks and have opened thousands more. One of the most important questions a retail entrepreneur or business faces is where... Read more
Top 14 NLP Job-Ready Skills for 2021
NLP was one of the hottest skills in 2019 and 2020 for good reason. Companies have a lot of text to work with and many applicants to apply it across the business. We will discuss the top applications of NLP in part II of this two-part... Read more
The Psychic Syndrome: How the Data Science Community Forgot About the Data
When scrolling through social media in March of this year, I could not help but notice the overwhelming number of data science projects on COVID-19. At some point, it seemed that LinkedIn and Twitter consisted of nothing but forecasts of how the pandemic might play out... Read more
How Good are the Visualization Capabilities of Microsoft Power BI?
Power BI offers a vast number of visuals, and this article gives an overview of Microsoft Power BI's data visualization capabilities and how to create most of those visuals. This article is an excerpt from the book Microsoft Power BI Quick Start Guide, Second Edition by Devin Knight, Mitchell Pearson, Bradley Schacht, and Erin Ostrowsky, a book that provides an... Read more
Using a Human-in-the-Loop to Overcome the Cold Start Problem in Menu Item Tagging
Originally posted here at Doordash, reposted with permission. Companies with large digital catalogs often have lots of free-text data about their items, but very few actual labels, making it difficult to analyze the data and develop new features. Building a system that can support machine learning... Read more
Understanding the Mechanism and Types of Recurrent Neural Networks
There are numerous machine learning problems in life that depend on time. For example, in financial fraud detection, we can’t just look at the present transaction; we should also consider previous transactions so that we can model based on their discrepancy. Using machine learning to solve such problems is called sequence learning, or sequence... Read more
To be an outstanding data scientist or ML engineer, it doesn’t suffice to only know how to use ML algorithms via the abstract interfaces that the most popular libraries (e.g., scikit-learn, Keras) provide. To train innovative models or deploy them efficiently in production, an in-depth appreciation... Read more
NVIDIA Makes Training GANs Easier with Fewer Images
NVIDIA is closing out 2020 on a strong note with a new method for training GANs that requires significantly less data than current methods. Instead of using hundreds of thousands of images to train efficient GANs with high rates of accuracy, their new technique, adaptive discriminator... Read more
COVID Tracking Project Enhancements to Johns Hopkins Case/Fatality Data
Like many analytics geeks, I’ve been tracking data on the COVID pandemic since early spring. My main source is the Center for Systems Science and Engineering at Johns Hopkins University, with files for download made available at midnight Central time. I’ve established a pretty significant R infrastructure in... Read more
Supply Path Optimization in Video Advertising Landscape
With people moving away from traditional TV and the fact that almost two-thirds of households have a Smart TV, e-Marketers predict that over 80% of online ads will be video in the next couple of years. It’s a smart move by advertisers to slowly shift advertising... Read more