AI-Identified Health Policies, Hate Speech Detection Among September Industry Research
September has been an impressive month for data science research. Here, we highlight a few innovative and explosive studies released on the arXiv research aggregator out of Cornell University Library. This research dives into some of the most important facets of data science today, including deep learning, machine learning,... Read more
Analyze a Soccer Game Using Tensorflow Object Detection and OpenCV
Introduction The World Cup season is here and off to an interesting start. Who would have thought the reigning champions, Germany, would be eliminated in the group stage? 🙁 For the data scientist within you, let's use this opportunity to do some analysis on soccer clips. With the use of deep learning... Read more
Mastering the Mystical Art of Model Deployment
With all the talk about algorithm selection, hyperparameter optimization and so on, you could think that training models is the hardest part of the machine learning process. However, in my experience, the really tricky step is to deploy these models safely in a web production environment. In this... Read more
Mapping the Shifting Constellations of Online Debate
Online conversations, especially around contentious topics, are complex and dynamic. Mapping them is not just a matter of gathering enough data and applying sophisticated algorithms. It’s critical to adjust the map to the questions you want to answer; like models in general, no map is true, but some are... Read more
Learning with A/B Testing
I’ve spent the last 6 years of my life heavily involved in A/B testing and other testing methodologies. Whether it was the performance of an email campaign to drive health outcomes, product changes, or website changes, the list of examples goes on. A few of these tests have been full factorial... Read more
How Well Did Data Scientists Predict the 2018 World Cup? (Hint: Not Very)
This year’s World Cup in Russia was the most watched sporting event in history. GlobalWebIndex reports that up to 3.4 billion people, around half of the world’s population, watched some part of the tournament. As with past World Cups, a global prediction market emerged allowing spectators to... Read more
A Real World Reinforcement Learning Research Program
We are hiring for reinforcement-learning-related research at all levels and all MSR labs. If you are interested, apply, talk to me at COLT or ICML, or email me. More generally though, I wanted to lay out a philosophy of research which differs from (and plausibly improves on) the current prevailing mode. DeepMind and OpenAI have popularized an... Read more
My Latent Dissatisfaction with Modern ML
It took reading Judea Pearl’s “The Book of Why”, and Jonas Peters’ mini-course on causality, for me to finally figure out why I had this lingering dissatisfaction with modern machine learning. It’s because modern machine learning (deep learning included) is most commonly used as a tool in the service... Read more
New Approximate Nearest Neighbor Benchmarks
As some of you may know, one of my side interests is approximate nearest neighbor algorithms. I’m the author of Annoy, a library with 3,500+ stars on GitHub as of today. It offers fast approximate search for nearest neighbors with the additional benefit that you can load data super fast... Read more
A Different Use of Time Series to Identify Seasonal Customers
I had previously written about creatively leveraging your data using segmentation to learn about a customer base. The article is here. In the article I mentioned utilizing any data that might be relevant. Identifying customers with seasonal usage patterns was one of the variables I mentioned that sounded interesting. ... Read more
Open Data Science - Your News Source for AI, Machine Learning & more