From Idea to Insight: Using Bayesian Hierarchical Models to Predict Game Outcomes Part 1
Imagine you’re a data scientist at an online mobile multiplayer competition platform. Your bosses have a vested interest in paying people with our skillset to predict game outcomes for a variety of commercial applications they profit from: for example, setting odds and sharing better insights with game developers on... Read more
Experiment Management: How to Organize Your Model Development Process
Let me share a story that I’ve heard too many times. “So I was developing a machine learning model with my team and within a few weeks of extensive experimentation, we got promising results… unfortunately, we couldn’t tell exactly what performed best because we didn’t track feature versions, didn’t record... Read more
Best Deep Reinforcement Learning Research of 2019
Since my mid-2019 report on the state of deep reinforcement learning (DRL) research, much has happened to accelerate the field further. Read my previous article for a bit of background, a brief overview of the technology, and a comprehensive survey paper reference, along with some of the best research papers at that... Read more
Top 10 AI Chatbot Research Papers from arXiv.org in 2019
AI chatbots are a hot commodity right now and they constitute a fertile area of research for machine learning. Researchers from all over the globe are working hard to push the envelope for what we can expect from chatbots. In this article, I’ve scoured the arXiv.org pre-print server for... Read more
How to Use Excel in Data Science for 2020
Wait, don’t leave! Excel has a terrible reputation in data science, and there is about 20 years’ worth of literature cautioning against the use of Excel in data science. There are better, faster, more agile programs that spit fancier representations and offer cooler capabilities. And here’s the lowly Excel... Read more
Decision Intelligence and Why Goldilocks Made Bad Choices
Spoiler: Goldilocks survives. Let us first start by saying things could have turned out much worse. Really worse. Don’t walk into a stranger’s house and eat their food and nap in their house. This is just a no-brainer! In the real world when we make decisions there is a much... Read more
Introduction to Spark NLP: Foundations and Basic Components
Veysel is a speaker for ODSC East 2020 this April 13-17! Be sure to check out his talk, “Spark NLP for Healthcare: Lessons Learned Building Real-World Healthcare AI Systems,” there! This is the first article in a series of blog posts to help Data Scientists and NLP practitioners... Read more
Using the CNN Architecture in Image Processing
Convolutional Neural Networks (CNNs) leverage spatial information, and they are therefore well suited for classifying images. These networks use an ad hoc architecture inspired by biological data taken from physiological experiments performed on the visual cortex. Our vision is based on multiple cortex levels, each one recognizing more and... Read more
Using the NGBoost Algorithm
Data scientists competing in Kaggle competitions often come up with winning solutions using ensembles of advanced machine learning algorithms. One particular model that is typically part of such ensembles is Gradient Boosting Machines (GBMs). Gradient boosting is a machine learning method used for the solution of regression and classification... Read more
24 Evaluation Metrics for Binary Classification (And When to Use Them)
Not sure which evaluation metric you should choose for your binary classification problem? After reading this blog post you should have a good idea. You will learn about a bunch of common and lesser-known evaluation metrics and charts to understand how... Read more