Research Note: What Are Natural Experiments? Methods, Approaches, and Applications
I enjoyed reading Craig et al.’s (2017) review article on Natural Experiments (An Overview of Methods, Approaches, and Contributions to Public Health Intervention Research). In this post, I want to summarize its key points and add some of my reflections on the development of causal inference. This review article... Read more
Build a First Neural Network
Neural networks are weirdly good at translating languages and identifying dogs by breed, but they can be intimidating to get started with. In an effort to smooth this on-ramp, I created a neural network framework specifically for teaching and experimentation. It’s called Cottonwood and this notebook shows how to... Read more
A Concrete Application of Topological Data Analysis
Today, I will present a Machine Learning application of Topological Data Analysis (TDA), a rapidly evolving field of data science that makes use of topology to improve data analysis. It is largely inspired by one of my projects. Great! Wait… what is TDA? I will start by briefly recalling the basics... Read more
From Idea to Insight: Using Bayesian Hierarchical Models to Predict Game Outcomes Part 2
What’s the best way to model the probability that one player beats another in a digital game a client of your employer designed? This is the second of a two-part series in which you’re a data scientist at a fictional mobile game development company that makes money by monetizing... Read more
Not Quite a Perfect Model Stack
In model building, the power of the majority can be a great thing. For the scholars of democracy, this does not refer to Alexis de Tocqueville’s tyranny of the majority. I apologize as that is probably a poor pun and may be a bit of a nerdy... Read more
Missing Data in Supervised Machine Learning
Editor’s note: Andras is a speaker for ODSC East 2020 this April 13-17! Be sure to check out his talk, “Missing Data in Supervised Machine Learning,” there. Datasets are almost never complete, and this can introduce various biases to your analysis. Due to these biases, your supervised machine learning... Read more
Major Updates to the Most Popular Data Science Frameworks in 2019
This time last year we brought you a detailed report of all the important updates for popular data science (machine learning and deep learning) frameworks throughout 2018. The developers of these frameworks continue to innovate at an accelerated rate. Data scientists demand more powerful tools in order to get... Read more
Top 7 Machine Learning Frameworks for 2020
Machine learning is a nightmare without some kind of structure. You can’t build everything from scratch, especially if you’re in a business setting. Even if you want to (and if you do, comment here and tell us about it!), you don’t have time in most cases. You need a... Read more
How To Manage Data Science Projects In 2020
Learning data science and doing it are two different things. At school, stats professors teach us how to curve-fit the “perfect” machine learning model but do not teach us how to be practical, how to manage a project, and how to listen to clients’ needs. [Related Article: Are You... Read more
Interpreting the 2020 Puerto Rico Earthquake Swarm with Data Science
This post describes the Puerto Rico earthquakes using visualizations, maps, time series, and Google Trends data. Since late December 2019 and into early January 2020, the southwestern region of the island of Puerto Rico has been experiencing a series, or swarm, of earthquakes, leaving in its wake a trail of destruction and... Read more