The History of Neural Networks and AI: Part III
This is the third and final article in a three-part series about the history of neural networks and artificial intelligence. To view the first article, which dives into the earliest developments of artificial intelligence, click here. For a better picture of how neural networks and... Read more
The open-source project, Data Science Live Book, is now available!
For those just beginning to embark on a data science career, data scientist Pablo Casas’ Data Science Live Book offers a guided path into the nitty-gritty of the field. Casas presently works as a Machine Learning Specialist at Auth0.com, and his book sheds light on many... Read more
Graph Algorithms and Software Prefetching
A lot of data in the real world can be represented as graphs: you have nodes connected through edges. For example, you are a node in a graph where friendships are edges. I recently met with Professor Semih Salihoglu, an expert in graph algorithms and databases. We... Read more
Robustness and Tests for Equal Variance
The two-sample t-test is a way to test whether two data sets come from distributions with the same mean. I wrote a few days ago about how the test performs under ideal circumstances, as well as less than ideal circumstances. This is an analogous post for testing whether two... Read more
Ready Made Plots Make Work Easier
A while back Simon Jackson and Kara Woo shared some great ideas and graphs on grouped bar charts and density plots (link). Win-Vector LLC’s Nina Zumel just added a graph of this type to the development version of WVPlots. Nina has, as usual, some great documentation here. More and more I am finding when you... Read more
The History of Neural Networks and AI: Part II
This is the second article in a three-part series about the history of neural networks and artificial intelligence. To view the first article, click here. The History of Neural Networks, Continued: After the beginning era of AI, a British researcher specializing in artificial intelligence, Donald... Read more
Data Visualization Throughout the Data Science Workflow: Part 1
Scientists, including data scientists, often focus on numerical methods and analyses, glossing over visualization and communication. We want to believe that “the data speaks for itself.” However, data visualization is an essential tool in a data scientist’s toolbox. Data visualization allows you to see patterns that would... Read more
Demystifying Black-Box Models with SHAP Value Analysis
As an Applied Data Scientist at Civis, I implement the latest data science research to solve real-world problems. We recently worked with a global tool manufacturing company to reduce churn among their most loyal customers. A newly proposed tool, called SHAP (SHapley Additive exPlanation) values, allowed us... Read more
“I hate math!” – Education and Artificial Intelligence to find a meaning
Well, what you hate is the way that math was taught to you. That soup of equations, abstractions, and solutions to problems that we don’t know. It’s hard to enjoy the things we don’t feel part of. But how about relating some math techniques from the... Read more
An Overview of Proxy-label Approaches for Semi-supervised Learning
Note: Parts of this post are based on my ACL 2018 paper Strong Baselines for Neural Semi-supervised Learning under Domain Shift with Barbara Plank. Table of contents: Self-training, Multi-view training, Co-training, Democratic Co-learning, Tri-training, Tri-training with disagreement, Asymmetric tri-training, Multi-task tri-training, Self-ensembling, Ladder networks, Virtual Adversarial Training, Π model, Temporal... Read more