Why the most influential business AIs will look like spellcheckers (and a toy example of how to build one)
Forget voice-controlled assistants. At work, AIs will turn everybody into functional cyborgs through squishy red lines under everything you type. Let’s look at a toy example I just built (mostly to play with deep learning along the way). I chose as a data set Patrick Martinchek’s collection of Facebook... Read more
An Open Source Triple Feature
Editor’s note: The following three experts shared their industry insight at OpenSec2017. Jen Andre, founder and CEO of Komand. At Komand, Jen empowers security teams to focus on efficient incident response and decision making by offering the automation of manual tasks, and a space to share this automation... Read more
In this post we will describe how to evaluate a predictive model. Why bother creating complex predictive models if 5% of the customers will churn anyway? Because a predictive model will rank our clients based on the probability that they will abandon the company. It helps answer these two questions: 1.... Read more
A survey of cross-lingual embedding models
In past blog posts, we discussed different models, objective functions, and hyperparameter choices that allow us to learn accurate word embeddings. However, these models are generally restricted to capture representations of words in the language they were trained on. The availability of resources, training data, and benchmarks in English... Read more
Cognitive Machine Learning: Prologue
Sources of inspiration are one thing we do not lack in machine learning. This is what, for me at least, makes machine learning research such a rewarding and exciting area to work in. We gain inspiration from our traditional neighbors in statistics, signal processing and control engineering, information theory and statistical physics.... Read more
Statistics, Simians, the Scottish, and Sizing up Soothsayers
A predictive model can be a parametrized mathematical formula, or a complex deep learning network, but it can also be a talkative cab driver or a slides-wielding consultant. From a mathematical point of view, they are all trying to do the same thing, to predict what’s going to happen,... Read more
How to visualize decision trees in Python
The decision tree classifier is the most popularly used supervised learning algorithm. Unlike other classification algorithms, the decision tree classifier is not a black box in the modeling phase. That means we can visualize the trained decision tree to understand how it will work for the given input... Read more
Generalizing Abstract Arrays: opportunities and challenges
Introduction: generic algorithms with AbstractArrays Somewhat unusually, this blog post is future-looking: it mostly focuses on things that don’t yet exist. Its purpose is to lay out the background for community discussion about possible changes to the core API for AbstractArrays, and serves as background reading and reference material... Read more
Drawing a map of distributed data systems
How we created an illustrated guide to help you find your way through the data landscape. Designing Data-Intensive Applications, the book I’ve been working on for four years, is finally finished, and should be available in your favorite bookstore in the next week or two. An incomplete beta (Early... Read more
More notebooks for Think Stats
More notebooks for Think Stats As I mentioned in the previous post, I am getting ready to teach Data Science in the spring, so I am going back through Think Stats and updating the Jupyter notebooks.  I am done with Chapters 1 through 6 now. If you are reading the book, you... Read more