Not only is Machine Learning earning specialists a good salary, but algorithms are being used to make money. It is gaining notoriety for solving just about any problem, dramatically improving technology, breaking barriers, and even worrying some of us. Making money via machine learning tends to... Read more
A probability on its own is often an uninteresting thing. But when we can compare probabilities, that is when their full splendour is revealed. By comparing probabilities we are able to form judgements; by comparing probabilities we can exploit the elements of our world that are probable;... Read more
Scratch Viz – Documentation and Usage
Contents: Introduction, Audience, Getting Started, Data, Scratch Blocks, Example Projects. Introduction: "If you have built castles in the air, your work need not be lost; that is where they should be. Now put the foundations under them." (Henry David Thoreau; source: Why’s (Poignant) Guide to Ruby.) This... Read more
Machine Learning vs. Statistics
This was originally posted on the Silicon Valley Data Science blog and was co-written by Drew Hardin. The Texas Death Match of Data Science. Throughout its history, Machine Learning (ML) has coexisted with Statistics uneasily, like an ex-boyfriend accidentally seated with the groom’s family at a wedding reception: both uncertain... Read more
Ethics for powerful algorithms (3 of 4)
(Hi, all! Apologies for the long radio silence — my day job has been all-consuming. For those of you joining us for the first time, this series is about the controversies/risks/concerns around using algorithms in the criminal justice system. You might want to check out my first post here,... Read more
This blog post is about topic modeling using data from this blog, opendatascience.com. From this, combined with the most visited articles of the year, we will generate the most popular topics of 2017. Last year, we did something similar with popular articles streamed through Twitter using Non-Negative Matrix Factorization to... Read more
How To Create Data Products That Are Magical Using Sequence-to-Sequence Models
A tutorial on how to summarize text and generate features from GitHub Issues using deep learning with Keras and TensorFlow. Teaser: training a model to summarize GitHub Issues. Predictions are in rectangular boxes. The above results are randomly selected elements of a holdout set. Keep reading below; there will be a link... Read more
Watermain Breaks in the City of Toronto
It has been a while since my last post due to the major transition of moving back to Canada. This post will be a bit shorter than my previous ones but hopefully it will give some insight on practically investigating and analyzing open data that are... Read more
In a recent post, I offered a definition of the distinction between data science and machine learning: that data science is focused on extracting insights, while machine learning is interested in making predictions. I also noted that the two fields greatly overlap: I use both machine learning... Read more
We’re on the cusp of a new generation of better and more sophisticated intelligent agents. Intelligent agents are fast becoming ubiquitous in personal life and business, which means they are an important area of opportunity and interest for innovators. As entrepreneurs, designers, product managers, developers, and investors,... Read more