Find Where to Park in Real Time Using OpenCV and TensorFlow
How many times have you driven around and around a parking lot searching for a spot? How convenient would it be if your phone could tell you exactly where the closest parking spot is! It turns out that this is... Read more
Why Word Vectors Make Sense in Natural Language Processing
If you’re up to date with progress in natural language processing research, you’ve probably heard of word vectors in word2vec. Word2vec is a neural network configuration that ingests sentences to learn word embeddings, or vectors of continuous numbers representing individual words. The neural network accepts a word, which is first mapped to a one-hot... Read more
An Idiot’s Guide to Word2vec Natural Language Processing
Word2vec is arguably the most famous face of the neural network natural language processing revolution. Word2vec provides direct access to vector representations of words, which can help achieve decent performance across a variety of tasks machines are historically bad at. For a quick examination of how word vectors work,... Read more
15 Common Mistakes Made By Newbie Data Scientists
Junior data scientists are flooding the field as more and more people are transitioning from other areas, some very loosely related to data-driven professions. As a result, there often is a disconnect with the skillsets these “newbies” bring to the table. After all, there is only so much that... Read more
Dewey Defeats Truman: How Sampling Bias can Ruin Your Model
In the 1948 election season, Thomas Dewey faced off against incumbent Harry Truman for the presidency, running on the Republican and Democratic tickets, respectively. The Chicago Daily Tribune, a Republican-leaning paper at the time, ran a poll forecasting the outcome of the election, with a decisive win for Dewey... Read more
Self-Driving Cars, Generated News Among Top October Research
Self-driving cars. Less biased crowdsourced data. Automatically generated historical accounts. These are some of the topics data science researchers across the world tackled and published to the arXiv research aggregator out of Cornell University Library in October. Learn about revelations researchers made and how they applied machine learning, deep... Read more
Building Neural Networks with Perceptron, One Year Later — Part III
This is the third part in a three-part series. The first part can be read here and the second part here. Inside Perceptron: Each neuron in a neural network will, at some point, have a value. Each weight (a link between neurons) will also have a value, all of which the user... Read more
Exploring the Central Limit Theorem in R
The Central Limit Theorem (CLT) is arguably the most important theorem in statistics. It’s certainly a concept that every data scientist should fully understand. In this article, we’ll go over some basic theory of the CLT, explain why it’s important for data scientists, and present some R code that... Read more
Tracking the Progress in Natural Language Processing
This post introduces a resource to track the progress and state of the art across many tasks in NLP. Research in machine learning and in natural language processing (NLP) is moving so fast these days that it is hard to keep up. This... Read more
Building a Custom Mask RCNN Model with TensorFlow Object Detection
Doing cool things with data! You can now build a custom Mask RCNN model using the TensorFlow Object Detection library! Mask RCNN is an instance segmentation model that can identify the pixel-by-pixel location of any object. This article is the second part of my popular post where I explain the basics of Mask RCNN... Read more