Making a machine learning model usually takes a lot of crying, pain, feature engineering, suffering, training, debugging, validation, desperation, testing, and a little bit of agony. After all that, we deploy the model and use it to make predictions on future data. We can run our... Read more
Deep Learning Research Review Week 3: Natural Language Processing
This is the 3rd installment of a new series called Deep Learning Research Review. Every couple of weeks or so, I’ll be summarizing and explaining research papers in specific subfields of deep learning. This week focuses on applying deep learning to Natural Language Processing. The last post was Reinforcement Learning... Read more
You weren’t supposed to actually implement it, Google
Last month, I wrote a blog post warning that, if you follow popular trends in NLP, you can easily and accidentally make a classifier that is pretty racist. To demonstrate this, I included very simple code as a “cautionary tutorial.” The post got a fair amount... Read more
A decade of using text-mining for citation function classification
Academic work is typically filled with references to previous work. Unfortunately, most of these references have tangential relevance at best. Thus you cannot trust that a paper that cites another actually “builds on it”. A more likely scenario is that the authors of the latest... Read more
Linked Data and Data Science
Understanding Gender Roles in Movies with Text Mining
I have a new visual essay up at The Pudding, using text mining to explore how women are portrayed in film. In April 2016, we broke down film dialogue by gender. The essay presented an imbalance in which men delivered more lines than women across 2,000 screenplays. But... Read more
Natural Language Processing in a Kaggle Competition for Movie Reviews
I decided to try playing around with a Kaggle competition. In this case, I entered the “When bag of words meets bags of popcorn” contest. This contest isn’t for money; it is just a way to learn about various machine learning approaches. The competition was trying to showcase... Read more
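The “bag of words” approach the contest is named after is simple to sketch: each review becomes a vector of word counts over a shared vocabulary, discarding word order. A minimal illustration (the review texts here are made up; the actual contest used IMDB reviews):

```python
from collections import Counter

# Hypothetical mini-corpus; the real contest data is 25,000 labeled IMDB reviews.
reviews = ["a great great movie", "a dull movie"]

def bag_of_words(docs):
    """Represent each document as word counts over a shared, sorted vocabulary."""
    vocab = sorted({w for d in docs for w in d.split()})
    counts = [Counter(d.split()) for d in docs]
    return vocab, [[c[w] for w in vocab] for c in counts]

vocab, X = bag_of_words(reviews)
print(vocab)  # ['a', 'dull', 'great', 'movie']
print(X)      # [[1, 0, 2, 1], [1, 1, 0, 1]]
```

These count vectors are what a classifier is then trained on; in practice a library vectorizer (e.g. scikit-learn’s `CountVectorizer`) does the same job with tokenization and vocabulary pruning built in.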
Word2Vec – the world of word vectors
Have you ever wondered how a chatbot can learn the meaning of words in a text? Does this sound interesting? Well, in this blog post we will describe a very powerful method, Word2Vec, that maps words to numbers (vectors) in order to easily capture and distinguish their meaning. We will briefly describe how Word2Vec... Read more
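The core intuition behind “words as vectors” can be shown without the neural network: words that appear in similar contexts get similar vectors. This toy sketch uses raw co-occurrence counts rather than Word2Vec’s learned embeddings, and the corpus is invented, but the resulting similarity behaves the same way in spirit:

```python
from math import sqrt

# Toy corpus; Word2Vec learns dense vectors with a neural network,
# here we approximate the idea with sparse co-occurrence counts.
corpus = [
    "the cat sat on the mat",
    "the dog sat on the rug",
    "the cat chased the dog",
]

def cooccurrence_vectors(sentences, window=2):
    """Map each word to a vector of co-occurrence counts within a context window."""
    vocab = sorted({w for s in sentences for w in s.split()})
    index = {w: i for i, w in enumerate(vocab)}
    vectors = {w: [0] * len(vocab) for w in vocab}
    for s in sentences:
        words = s.split()
        for i, w in enumerate(words):
            for j in range(max(0, i - window), min(len(words), i + window + 1)):
                if i != j:
                    vectors[w][index[words[j]]] += 1
    return vectors

def cosine(u, v):
    """Cosine similarity between two count vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v)))

vecs = cooccurrence_vectors(corpus)
# "cat" and "dog" share contexts (the, sat, on, chased), so their vectors
# are more similar to each other than "cat" is to "mat".
print(cosine(vecs["cat"], vecs["dog"]))
print(cosine(vecs["cat"], vecs["mat"]))
```

Word2Vec improves on this by training low-dimensional dense vectors, which generalize far better than raw counts, but the notion of similarity is the same cosine distance.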
Word embeddings in 2017: Trends and future directions
Table of contents: subword-level embeddings, OOV handling, evaluation, multi-sense embeddings, beyond words as points, phrases and multi-word expressions, bias, temporal dimension, lack of theoretical understanding, task- and domain-specific embeddings, embeddings for multiple languages, embeddings based on other contexts. The word2vec method based on skip-gram with negative... Read more
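The truncated final item refers to skip-gram with negative sampling (SGNS), the word2vec variant from Mikolov et al. (2013). For context, its training objective for a center word \(w_I\) and an observed context word \(w_O\) is:

```latex
\log \sigma\!\left({v'_{w_O}}^{\top} v_{w_I}\right)
+ \sum_{i=1}^{k} \mathbb{E}_{w_i \sim P_n(w)}
\left[ \log \sigma\!\left(-{v'_{w_i}}^{\top} v_{w_I}\right) \right]
```

where \(v_{w_I}\) is the input (center-word) vector, \(v'_{w_O}\) the output (context-word) vector, \(\sigma\) the logistic function, and \(k\) negative samples are drawn from the noise distribution \(P_n(w)\).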
See first-hand how you can bring an NLP application to life. Last week I introduced No Jitter readers to Natural Language Processing (NLP) and Facebook’s free NLP service, wit.ai. I wrote about intents and entities, and how they work together to convert human language into actionable... Read more
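The intent/entity split can be illustrated with a toy rule-based parser. This is not the wit.ai API (which learns intents and entities from labeled training examples); the patterns and intent names below are made up for illustration:

```python
import re

# Hypothetical intent patterns; a service like wit.ai learns these
# from training utterances instead of hand-written rules.
INTENT_PATTERNS = {
    "set_alarm": re.compile(r"\b(wake me|set an alarm)\b"),
    "get_weather": re.compile(r"\b(weather|forecast)\b"),
}
# A crude "time" entity: e.g. "7 am", "6:30 pm".
TIME_ENTITY = re.compile(r"\b(\d{1,2}(:\d{2})?\s?(am|pm))\b")

def parse(utterance):
    """Map an utterance to an intent plus any extracted entities."""
    text = utterance.lower()
    intent = next(
        (name for name, pat in INTENT_PATTERNS.items() if pat.search(text)),
        None,
    )
    time = TIME_ENTITY.search(text)
    return {"intent": intent, "entities": {"time": time.group(1)} if time else {}}

print(parse("Wake me at 7 am"))
# {'intent': 'set_alarm', 'entities': {'time': '7 am'}}
```

The structured result (an intent plus its entities) is what makes the utterance actionable: downstream code dispatches on the intent and fills parameters from the entities.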