Text Analysis in Excel: Real-world use cases
Last month, we launched an Excel add-in, a solution for using ParallelDots NLP APIs to do text analysis on unstructured data without writing a single line of code. The Excel add-in is very easy to use and provides a convenient yet effective solution for your text analysis needs. In an... Read more
Making a machine learning model usually takes a lot of crying, pain, feature engineering, suffering, training, debugging, validation, desperation, testing and a little bit of agony due to the infinite pain. After all that, we deploy the model and use it to make predictions for future data. We can run our little devil on a batch... Read more
Deep Learning Research Review Week 3: Natural Language Processing
This is the 3rd installment of a new series called Deep Learning Research Review. Every couple of weeks or so, I’ll be summarizing and explaining research papers in specific subfields of deep learning. This week focuses on applying deep learning to Natural Language Processing. The last post was Reinforcement Learning and the post before was... Read more
You weren’t supposed to actually implement it, Google
Last month, I wrote a blog post warning about how, if you follow popular trends in NLP, you can easily accidentally make a classifier that is pretty racist. To demonstrate this, I included the very simple code, as a “cautionary tutorial.” The post got a fair amount of reaction. Much... Read more
A decade of using text-mining for citation function classification
Academic work is typically filled with references to previous work. Unfortunately, most of these references have, at best, a tangential relevance. Thus you cannot trust that a paper that cites another actually “builds on it”. A more likely scenario is that the authors of the latest paper did not... Read more
Linked Data and Data Science
Understanding Gender Roles in Movies with Text Mining
I have a new visual essay up at The Pudding, using text mining to explore how women are portrayed in film. In April 2016, we broke down film dialogue by gender. The essay presented an imbalance in which men delivered more lines than women across 2,000 screenplays. But quantity of lines... Read more
Natural Language Processing in a Kaggle Competition for Movie Reviews
I decided to try playing around with a Kaggle competition. In this case, I entered the “When bag of words meets bags of popcorn” contest. This contest isn’t for money; it is just a way to learn about various machine learning approaches. The competition was trying to showcase Google’s Word2Vec. This essentially... Read more
Word2Vec – the world of word vectors
Have you ever wondered how a chatbot can learn about the meaning of words in a text? Does this sound interesting? Well, in this blog we will describe a very powerful method, Word2Vec, that maps words to numbers (vectors) in order to easily capture and distinguish their meaning. We will briefly describe how Word2Vec works without going... Read more
Word embeddings in 2017: Trends and future directions
Table of contents: Subword-level embeddings; OOV handling; Evaluation; Multi-sense embeddings; Beyond words as points; Phrases and multi-word expressions; Bias; Temporal dimension; Lack of theoretical understanding; Task and domain-specific embeddings; Embeddings for multiple languages; Embeddings based on other contexts. The word2vec method based on skip-gram with negative sampling (Mikolov et... Read more