Stupid word games
Today, Jeroen Ooms announced the appearance on CRAN of an R package for language detection, wrapping the “CLD2” compact language detector. Obviously, given a tool like that on a holiday long weekend, my first reaction was to try to confuse it. Two fun games to play with a language detector:... Read more
Machine Learning: An In-Depth Guide – Overview, Goals, Learning Types, and Algorithms
Articles: Overview, goals, learning types, and algorithms; Data selection, preparation, and modeling; Model evaluation, validation, complexity, and improvement; Model performance and error analysis; Unsupervised learning, related fields, and machine learning in practice. Introduction: Welcome! This is the first article of a five-part series about machine learning. Machine learning is... Read more
 This blog post is on song lyric sentiment. Feel free to fork this code from GitHub. Sentiment Analysis is one of the techniques of NLP (Natural Language Processing). It is part of NLU (Natural Language Understanding). It allows us to classify the sentiment of a text, positive or negative,... Read more
Last Saturday, in the UEFA Champions League final (think of it as Europe’s Super Bowl), Spanish giants Real Madrid defeated their Italian counterparts Juventus FC 4-1. It was a thrilling match that saw both sides stake an equal claim to victory in the first half, with Madrid eventually prevailing... Read more
Topic Modeling with LDA Introduction
Suppose you have the following set of sentences: “I eat fish and vegetables.” “Fish are pets.” “My kitten eats fish.” Latent Dirichlet allocation (LDA) is a technique that automatically discovers the topics that these documents contain. Given the above sentences, LDA might classify the red words under Topic F, which we... Read more
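To make the idea concrete, here is a minimal sketch of LDA fitted by collapsed Gibbs sampling on the three sentences above, using only the Python standard library. It is an illustration under assumptions, not the method from the post: the function name `lda_gibbs`, the hand-removed stopwords, the choice of two topics, and the values of `alpha`, `beta`, and `iterations` are all hypothetical. With a corpus this small, the topics found depend heavily on the random seed.

```python
import random
from collections import Counter

# The three example sentences, lowercased, with stopwords removed by hand
# (the stopword choice is an assumption made for this sketch).
docs = [
    ["eat", "fish", "vegetables"],
    ["fish", "pets"],
    ["kitten", "eats", "fish"],
]

def lda_gibbs(docs, num_topics=2, alpha=0.1, beta=0.01, iterations=200, seed=0):
    """Fit LDA by collapsed Gibbs sampling; return the two count tables."""
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})

    doc_topic = [[0] * num_topics for _ in docs]         # tokens per topic, per document
    topic_word = [Counter() for _ in range(num_topics)]  # word counts per topic
    topic_total = [0] * num_topics                       # total tokens per topic
    assignments = []                                     # topic of every token

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            z = rng.randrange(num_topics)
            z_doc.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assignments.append(z_doc)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                z = assignments[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1

                # Weight of topic k: (how much doc d uses k) * (how much k uses word w).
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]

                # Record the newly sampled assignment.
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1

    return doc_topic, topic_word

doc_topic, topic_word = lda_gibbs(docs)

for k, counts in enumerate(topic_word):
    words = [w for w, c in counts.most_common() if c > 0]
    print(f"Topic {k}: {words}")
for d, row in enumerate(doc_topic):
    print(f"Document {d}: topic counts {row}")
```

In practice a library implementation would be the usual choice; the loop is written out here only to show that each token is repeatedly reassigned to a topic in proportion to how much its document uses that topic and how much that topic uses the word.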
Deciphering the Neural Language Model
Recently, I have been working on the Neural Networks for Machine Learning course offered by Coursera and taught by Geoffrey Hinton. Overall, it is a nice course and provides an introduction to some of the modern topics in deep learning. However, there are instances where the student has to do... Read more
Why the most influential business AIs will look like spellcheckers (and a toy example of how to build one)
Forget voice-controlled assistants. At work, AIs will turn everybody into functional cyborgs through squishy red lines under everything you type. Let’s look at a toy example I just built (mostly to play with deep learning along the way). I chose as a data set Patrick Martinchek’s collection of Facebook... Read more
A survey of cross-lingual embedding models
In past blog posts, we discussed different models, objective functions, and hyperparameter choices that allow us to learn accurate word embeddings. However, these models are generally restricted to capturing representations of words in the language they were trained on. The availability of resources, training data, and benchmarks in English... Read more
You Must Allow Me To Tell You How Ardently I Admire and Love Natural Language Processing
It is a truth universally acknowledged that sentiment analysis is super fun, and Pride and Prejudice is probably my very favorite book in all of literature, so let’s do some Jane Austen natural language processing. Project Gutenberg makes e-texts available for many, many books, including Pride and Prejudice which... Read more
Hello all and welcome to the second of the series – NLP with NLTK. The first of the series can be found here, in case you missed it. In this article we will talk about basic NLP concepts and use NLTK to implement them. Contents: Corpus, Tokenization/Segmentation, Frequency Distribution... Read more
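As a rough illustration of the first items in that contents list, the sketch below builds a tiny corpus, tokenizes it, and counts word frequencies. NLTK offers `word_tokenize` and `FreqDist` for these steps; this version deliberately uses only the standard library, so the regex tokenizer and the `Counter` are stand-ins, and the three-sentence corpus is invented for the example.

```python
import re
from collections import Counter

# Hypothetical three-sentence corpus, invented for illustration.
corpus = [
    "The cat sat on the mat.",
    "The dog chased the cat.",
    "A dog is a loyal pet.",
]

def tokenize(text):
    """Lowercase the text and return its runs of letters and apostrophes."""
    return re.findall(r"[a-z']+", text.lower())

# Tokenization: flatten every sentence into one list of tokens.
tokens = [tok for sentence in corpus for tok in tokenize(sentence)]

# Frequency distribution: map each distinct token to its count.
freq = Counter(tokens)

for word, count in freq.most_common(3):
    print(word, count)
```

A real tokenizer also handles punctuation, contractions, and sentence boundaries, which this regex ignores.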