This blog post is on song lyric sentiment. Feel free to fork this code from GitHub. Sentiment Analysis is one of the techniques of NLP (Natural Language Processing). It is part of NLU (Natural Language Understanding). It allows us to classify the sentiment of a text,... Read more
Last Saturday, in the UEFA Champions League final (think of it as Europe’s Super Bowl), Spanish giants Real Madrid defeated their Italian counterparts Juventus FC 4-1. It was a thrilling match, that saw both sides staking an equal claim to winning the match in the first half, with... Read more
Topic Modeling with LDA Introduction
Suppose you have the following set of sentences: I eat fish and vegetables. Fish are pets. My kitten eats fish. Latent Dirichlet allocation (LDA) is a technique that automatically discovers topics that these documents contain. Given the above sentences, LDA might classify the red words under the Topic... Read more
Deciphering the Neural Language Model
Recently, I have been working on the Neural Networks for Machine Learning course offered by Coursera and taught by Geoffrey Hinton. Overall, it is a nice course and provides an introduction to some of the modern topics in deep learning. However, there are instances where the student... Read more
Why the most influential business AIs will look like spellcheckers (and a toy example of how to build one)
Forget voice-controlled assistants. At work, AIs will turn everybody into functional cyborgs through squishy red lines under everything you type. Let’s look at a toy example I just built (mostly to play with deep learning along the way). I chose as a data set Patrick Martinchek’s... Read more
A survey of cross-lingual embedding models
In past blog posts, we discussed different models, objective functions, and hyperparameter choices that allow us to learn accurate word embeddings. However, these models are generally restricted to capture representations of words in the language they were trained on. The availability of resources, training data, and... Read more
You Must Allow Me To Tell You How Ardently I Admire and Love Natural Language Processing
It is a truth universally acknowledged that sentiment analysis is super fun, and Pride and Prejudice is probably my very favorite book in all of literature, so let’s do some Jane Austen natural language processing. Project Gutenberg makes e-texts available for many, many books, including Pride... Read more
Hello all and welcome to the second of the series – NLP with NLTK. The first of the series can be found here, incase you have missed. In this article we will talk about basic NLP concepts and use NLTK to implement the concepts. Contents: Corpus... Read more
On word embeddings – Part 2: Approximating the Softmax
Table of contents: Softmax-based Approaches Hierarchical Softmax Differentiated Softmax CNN-Softmax Sampling-based Approaches Importance Sampling Adaptive Importance Sampling Target Sampling Noise Contrastive Estimation Negative Sampling Self-Normalisation Infrequent Normalisation Other Approaches Which Approach to Choose? Conclusion This is the second post in a series on word embeddings and... Read more
ftfy (fixes text for you) 4.4 and 5.0
ftfy is Luminoso’s open-source Unicode-fixing library for Python. Luminoso’s biggest open-source project is ConceptNet, but we also use this blog to provide updates on our other open-source projects. And among these projects, ftfy is certainly the most widely used. It solves a problem a lot of... Read more