Why Word Vectors Make Sense in Natural Language Processing
If you’re up-to-date with progress in natural language processing research, you’ve probably heard of word vectors in word2vec. Word2vec is a neural network configuration that ingests sentences to learn word embeddings, or vectors of continuous numbers representing individual words. The neural network accepts a word, which is first mapped to a one-hot... Read more
An Idiot’s Guide to Word2vec Natural Language Processing
Word2vec is arguably the most famous face of the neural network natural language processing revolution. Word2vec provides direct access to vector representations of words, which can help achieve decent performance across a variety of tasks machines are historically bad at. For a quick examination of how word vectors work,... Read more
Sentiment Analysis in R Made Simple
Sentiment analysis lies at the heart of natural language processing, text mining/analytics, and computational linguistics. It refers to any measurement technique by which subjective information is extracted from textual documents. In other words, it extracts the polarity of the expressed sentiment in a range spanning from positive to... Read more
This post is the first of a two-part series in which we apply NLP techniques to analyze articles about big data, data science, and AI. If you are tired of the hassles of web scraping, then this post might be just for you. I occasionally web scrape news articles from the... Read more
Multi-Task Learning Objectives for Natural Language Processing
In a previous blog post, I discussed how multi-task learning (MTL) can be used to improve the performance of a model by leveraging a related task. Multi-task learning consists of two main components: a) The architecture used for learning and b) the auxiliary task(s) that are trained jointly. Both facets... Read more
Enhancing Customer Experience with Natural Language Processing
Processing language into actionable components is the future of communication. “If you talk to a man in a language he understands, that goes to his head. If you talk to him in his language, that goes to his heart.” (Nelson Mandela) I would venture to guess that most... Read more
In my last post, I did some natural language processing and sentiment analysis for Jane Austen’s most well-known novel, Pride and Prejudice. It was just so much fun that I wanted to extend some of that work and compare across her body of writing. I decided to make an... Read more
Stupid word games
Today, Jeroen Ooms announced the appearance on CRAN of an R package for language detection, wrapping the “CLD2” compact language detector. Obviously, given a tool like that on a holiday long weekend, my first reaction was to try to confuse it. Two fun games to play with a language detector:... Read more
You Must Allow Me To Tell You How Ardently I Admire and Love Natural Language Processing
It is a truth universally acknowledged that sentiment analysis is super fun, and Pride and Prejudice is probably my very favorite book in all of literature, so let’s do some Jane Austen natural language processing. Project Gutenberg makes e-texts available for many, many books, including Pride and Prejudice which... Read more
Intro to Natural Language Processing
Table of Contents 0.0 Setup 0.1 Python and Anaconda 0.2 Libraries 0.3 Other 1.0 Background 1.1 What is NLP? 1.2 Why is NLP Important? 1.3 Why is NLP a “hard” problem? 1.4 Glossary 2.0 Sentiment Analysis 2.1 Preparing the Data 2.1.1 Training Data 2.1.2 Test Data 2.2 Building a... Read more