Text Classification in Python
This article is the first of a series in which I will cover the whole process of developing a machine learning project. This one focuses on training a supervised learning text classification model in Python. The motivation behind writing these articles is the following: as a learning data scientist who has... Read more
9 Organizations and People Leading the NLP Field
Got a keen interest in where NLP is headed? Who doesn’t? It’s one of the most exciting developments we have in AI, and it’s making waves in every industry imaginable. If you’re trying to keep up with all the advancements, we’ve got nine leaders in NLP you need to... Read more
What do Data Scientists and Decision Makers Need to Know About Google’s BERT
Any data scientist will tell you that one of the most challenging parts of natural language processing projects is the lack (or shortage) of training data. With deep learning, this has been semi-solved, but now the problem can be too much data—up to millions or even billions of training... Read more
State-of-the-Art Natural Language Understanding at Scale
For many of you in data science, natural language processing is a critical component of your projects. David Talby of Pacific.ai is here to introduce Apache Spark’s new NLP library and outline how it can facilitate your NLP pipeline for higher accuracy and faster results using the same amount... Read more
Building a Natural Language Question & Answer Search Engine
Didn’t have time to read the book for the big quiz? Why not build a system to answer the questions for you? Using the architecture pictured below, we can build out a framework that can accept natural language questions as a query and answer the question using a corpus... Read more
What Businesses Should Know About Speech Technologies
One of the top workshops at ODSC West last year (2018) was a talk by Omar Tawakol, the founder of Voicea. His company created a voice assistant that transformed meetings by handling lower-order tasks like note-taking. Cisco acquired Voicea, most likely to integrate it into Webex as part... Read more
Do Android Composers Dream of Electric Keyboards?
Editor’s Note: If you’re interested in the idea of AI with a dream of electric keyboards, see Joseph’s talk “The Soul of a New AI” at ODSC Europe 2019. My journey in AI begins with grammar. Raised in a mathematical home, I think I was discovering prime numbers when... Read more
Watch: Effective Transfer Learning for NLP
Transfer learning, the practice of applying knowledge gained on one machine learning task to aid the solution of a second task, has seen historic success in the field of computer vision. The output representations of generic image classification models trained on ImageNet have been leveraged to build models that... Read more
Watch: State of the Art Natural Language Understanding at Scale
Natural language understanding is a key component in many data science systems that must understand or reason about text. Common use cases include question answering, paraphrasing or summarization, sentiment analysis, natural language BI, language modeling, and disambiguation. Building such systems usually requires combining three types of software libraries: NLP... Read more
Watch: Understanding Unstructured Data with Language Models
As data scientists, we’ve seen a rapid improvement in recent decades in the tools available for working with structured data (be it tabular data, graph data, sensor data, etc.). Yet, the vast majority of our data (Merrill Lynch puts the figure at roughly 90%) is *unstructured*, and lives... Read more