A lot goes into NLP. Languages, dialects, unstructured data, and unique business needs all demand constant innovation from the field. Going beyond NLP platforms and skills alone, having expertise in novel processes and staying abreast of the latest research are becoming pivotal for effective NLP implementation. We looked at a number of NLP sessions coming to ODSC West 2022 this November 1st-3rd that highlight changes in the growing field and show how to perform NLP better.
1. Self-Supervised and Unsupervised Learning
Self-supervised and unsupervised learning techniques such as few-shot and zero-shot learning are changing the shape of AI research and communities. We have seen these techniques advancing multiple fields in AI such as NLP, computer vision, robotics, and more. In this talk, Chandra will give some background information on conversational AI and NLP, along with self-supervised and unsupervised techniques. He will walk the audience through hands-on examples and how they can leverage transformers and large language models for few-shot and zero-shot learning in a variety of NLP applications such as text classification, summarization, and question-answering.
Session: Self-Supervised and Unsupervised Learning for Conversational AI and NLP | Chandra Khatri | Chief Scientist and Head of AI | Got It AI
2. Building Modern Search Pipelines
In this talk, we navigate through the latest buzz around semantic search and separate the noise from the meaningful advancements. Is dense retrieval better than BM25’s keyword search? Do large language models outperform smaller transformers? How well do the models generalize to industry corpora? How can we leverage Question Answering?
We will benchmark different methods, share best practices from industry use cases, and show how you can use the open-source framework Haystack to build, test, and deploy stellar search pipelines easily yourself.
Session: Building Modern Search Pipelines with Haystack, Large Language Models, and Hybrid Retrieval | Malte Pietsch | CTO & Co-Founder | deepset
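To make the keyword-search baseline mentioned in this session concrete, here is a minimal sketch of Okapi BM25 scoring in plain Python. It is not taken from the talk and does not use Haystack's API; the function name bm25_scores, the whitespace tokenizer, the toy documents, and the parameter defaults (k1=1.5, b=0.75) are all assumptions chosen for illustration.

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each document against the query with Okapi BM25.

    Hypothetical helper for illustration: tokenization is a plain
    lowercase whitespace split, and k1/b are common default values.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(d) for d in tokenized) / n_docs
    # Document frequency: how many documents contain each term.
    df = Counter(term for d in tokenized for term in set(d))

    scores = []
    for d in tokenized:
        tf = Counter(d)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            # Smoothed inverse document frequency (always positive).
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term frequency, damped by k1 and normalized by document length.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores

# Toy corpus (made up for this example).
docs = [
    "the store has plenty of parking",
    "semantic search with dense retrieval",
    "keyword search with bm25 ranking",
]
scores = bm25_scores("bm25 keyword search", docs)
best = max(range(len(docs)), key=scores.__getitem__)
print(docs[best])
```

A dense retriever would replace the term-matching step with a comparison of learned embeddings, which is why the two approaches can disagree on queries that share meaning but not vocabulary with the relevant document.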
3. Applied NLP
Research suggests that 80-90% of data within any particular organization is unstructured, and much of this data is text. To make use of this wealth of text data, organizations have been turning to Natural Language Processing techniques. IBM’s 2021 Global AI Adoption Index showed NLP is at the forefront of AI adoption, with one in four businesses reporting adoption of this type of technology within a year. This is being enabled by a wide array of open-source NLP libraries such as spaCy and Hugging Face’s Transformers.
In this workshop, we will explore some of these popular NLP techniques that have broad applicability. From the basics of bagging and word vectors to the creation of contextualized representations of words and sentences, the workshop will equip participants with the tools they need to turn messy text data into useful insights.
Session: Bagging to BERT – A Tour of Applied NLP | Benjamin Batorsky, PhD | Senior Data Scientist | Institute for Experiential AI at Northeastern University
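As a rough illustration of the simplest end of that spectrum, the sketch below builds count-based text vectors and compares them with cosine similarity. It assumes that "bagging" in the workshop description refers to the bag-of-words representation; the helper names bag_of_words and cosine and the sample sentences are hypothetical and not drawn from the workshop materials.

```python
import math
from collections import Counter

def bag_of_words(text):
    """Represent a text as word counts (lowercase, whitespace tokens)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text is treated as similar to nothing
    return dot / (norm_a * norm_b)

# Made-up sentences for illustration.
review_a = bag_of_words("the parking lot was full")
review_b = bag_of_words("the parking lot was empty")
review_c = bag_of_words("great prices on power tools")

print(cosine(review_a, review_b))  # four of five words overlap
print(cosine(review_a, review_c))  # no shared words
```

Because this representation ignores word order and context, two texts with no words in common always score zero, even if they mean the same thing. Contextualized representations such as those produced by BERT are meant to address that limitation.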
4. NLP in eCommerce/Retail
This talk covers three examples of using NLP to solve problems in a retail e-commerce context. The NLP techniques are topic modeling and string similarity, and all the code examples use open-source Python libraries. The business contexts in which they will be discussed are identifying customer complaints from online reviews, identifying sample products, and identifying similar products. Customer complaints are identified from a corpus of Google reviews. The store operations team conducted this exercise to find out how often customers complain about problems like lack of adequate parking in large stores.
Session: Applications of NLP in Retail/E-commerce | Shoili Pal | Data Scientist | The Home Depot
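For a sense of what string similarity can look like in the similar-products use case, here is a minimal sketch using Python's standard-library difflib. This is not the speaker's method; the product names and the helpers similarity and most_similar are invented for illustration, and a production system would likely use a more robust measure.

```python
from difflib import SequenceMatcher

def similarity(a, b):
    """Return a 0-to-1 similarity ratio between two strings, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def most_similar(target, candidates):
    """Pick the candidate whose name is closest to the target string."""
    return max(candidates, key=lambda c: similarity(target, c))

# Hypothetical catalog entries, not real product data.
catalog = [
    "20V Cordless Drill Kit",
    "Garden Hose 50 ft",
    "Ceramic Floor Tile",
]
match = most_similar("cordless drill 20v", catalog)
print(match)
```

Character-level ratios like this are sensitive to word order and shared boilerplate in product titles, so token-based measures are a common alternative when titles follow inconsistent formats.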
5. Being Productive with NLP
In this workshop, you’ll walk through a complete end-to-end example of using Hugging Face Transformers, involving both Hugging Face’s open-source libraries and some of its commercial products. Starting from a dataset containing real-life product reviews from Amazon.com, you’ll train and deploy a text classification model predicting the star rating for similar reviews.
Session: Hyper-productive NLP with Hugging Face Transformers | Julien Simon | Chief Evangelist | Hugging Face
6. NLP for Language Diversity
Low-resource languages present a challenge for data-hungry approaches to machine translation, speech recognition, and other technologies that promise to open the way for universal participation in the global information society. In this talk, Steven will present a new perspective on the language technology for all (LT4All) agenda, beginning with the structure of the world’s linguistic diversity and the actual linguistic challenges on the ground. Steven will draw on experiences working in societies where there is no clear case for the popular practice of replicating human capabilities in translation or speech recognition, but where there are myriad other opportunities for language technologies.
Session: The Next Thousand Languages | Steven Bird, PhD | Professor | Charles Darwin University
Perform NLP Better with Training at ODSC West 2022
We just listed quite a few skills, platforms, topics, and frameworks. You aren’t expected to know every single thing mentioned above, but knowing a good chunk of them – and how to apply them in business settings – will help you get a job or become better at your current one. At ODSC West 2022, we have an entire track devoted to NLP. Learn NLP skills and platforms like the ones listed above!