The last several years have seen a dramatic increase in the popularity of NLP. With significant developments like BERT and GPT-3, NLP is now at the forefront of the data science industry. At ODSC East, there will be several sessions addressing the latest developments in NLP. Below are just a few of the sessions that will be featured at ODSC East.
Leonardo De Marchi | Head of Data Science and Analytics | Badoo (now MagicLab)
This course will cover NLP fundamentals, such as pre-processing techniques, tf-idf, embeddings, and more. It will also feature practical coding examples in Python that show how to apply the theory to real use cases.
Build a Question Answering System using DistilBERT in Python
Jayeeta Putatunda | Data Scientist | MediaMath
This session will cover the general concepts behind DistilBERT and how to implement it, and will also discuss how to fine-tune the base model to build an efficient question-answering model.
Art of BERT: Unlock the Full Potential of BERT for Domain-Specific Tasks (TensorFlow)
Thushan Ganegedara | Senior Data Scientist, AI&ML Instructor | QBE Insurance, DataCamp
This workshop will focus on how to improve your output from models like BERT. It will explore techniques for adapting the model to the domain-specific task at hand, using an example from the financial domain, but the methods are generalizable to any other domain.
Going from Text to Knowledge Graphs: Putting Natural Language Processing and Graph Databases to Work
Dr. Clair Sullivan | Graph Data Science Advocate | Neo4j
In this workshop, you will generate a knowledge graph using an open-source data set of text and explore the issues associated with generating such knowledge graphs, such as entity disambiguation and the lack of sufficient training data (zero-shot learning).
Narrative Extraction for Disinformation Detection
Amber Chin | Machine Learning Engineer | Novetta and Carlos Martinez | Machine Learning Engineer | Novetta
In this talk, you will examine an NLP-based method that combines open-source deep learning models (BERT, GPT-2) with topic modeling (LDA) to identify disinformation narratives in articles. You will also explore how this approach can be used outside the realm of disinformation narrative detection, specifically as a tool to analyze public responses to new products, brands, or even government policies.
Brand Voice: Deep Learning for Speech Synthesis
Francesco Cardinale | Machine Learning Research Engineer | Axel Springer and Christian Schäfer, PhD | Senior Machine Learning Engineer | Axel Springer
This session will examine how to tackle many of the issues that arise in text-to-speech, such as very long output sequences, slow inference speed, long training times (over 2 weeks), and the lack of an explicit evaluation metric that correlates with perceived audio quality.
The Healthy Approach – Organic Data Enrichment Through Entity Extraction
Julia Neagu, PhD | Director of Analytics | Tamr and Ian Bakst, PhD | DataOps Engineer | Tamr
In this session, you will explore how Named Entity Recognition algorithms can expand the accessible information in a dataset by extracting known entities from unstructured attributes, using a Recurrent Neural Net (RNN) as an example.
Advances in Conversational AI and NLP through Large Scale Language Models such as GPT-3
Chandra Khatri | Chief Scientist and Head of Conversational AI | Got It AI
In this talk, you will review some of the background in Conversational AI, NLP, and Transformer-based Large Scale Language Models such as BERT and GPT-3. You will also explore hands-on examples and how you can leverage such techniques in your own applications.