4 Easy Methods to Tokenize Your Data
Recently, I have been exploring the world of Natural Language Processing (NLP). This field sits at the intersection of Machine Learning, Linguistics, and Computer Science and deals with how computers interpret and use language. It is one of the most exciting parts of Data Science as... Read more
Overcoming the Social Biases in Natural Language Processing Systems
Editor’s note: Danushka Bollegala is a speaker for ODSC Europe 2022. Be sure to check out his talk, Social Biases in Text Representations and their Mitigation, there! How would you feel if the final decision on your job application was made by a natural language processing... Read more
Exploring Natural Language Processing: Two Ways You Can Leverage Corpus Analysis
Corpus analysis is a technique widely used by data scientists because it provides an understanding of a document collection and insights about the text. It’s an apt methodology to consider as we came upon Charles Dickens’ 210th birthday earlier this year because of how frequently passages... Read more
Using NLP to identify Adverse Drug Events (ADEs)
An adverse drug event (ADE) is defined as harm experienced by a patient as a result of exposure to a medication. A significant amount of information about drug-related safety issues such as adverse effects is published in medical case reports that usually can only be explored by... Read more
Intro to NLP: Topic Modeling and Text Categorization
Editor’s note: Sanghamitra Deb is a speaker for ODSC East 2022. Be sure to check out her talk, “Intro to NLP: Text Categorization and Topic Modeling,” there! Natural Language Processing (NLP) is the basis of machine intelligence. NLP is the process of bringing structure to free-form... Read more
DO Repeat Yourself: Designing Open-Source Libraries for Modern Machine Learning
Editor’s Note: Patrick is a speaker for ODSC East 2022 this April 19th-21st. Be sure to check out his talk, Transformers & Datasets for Research and Production, there! “Don’t repeat yourself”, or DRY, is a well-known principle of software development. The principle originates from “The Pragmatic Programmer”,... Read more
Model Overload — Which NLP Model Should I Choose?
As I’m writing this, the model library on Hugging Face consists of 11,256 models, and by the time you’re reading this, this number will only have increased. With so many models to choose from, it is no wonder that many get overwhelmed and no longer know which model... Read more
8 Ways to Perform NLP Better in 2022
A lot goes into NLP. Languages, dialects, unstructured data, and unique business needs all contribute to requiring constant innovation from the field. Going beyond NLP platforms and skills alone, having expertise in novel processes and staying abreast of the latest research are becoming pivotal for effective... Read more
The Evolution of AI Emotion and Sentiment Analysis
Artificial intelligence emotion and sentiment analysis has come a long way over the years and is on track to revolutionize the AIs of the future. Some wonder if it can ever truly understand human emotions, but computer scientists are focusing on training AI to recognize these... Read more
What Can Go Wrong When Creating Data to Enable Multilingual AI
Editor’s note: Olga is a speaker for ODSC East 2022! Be sure to check out her talk, “Creating Data to Enable Multilingual AI: What Can Go Wrong and Ways to Mitigate It,” there! Artificial intelligence (AI), and conversational AI as one of the fastest-growing sub-domains within... Read more