8 More Methods for Better Machine Learning at ODSC West

Many companies are now utilizing data science and machine learning, but there’s still a lot of room for improvement in terms of ROI. A 2021 VentureBeat analysis suggests that 87% of AI models never make it to a production environment, and an MIT Sloan Management Review article found that 70% of companies reported minimal impact from AI projects. Yet despite these difficulties, Gartner forecasts investment in artificial intelligence to reach an unprecedented $62.5 billion in 2022, an increase of 21.3% from 2021.

Nevertheless, we are still left with the question: How can we do machine learning better? To find out, we’ve gathered some of the upcoming tutorials and workshops from ODSC West 2022 and let the experts’ topics guide us toward building better machine learning.

Healthcare Applications and Real-Time Feedback

Heart rate variability biofeedback (HRV-B) is a clinically effective therapy in which patients improve their mental and physical well-being through specialized breathing techniques and real-time monitoring of their heart rate, supported by machine learning. The effectiveness of HRV-B is due to how it modulates the nervous system connections linking the brain and heart, particularly the baroreflex. HRV-B is now entering digital health and wellness; however, traditional metrics and algorithms were designed for research or in-person clinical care.

Session: Scalable, Real-Time Heart Rate Variability Biofeedback for Precision Health: A Novel Algorithmic Approach | Kirstin Aschbacher, PhD | Head of Health Data Science | Meru Health

Learning from Poisoned Data

Data poisoning is one of the main threats to AI systems. When malicious actors have even limited control over the data used to train a model, they can try to make the training process fail, prevent it from converging, skew the model, or install so-called ML backdoors: regions where the model makes incorrect decisions, usually regions of interest to the attacker. This threat is especially relevant when security technologies use anomaly detection mechanisms on top of a normality model constructed from previously seen traffic data.

Session: AI in a Minefield: Learning from Poisoned Data | Johnathan Azaria | Data Scientist Tech Lead | Imperva


Categorical Structures

Often, categorical variables possess a natural structure that is neither linear nor ordinal. The months of the year have a circular structure, while the US states have a structure that can be represented by a graph. StructureBoost uses novel techniques that allow this known structure to be exploited to yield better predictions. Recently, StructureBoost has been extended to exploit structure in the target variable (i.e., in multiclass classification) as well as in the predictor variables.
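To make the idea of categorical structure concrete, here is a minimal sketch in plain Python. It is not StructureBoost's API; the helper names `circular_graph` and `graph_distance` are hypothetical and exist only for this illustration. It represents the months as a graph in which each month is linked to its neighbors, with December wrapping around to January.

```python
from collections import deque

MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def circular_graph(labels):
    """Link each label to its two neighbors, wrapping the last back to the first."""
    n = len(labels)
    return {labels[i]: {labels[(i - 1) % n], labels[(i + 1) % n]} for i in range(n)}


def graph_distance(graph, start, goal):
    """Breadth-first search: number of edges on the shortest path from start to goal."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == goal:
            return dist
        for neighbor in graph[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    raise ValueError(f"{goal!r} is not reachable from {start!r}")


month_graph = circular_graph(MONTHS)
print(graph_distance(month_graph, "Dec", "Jan"))  # 1, whereas an ordinal encoding would say 11
print(graph_distance(month_graph, "Jan", "Jul"))  # 6
```

The same two helpers would work for the US states if the dictionary instead listed which states border each other; only the adjacency information changes.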

Session: StructureBoost: Gradient Boosting with Categorical Structure | Brian Lucena | Principal | Numeristical

Integrating Complex Business Requirements

Regardless of their concrete application, the primary goal of forecasting systems is to produce the most accurate forecast possible. However, while beating benchmarks is important, a forecast that is usable in business processes must fulfill many more criteria, which significantly increases the complexity of real-world solutions. The business requirements can vary depending on both the type of forecast and its goal.

Session: Any Way You Want It: Integrating Complex Business Requirements into ML Forecasting Systems | David Koll | Senior Data Scientist | Continental AG

Python is Still King

Python is becoming the leading programming language for data analysis. Its endless capabilities, along with its fairly simple syntax, have made it the way to go for data analysis. Python provides various libraries (also known as packages) for analyzing various types of data, with tabular data being the most common among them.

Session: Introduction to Python for Data Analysis | Leonidas Souliotis, PhD | Senior Data Scientist | AstraZeneca

Knowing the Potential of AI

Modern artificial intelligence, and deep learning in particular, is extremely capable of learning predictive models from vast amounts of data. The expectation of many AI researchers as well as the general public is that AI will go from powering customer service chatbots to providing mental health services. That it will go from personalized advertisement to deciding who is given bail. That it will go from speech recognition to writing laws. The expectation is that AI will solve society’s problems by simply being more intelligent than we are.

Session: Artificial Intelligence Can Learn from Data. But Can It Learn to Reason? | Guy Van den Broeck, PhD | Director, Associate Professor | StarAI (Statistical and Relational Artificial Intelligence Lab), UCLA

Detecting Change Over Time

Did my data change after a certain intervention? This is a common question with data observed over time. Classical statistical and engineering approaches include control charts to see if the series falls outside of the normal boundaries of expected data. A Bayesian approach to this problem calculates the probability that the data series changes at every point along the series. Bayesian change point analysis allows the analyst to evaluate a whole series and look where the highest probability of change occurred.
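As a rough illustration of the Bayesian idea, the sketch below computes the posterior probability of a change at each position in a series. It is a deliberately simplified model and not the method taught in the session: it assumes exactly one change in the mean, Gaussian noise with a known variance, a normal prior on each segment's mean, and a uniform prior over the change location. The function names and default parameter values are assumptions made for this example, and it is written in Python rather than R.

```python
import math


def log_marginal(segment, sigma2=1.0, prior_mean=0.0, prior_var=100.0):
    """Log marginal likelihood of a segment whose unknown mean has a normal prior.

    Observations are assumed to be N(mean, sigma2) with mean ~ N(prior_mean, prior_var);
    the mean is integrated out analytically.
    """
    m = len(segment)
    dev = [x - prior_mean for x in segment]
    sum_dev = sum(dev)
    sum_sq = sum(d * d for d in dev)
    quad = (sum_sq - prior_var * sum_dev ** 2 / (sigma2 + m * prior_var)) / sigma2
    return (-0.5 * m * math.log(2 * math.pi * sigma2)
            - 0.5 * math.log(1 + m * prior_var / sigma2)
            - 0.5 * quad)


def change_point_posterior(series, **prior):
    """Posterior P(change at k | data), where the first segment is series[:k].

    Assumes a single change point and a uniform prior over k = 1 .. len(series) - 1.
    """
    ks = list(range(1, len(series)))
    log_w = [log_marginal(series[:k], **prior) + log_marginal(series[k:], **prior)
             for k in ks]
    top = max(log_w)  # subtract the maximum before exponentiating for stability
    weights = [math.exp(v - top) for v in log_w]
    total = sum(weights)
    return {k: w / total for k, w in zip(ks, weights)}


# Toy series: ten points near 0, then ten points near 5.
series = [-0.1, 0.1] * 5 + [4.9, 5.1] * 5
posterior = change_point_posterior(series)
most_likely = max(posterior, key=posterior.get)
print(most_likely)  # index at which the second segment starts
```

Unlike a control chart, which flags individual points outside fixed limits, this yields a full probability distribution over where the change happened, so the analyst can see how concentrated or diffuse the evidence is.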

Session: Detecting Changes Over Time with Bayesian Change Point Analysis in R | Aric LaBarr, PhD | Associate Professor of Analytics | Institute for Advanced Analytics at NC State University

Fighting Customer Churn

Customer churn (cancellation) is the bane of all businesses with recurring revenue, and data can help businesses understand the causes of churn and take action to reduce it. But there are many misconceptions about churn, and many pitfalls in actually using data-driven insights to reduce it. Fighting churn with data starts with creating a library of customer metrics. Customer metrics are used as features for machine learning algorithms, and by themselves they can be used to define segments for data-driven, churn-reducing interventions.
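As a small, hedged illustration of what a customer metric can look like, the sketch below computes a period churn rate and a simple per-customer event count. The function names, the customer IDs, and the event data are all made up for this example; real metric libraries are typically built from subscription and event tables in a database.

```python
from collections import Counter


def churn_rate(active_at_start, active_at_end):
    """Fraction of customers active at the start of a period who are gone by its end."""
    start = set(active_at_start)
    if not start:
        raise ValueError("need at least one customer at the start of the period")
    churned = start - set(active_at_end)
    return len(churned) / len(start)


def event_counts(events):
    """Count events per customer; events is an iterable of (customer_id, event_name)."""
    return Counter(customer_id for customer_id, _ in events)


# Hypothetical data: four customers at the start, two of whom cancel; "e" is new.
start = {"a", "b", "c", "d"}
end = {"a", "b", "e"}
events = [("a", "login"), ("a", "login"), ("b", "login"), ("c", "login")]

rate = churn_rate(start, end)
logins = event_counts(events)
print(rate)         # 0.5
print(logins["a"])  # 2
```

Note that new customers such as "e" do not affect the churn rate, since it is measured only against customers who were active at the start of the period. Counts like `logins` could then serve as model features or as a basis for segmenting customers.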

Session: Fighting Churn With Data | Carl Gold, PhD | Director of Data Science | Migo

Learn Better Methods For Better Machine Learning at ODSC West 2022

To dive deeper into these topics, join us at ODSC West 2022 this November 1st to 3rd, either in-person or virtually. The conference will also feature hands-on training sessions in focus areas, such as machine learning, deep learning, MLOps and data engineering, responsible AI, and more. What’s more, you can extend your immersive training to 4 days with a Mini-Bootcamp Pass. Check out all of our types of passes here.



ODSC gathers the attendees, presenters, and companies that are shaping the present and future of data science and AI. ODSC hosts one of the largest gatherings of professional data scientists, with major conferences in the USA, Europe, and Asia.