Editor’s note: Patrick Hall is a speaker for ODSC East 2022 this April 19th-21st. Be sure to check out his talk, “A Tutorial on Machine Learning Model Governance,” there!
Data science has a science problem. A lot of data science seems much more aligned with cargo cult science, a phrase coined by Richard Feynman to describe an endeavor that has the trappings and appearance of science but is, at its core, pseudoscience or snake oil. Cargo cult science feels too familiar to me. Sure, we use computers, look at numbers, write down equations, and call ourselves scientists, but I see one anti-pattern so often that I’ve come up with a name for it: the data scientific method. Here’s how we apply the data scientific method to a project:
- Assume we’ll make millions of dollars.
- Install GPU, download Python.
- Collect inaccurate, biased data from the internet or the exhaust of some business process.
- Surrender to confirmation bias:
  - Study collected data to form a hypothesis (i.e., which X, y, and ML algorithm to use).
  - Use essentially the same data from hypothesis generation to test our hypothesis.
  - Test our hypothesis with a high-capacity learning algorithm that can fit almost any set of loosely correlated X and y well.
  - Change our hypothesis until our results are “good.”
- Don’t worry about reproducing; we’re all good, bruh.
Sound familiar? Failure is nearly assured when building products in this manner, and many projects I see are just the productization of confirmation bias. Why doesn’t anyone seem to care about this? Here’s the uncomfortable truth. It’s because many organizations don’t depend on AI for mission-critical activities. They engage in data “science” and are happy to profit from the buzz around the technology, whether it works or not. While this is faster, easier, and more fun than doing the tedious hard work of scientific research, it’s not sustainable. Broken “AI” is hurting people, and the hype Ponzi scheme will run out of steam eventually. (Does the phrase “AI Winter” ring a bell?)
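To make the confirmation-bias step concrete, here is a minimal sketch, not taken from any real project. It assumes purely random features and labels, and uses a hand-rolled 1-nearest-neighbor classifier as a stand-in for "a high-capacity learning algorithm." All names (`make_noise_data`, `predict_1nn`, `accuracy`) are hypothetical. Scored on the data it was fit to, the model looks perfect; scored on fresh data, it should do no better than a coin flip.

```python
import random

def make_noise_data(n_rows, n_features, rng):
    """Features and binary labels are drawn independently: there is nothing to learn."""
    X = [[rng.random() for _ in range(n_features)] for _ in range(n_rows)]
    y = [rng.randint(0, 1) for _ in range(n_rows)]
    return X, y

def predict_1nn(X_train, y_train, row):
    """1-nearest-neighbor: a learner with enough capacity to memorize its training set."""
    best_label, best_dist = None, float("inf")
    for x, label in zip(X_train, y_train):
        dist = sum((a - b) ** 2 for a, b in zip(x, row))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label

def accuracy(X_train, y_train, X_eval, y_eval):
    """Fraction of evaluation rows whose predicted label matches the recorded label."""
    hits = sum(
        predict_1nn(X_train, y_train, row) == label
        for row, label in zip(X_eval, y_eval)
    )
    return hits / len(y_eval)

rng = random.Random(0)  # fixed seed so the sketch is repeatable
X_train, y_train = make_noise_data(200, 5, rng)
X_fresh, y_fresh = make_noise_data(400, 5, rng)

# "Testing" on the same data used to form the hypothesis.
same_data_accuracy = accuracy(X_train, y_train, X_train, y_train)
# Testing on data the model has never seen.
fresh_data_accuracy = accuracy(X_train, y_train, X_fresh, y_fresh)

print(f"accuracy on the data we studied: {same_data_accuracy:.2f}")
print(f"accuracy on fresh data:          {fresh_data_accuracy:.2f}")
```

The gap between the two numbers is the whole point: a flexible enough model will confirm almost any hypothesis if it is tested on the data that suggested it.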
I acknowledge (and congratulate!) the organizations that have put in place substantial test, evaluation, validation, and verification (TEVV) processes for their AI systems. But TEVV is not a full application of the scientific method either. TEVV is an engineering concept applied post-hoc to correct design mistakes. The scientific method urges comprehensive design thinking through rigorous experimental design, hypothesis generation, and hypothesis testing. This is how the scientific method might look for a data science project:
- Develop a credible hunch (e.g., based on prior experiments or literature review).
- Record our hypothesis (i.e., the intended real-world outcome of our AI system).
- Collect data (e.g., using design of experiment).
- Test the hypothesis that the AI system has the intended effect (e.g., using a double-blind random construct).
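As a rough illustration of the last two steps, the sketch below randomly assigns units to a treatment group (exposed to the AI system) or a control group, then runs a permutation test on the difference in mean outcomes. Everything here is an assumption made for the example: the outcomes are simulated with a built-in effect of +1.0, the names (`permutation_p_value`, `ALPHA`) are hypothetical, and blinding is a matter of study procedure that code alone cannot show.

```python
import random

def group_mean(values):
    return sum(values) / len(values)

def permutation_p_value(treated, control, n_permutations, rng):
    """Two-sided permutation test on the difference in group means."""
    observed = abs(group_mean(treated) - group_mean(control))
    pooled = list(treated) + list(control)
    n_treated = len(treated)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # relabel units at random, as if the system had no effect
        diff = abs(group_mean(pooled[:n_treated]) - group_mean(pooled[n_treated:]))
        if diff >= observed:
            extreme += 1
    # Add-one correction keeps the estimated p-value strictly above zero.
    return (extreme + 1) / (n_permutations + 1)

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# The hypothesis and the significance level are recorded BEFORE any data is collected.
ALPHA = 0.05

# Random assignment: shuffle the unit ids, first half treated, second half control.
units = list(range(200))
rng.shuffle(units)
treated_ids, control_ids = units[:100], units[100:]

# Simulated outcomes (assumption for illustration): the system shifts the outcome by +1.0.
treated = [rng.gauss(1.0, 1.0) for _ in treated_ids]
control = [rng.gauss(0.0, 1.0) for _ in control_ids]

p_value = permutation_p_value(treated, control, 2000, rng)
print(f"p-value: {p_value:.4f}; reject the null at ALPHA={ALPHA}: {p_value < ALPHA}")
```

The design choice that matters is the ordering: the hypothesis and threshold are fixed first, the data comes from a randomized design, and the test is run once rather than repeated until the result looks "good."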
We’re not doing this today. My experience and the frequency of publicly recorded AI incidents tell me that many “AI” projects are applying the data scientific method, not the scientific method. Is it a coincidence that we’re hearing so much about AI’s fairness, accountability, and transparency issues? I doubt it. I think there’s an obvious connection between “moving fast and breaking things,” black boxes, and confirmation bias on the one hand and sociological bias incidents in AI systems on the other. Some of the most significant steps practitioners and their management could take to address these issues are to enforce strong TEVV policies and take steps toward actual scientific research.
If you’d like to learn more about these topics, I’ll be delivering a training session at ODSC East 2022, titled “A Tutorial on Machine Learning Model Governance.”
Patrick Hall is a principal scientist at BNH.AI, where he advises Fortune 500 clients on matters of AI risk and conducts research on AI risk management in support of NIST’s efforts on trustworthy AI and technical AI standards. He also serves as visiting faculty in the Department of Decision Sciences at The George Washington School of Business, teaching classes on data ethics, machine learning, and the responsible use thereof.
Prior to co-founding BNH, Patrick led H2O.ai’s efforts in responsible AI, resulting in one of the world’s first commercial solutions for explainable and fair machine learning. He also held global customer-facing roles and R&D research roles at SAS Institute. Patrick studied computational chemistry at the University of Illinois before graduating from the Institute for Advanced Analytics at North Carolina State University.
Patrick’s technical work has been profiled in Fortune, Wired, InfoWorld, TechCrunch, and others. An ardent writer himself, Patrick has contributed pieces to outlets like McKinsey.com, O’Reilly Ideas, and Thomson Reuters Regulatory Intelligence, and he is the lead author of the forthcoming book, Machine Learning for High-Risk Applications.