“F**k the Algorithm” is a bold mantra that we are likely to hear more of as the unintended consequences of AI continue to unfold. For almost a decade, our industry has been obsessed with maximizing predictive performance – the degree to which AI can approximate human-level performance on tasks.
There are mounting examples of unintended consequences – from discriminating against female job candidates to prioritizing the healthcare of white patients over that of Black patients. Human-centered AI focused on accountability, interpretability, and equity needs to take priority over gains in predictive power.
In data science, definitions are important. There still isn’t unanimous agreement on what it means to build an ethical data science practice. In my experience, ethical data science starts with fairness and values privacy, transparency, and human wellbeing over predictive power.
There are a lot of brilliant minds working on these challenges. Facebook, Google, and Microsoft, among others, have spun up “Responsible AI” teams to tackle these issues, but with limited success. Even with such initiatives, investments in ethical data science beyond these limited groups are rarely incentivized.
Frustrated by the status quo, I (naively) set out to transform our practice around the core principle of ethical data science, and I’ve come to appreciate exactly why this is so complex. While I raise more questions than answers, I hope to share insights from the lessons we learned along the way.
What is fair, anyways?
You can’t have ethical data science without first addressing fairness. Broadly, fairness means treating different individuals equally given the same set of circumstances. Two software engineers working for the same company, in the same city, with similar levels of performance and education would be expected to have the same pay regardless of gender or ethnic background.
Let’s say we’re trying to build a useful model that recommends salary levels for individuals by taking into consideration their educational backgrounds and work experience. In theory this could reduce individual biases involved in making these decisions.
Now, let’s assume we can find a historic dataset of individuals complete with educational histories, work positions, and accepted salaries. Chances are high that this dataset contains some amount of bias. Also, in many cases ‘proxy variables’ – factors like zip code, personal interests, and shopping transactions – can serve as indirect indicators of protected statuses like age, gender, and ethnicity.
To overcome biased datasets, data scientists have attempted to incorporate measures of fairness into our models. Going back to our salary example, imagine we have only two groups of people, ‘A’ and ‘B’. If 70% of the dataset belonged to group A and 30% to group B, then using our definition of fairness above, we would consider our model fair if it recommended equal salaries, on average, to individuals in group A and group B. This is called ‘statistical parity.’
It’s completely unrealistic to expect that the individuals in group A and group B will share all the same characteristics or experiences. Another definition of fairness that may be helpful here is ‘equality of opportunity’: members of different groups who meet the same criteria should experience equal outcomes. So, if 80% of group A have a college education and 50% of group B have a college education, our model would be considered fair if the salaries recommended to the college-educated members of each group are equal. It may be impossible to satisfy the criteria for both the ‘statistical parity’ and ‘equality of opportunity’ definitions of fairness at the same time.
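The tension between these two definitions can be sketched in a few lines of Python. The salary recommendations below are entirely hypothetical numbers chosen for illustration; the point is that the same set of model outputs can come close to satisfying one definition of fairness while clearly failing the other.

```python
# Toy illustration (hypothetical data): checking two fairness definitions
# on a model's recommended salaries for members of groups A and B.

# Each record: (group, has_college_degree, recommended_salary)
recommendations = [
    ("A", True, 95000), ("A", True, 90000), ("A", True, 92000),
    ("A", True, 98000), ("A", False, 70000),
    ("B", True, 93000), ("B", True, 92000), ("B", False, 68000),
    ("B", False, 65000), ("B", False, 66000),
]

def mean_salary(records):
    salaries = [salary for _, _, salary in records]
    return sum(salaries) / len(salaries)

group_a = [r for r in recommendations if r[0] == "A"]
group_b = [r for r in recommendations if r[0] == "B"]

# Statistical parity: average recommendation is equal across groups.
parity_gap = mean_salary(group_a) - mean_salary(group_b)

# Equality of opportunity: average recommendation is equal across groups
# *among those meeting the same criterion* (here, a college education).
college_a = [r for r in group_a if r[1]]
college_b = [r for r in group_b if r[1]]
opportunity_gap = mean_salary(college_a) - mean_salary(college_b)

print(f"statistical parity gap:  {parity_gap:+.0f}")   # +12200
print(f"equal opportunity gap:   {opportunity_gap:+.0f}")  # +1250
```

In this toy dataset the model is nearly fair by equality of opportunity (college-educated members of both groups are recommended similar salaries) yet far from statistical parity, because fewer members of group B hold degrees. Closing the parity gap would require violating equality of opportunity, and vice versa.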
This gets complicated quickly. In fact, this tutorial on 21 definitions of fairness demonstrates that different definitions can contradict one another and cannot all be satisfied simultaneously. The reality is that there is no one-size-fits-all definition of fair. Data scientists are tasked with making important decisions about what questions to ask and which models to use. While many are well intentioned, this means it can be difficult for someone who wants to do the right thing to know what calls to make.
When you consider that data science currently ranks lowest in diversity among tech fields, this becomes even more problematic. So, if the definition of ‘fair’ is subjective and our teams are mostly homogeneous, how do we expect them to make fair decisions, and therefore practice ethical data science?
What gets measured gets managed, and what gets managed improves
Our team lives and dies by our core values. We hire by them, evaluate by them, and celebrate them on a regular basis. We identified our values through collective soul searching, but they come down to the behavioral traits that have made members of our team successful in our growing organization. Navigate Ethics is one of our core values; it acknowledges that, sometimes, there may not be one right answer.
As a way of reinforcing and celebrating our core values daily during the pandemic, we started using an internal tool – Bonusly – where team members give each other points and shout-outs for living our core values, which can be traded in for gift cards or cash. Noticing that Navigate Ethics was consistently one of the least acknowledged gave us an opening to start the conversation.
When I asked the team why we didn’t celebrate Navigate Ethics as often as our other core values, I got variations of the following responses. And, by the way, almost every organization I’ve spoken with has echoed similar sentiments:
- There are no regular opportunities to discuss the nuances of ethical data science.
- We’re not building solutions that deal with life-or-death situations.
- I don’t feel confident enough in my ability to spot unintended consequences.
- I don’t have a strong enough grasp on equity and fairness, especially when it comes to different groups.
Most organizations have KPIs for things like on-time completion, profitability, etc. In some cases where risk is high, risk and adverse event occurrence may be included as KPIs. But how many organizations count the number of times meaningful dialogue around ethics has occurred?
As we’ve spent more and more time having these open conversations, our team has refined its capacity to understand what it means to practice ethical data science. We now feel more confident discussing these issues and recognizing when we need to reach outside of our limited circles. KPIs that incentivize conversations like these can help other organizations do the same.
Taking a page out of regulated industries
Data science is not new, but over the last 10 years most professionals have come to consider it an emerging discipline. It wasn’t until 2018 that a critical mass of higher education institutions offered formal degrees in data science. We can all agree that the rapid rate of change in data science means its advancements are light years ahead of regulation and often require redefining standards.
For those of you who work in healthcare or biomedical research, the term Institutional Review Board (IRB) is a familiar one. An IRB is a formally designated group within an organization tasked with protecting the rights and welfare of humans participating as subjects in research. Even a study as benign as a retrospective review of historical medical records must be reviewed by the IRB and designated as ‘limited to no risk.’
Although data science is still largely unregulated, there is growing consensus that regulation is necessary. For example, the FDA has recently ruled that AI-enabled software in healthcare is considered a medical device and subject to the same regulations. This is an early sign of regulation coming to the data science world.
Most data science teams are not in the business of building self-driving cars or algorithms that decide the fate of loan applications. But we are building data and AI systems that measure, influence, and support humans in various ways. What if we viewed each solution we build through the lens of its impact on humans? A standard ethical review process can empower our teams to recognize these risks and safeguard against unintended consequences.
Where do we go from here?
For our organization, addressing ethical data science needed to start with our talent pipeline. We look for candidates who understand that there may not always be one right answer and that it takes reaching outside our limited circles to find the best one. We’ve also helped several of our clients do the same. After interviewing more than 40 candidates and placing 15 in the last two years, we are seeing greater representation of female and BIPOC candidates. These values are resonating, and the richness of our conversations and perspectives is evolving.
I’ve learned that ethics is neither a checklist nor something you can write on a wall, hoping that others will follow. You cannot command it into existence, but you can facilitate, measure, and incentivize the conversations that need to take place. We must encourage our data science teams to engage external stakeholders in conversation and collaboration to ensure that ethical data science practices are grounded in diversity, equity, and inclusion. It’s only through these conversations that we will make progress.
The best that we can do right now is acknowledge that ethical data science is a process and not a checklist. While we currently have more questions than answers, this is where innovative solutions can be found. Our journey at Pandata building an ethical practice is just getting started, but I’m convinced that it starts with culture.
The breadth of content that exists about ethical data science can be overwhelming. I hope this hasn’t scared you away. The good news is that there is so much to do and everyone has an opportunity to contribute and move ethical data science forward. If you’re looking to learn a little more, I recommend starting with these guides:
- ODSC East 2021 talk, “Building an Ethical Data Science Practice“
About the author:
Cal Al-Dhubaib is a data scientist, entrepreneur, and professional speaker on AI topics. He founded Pandata on the core values of “Approachability and Ethics”. Empowering organizations to plan, build, and scale AI solutions that grow their bottom line, Pandata has overseen 80+ transformative analytics projects with leading global brands including Parker Hannifin, the Cleveland Museum of Art, FirstEnergy, and Penn State University. Cal is especially passionate about the ethics of AI and how organizations can orchestrate the right talent to support AI initiatives. Cal has been recognized as a Notable Immigrant Entrepreneur, Crain’s Cleveland 20 in their 20s, and a two-time Cleveland Smart 50 recipient. In addition to becoming the first data science graduate from Case Western Reserve University, Cal is also known for his role in advocating for careers and educational pathways in Data Science through workforce development initiatives.