After major incidents, such as the Cambridge Analytica scandal and the alleged racial bias in the COMPAS system used to assess recidivism risk in the US, the call for responsible data science and AI frameworks has grown. Books such as Weapons of Math Destruction, The Black Box Society, Automating Inequality, and Against Prediction have also helped to raise awareness of the (unintentional) adverse effects of AI and data science systems. In the last year, interest in the ethical risks of data science has peaked: the Gartner 2020 hype cycle even mentions ‘responsible AI’ as an emerging technology. Indeed, the major technology companies have published ethical guidelines on the use of data science and AI. In addition, governance bodies and governments have proposed high-level principles. The Ethics Guidelines for Trustworthy AI by the High-Level Expert Group on AI is an important and leading example in Europe. The Dutch Central Bank has also published a discussion paper containing guidelines that are specific to the financial sector.
However, operationalizing the proposed frameworks within a corporate organizational structure appears to be challenging. The high-level principles stated in these frameworks often include items such as ‘transparency’, ‘explainability’, and ‘fairness’. Such principles are, of course, very appealing. However, there is no free lunch. Imposing ‘fairness’, for example, will in general lead to AI and data science systems that are less accurate than their unrestricted versions. Many companies will not have an explicit process or policy that facilitates decisions on such trade-offs. In fact, we suspect that data science departments often make such decisions themselves, while it could be argued that organization-wide strategic decision-making is actually required.
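The accuracy cost mentioned above can be seen in a very small example. The sketch below is only an illustration under stated assumptions: the applicants, scores, and thresholds are made up, the function name `evaluate` is ours, and ‘fairness’ is read here as equal approval rates across two groups, which is just one of several possible definitions.

```python
# Hypothetical toy data: (group, model score, repaid) per applicant.
# The scores are deliberately perfectly informative, so any loss in
# accuracy below is caused by the fairness constraint alone.
applicants = [
    ("A", 0.9, 1), ("A", 0.8, 1), ("A", 0.7, 1), ("A", 0.2, 0),
    ("B", 0.9, 1), ("B", 0.4, 0), ("B", 0.3, 0), ("B", 0.1, 0),
]


def evaluate(thresholds):
    """Return (accuracy, approval rate per group) for per-group thresholds."""
    correct = 0
    approved = {"A": 0, "B": 0}
    total = {"A": 0, "B": 0}
    for group, score, repaid in applicants:
        decision = int(score >= thresholds[group])  # 1 = approve the loan
        correct += int(decision == repaid)
        approved[group] += decision
        total[group] += 1
    rates = {g: approved[g] / total[g] for g in total}
    return correct / len(applicants), rates


# Unrestricted model: the same threshold for everyone.
acc_free, rates_free = evaluate({"A": 0.5, "B": 0.5})

# Constrained model: group B's threshold is lowered until both groups
# have the same approval rate.
acc_fair, rates_fair = evaluate({"A": 0.5, "B": 0.25})

print("unrestricted:", acc_free, rates_free)
print("equal approval rates:", acc_fair, rates_fair)
```

In this toy setting the unrestricted model classifies every applicant correctly but approves 75% of group A and 25% of group B. Equalizing the approval rates brings accuracy down to 75%. How large the cost is on real data, and whether it is acceptable, is exactly the kind of trade-off that calls for an explicit policy.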
Zooming in on the ‘fairness’ principle, we see that implementing it in a data science workflow is, in fact, rather subtle. ‘Thou shalt not discriminate’ is a popular saying. However, discriminating between customers is often the main purpose of a data science system! Take loan applications as an example. On the basis of the applicant's data, models assess creditworthiness. Applications with a low assessed creditworthiness are rejected, or the applicant has to pay an additional risk premium. The question arises which variables may be used in models and how we should quantify the (un)fairness of a system. The academic literature has proposed several (mathematical) definitions, which are in general incompatible: if you satisfy one measure of fairness, you will typically violate the others. As the anti-discrimination laws (in Europe) are not very explicit, it is very complicated to ensure that your data science systems fully comply with them. Within a company, the question arises which department should be accountable and responsible for translating the law into practice. In addition, which departments have the required competencies? If one wants to incorporate additional ethical constraints, the situation becomes even more complex. What methods and processes can we use to identify whether there are groups that need to be ‘protected’ against discrimination? How do we ensure company-wide consistency in choices? Which departments should be involved in this process?
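To make the incompatibility of fairness definitions concrete, the following minimal sketch compares three commonly used criteria on made-up loan decisions: demographic parity (equal approval rates), equal opportunity (equal true-positive rates), and predictive parity (equal precision). The data and the helper names `rate`, `group_metrics`, and `gaps` are hypothetical and serve only as an illustration; they are not taken from any particular package.

```python
def rate(values):
    """Fraction of ones in a list of 0/1 values (0.0 for an empty list)."""
    return sum(values) / len(values) if values else 0.0


def group_metrics(y_true, y_pred):
    """Approval rate, true-positive rate and precision for one group."""
    return {
        # Share of applicants that is approved.
        "selection_rate": rate(y_pred),
        # Share of actual repayers that is approved.
        "tpr": rate([p for t, p in zip(y_true, y_pred) if t == 1]),
        # Share of approved applicants that actually repays.
        "precision": rate([t for t, p in zip(y_true, y_pred) if p == 1]),
    }


# Hypothetical outcomes (1 = repaid); the groups have different base rates.
y_true = {"A": [1, 1, 1, 0], "B": [1, 0, 0, 0]}

# Two hypothetical sets of decisions (1 = approved).
perfect = {"A": [1, 1, 1, 0], "B": [1, 0, 0, 0]}  # approves exactly the repayers
parity = {"A": [1, 1, 1, 0], "B": [1, 1, 1, 0]}   # equal approval rates


def gaps(decisions):
    """Absolute difference between groups A and B for each metric."""
    a = group_metrics(y_true["A"], decisions["A"])
    b = group_metrics(y_true["B"], decisions["B"])
    return {name: abs(a[name] - b[name]) for name in a}


gaps_perfect = gaps(perfect)
gaps_parity = gaps(parity)

print("perfect predictions:", gaps_perfect)
print("equal approval rates:", gaps_parity)
```

With these numbers, the perfect predictions show no gap in true-positive rate or precision but a gap of 0.5 in approval rates. Forcing equal approval rates closes that gap and opens a gap of two thirds in precision. The conflict is driven by the different repayment rates in the two groups, which is why a choice between definitions cannot be avoided in this example.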
As a financial institution, de Volksbank is familiar with governance frameworks for quantitative modeling. Building on this knowledge, and in line with the social mission of de Volksbank, we have developed the first version of a framework for the responsible use of data science and AI by the data science departments of de Volksbank. In our talk at ODSC Europe, “Responsible Data Science Using Bias-Dashboards,” we will discuss this framework. The talk consists of three parts. First, we will discuss some examples that demonstrate the need for responsible data science frameworks and provide an overview of proposed frameworks. Second, as fairness is one of the most important principles and is rather difficult to implement, we take a deep dive into fairness. We discuss the most popular definitions of fairness and their interrelations. This discussion is accompanied by an overview and demo of selected open-source (Python) packages for detecting unfairness (i.e. ‘bias dashboarding’). We will also briefly discuss methods that have been developed to obtain ‘fairness by design’ or to repair a biased model. Third, we focus on operationalization aspects. The question of operationalization, we argue, raises several further questions. For one, how should certain principles and values be interpreted and defined in a data science context? Who decides what the correct interpretation of, for example, ‘fairness’ is and, given that fairness definitions can be mutually exclusive, how should fairness be quantified? We look forward to discussing these questions with you at ODSC Europe.
Ramon van den Akker
risk modeling and AI & data science specialist, de Volksbank
associate professor, dept. Econometrics, Tilburg University
AI & data science specialist, de Volksbank
AI & ethics specialist, de Volksbank
PhD candidate, School of Philosophy, Erasmus University Rotterdam