In October 2017, we published an article on how legal Artificial Intelligence systems had turned out to be as biased as we are. One of the cases that had made headlines was the COMPAS system, risk assessment software used to predict the likelihood of somebody being a repeat offender. It turned out the system had a double racial bias: one in favour of white defendants, and one against black defendants.
To this day, the problems persist, and by now other cases have come to light. Similar to the problems with the COMPAS system, algorithms used in Kentucky for cash bail applications consistently favoured white defendants. The situation is similar in the UK, where a committee concluded that bias and inaccuracy render artificial intelligence (AI) algorithmic criminal justice tools unsuitable for assessing risk when deciding whether to imprison people or release them. Algorithmic bias was also discovered in systems used to rank teachers, and in natural language processing, where researchers found a racial bias with regard to hate speech, as well as a gender bias in general.
To research and address the problems with Artificial Intelligence, the ‘AI Now Institute’ was created. Bias is one of the four areas they specifically focus on. They found that bias may exist in all sorts of services and products. A key challenge we face in addressing the problems is that “crucial stakeholders, including the companies that develop and apply machine learning systems and government regulators, show little interest in monitoring and limiting algorithmic bias. Financial and technology companies use all sorts of mathematical models and aren’t transparent about how they operate.”
So, what is algorithmic bias? Wikipedia defines it as “systematic and repeatable errors in a computer system that create unfair outcomes, such as privileging one arbitrary group of users over others. Bias can emerge due to many factors, including but not limited to the design of the algorithm or the unintended or unanticipated use or decisions relating to the way data is coded, collected, selected or used to train the algorithm.”
The AI Now Institute clarifies that artificial intelligence systems learn from data sets, and that those data sets reflect the social, historical and political conditions in which they were created. As such, they reflect existing biases.
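To make that concrete, here is a minimal sketch in Python, with invented, hypothetical data, of how a system that learns only from historical decisions ends up reproducing whatever bias those decisions contained:

```python
from collections import defaultdict

# Hypothetical historical hiring decisions. The outcome reflects past
# human bias, not the candidates' actual qualifications.
history = [
    ("male", "hire"), ("male", "hire"), ("male", "hire"), ("male", "reject"),
    ("female", "reject"), ("female", "reject"), ("female", "hire"),
]

# A deliberately trivial "model": remember the past outcomes per group...
outcomes = defaultdict(list)
for group, decision in history:
    outcomes[group].append(decision)

# ...and predict the most common past outcome for that group.
def predict(group):
    past = outcomes[group]
    return max(set(past), key=past.count)

print(predict("male"))    # the majority outcome for men in the data
print(predict("female"))  # the majority outcome for women in the data
```

Nothing in this toy data set says anything about qualifications, yet the model systematically favours one group, simply because the historical record it learned from did.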
It may be useful to make a distinction between different types of algorithmic bias. Eight different types have been identified thus far:
- Sample bias is the most common form of bias. It is when the samples used for the data sets are themselves contaminated with existing biases. The examples given above are all cases of sample bias.
- Prejudice bias is one of the causes of sample bias. Prejudice occurs as a result of cultural stereotypes in the people involved in the process. A good example of this is the New York Police Department’s stop-and-frisk practices. In approximately 83 percent of the cases, the person who was stopped was either African American or Hispanic, even though both groups combined make up just over half of the population. An AI system that learns from a data set like that will inherit the human racial bias that considers people more likely to be suspicious if they’re African American or Hispanic. So, because of prejudice, factors like social class, race, nationality, religion, and gender can creep into the model and completely skew the results.
- Confirmation bias is another possible cause of sample bias. Confirmation bias is the tendency to give preference to information that confirms one’s existing beliefs. If AI systems are used to confirm certain hypotheses, the people selecting the data may – even subconsciously – be inclined to select the data in favour of the hypothesis they’re trying to prove.
- Group Attribution Bias is the type of bias where the data set contains an asymmetric view of a certain group. An example of this was Amazon’s AI assistant for its Human Resources department. Because Amazon had far more male engineers than female engineers, the system concluded that male engineers had to be given preference over female engineers.
- The Square Peg Bias has to do with selecting a data set that is not representative and is chosen because it just happens to be available. It is also known as the availability bias.
- The Bias-variance Trade-off. This is a bias that is introduced into the system by mathematically over-correcting for variance. (An example to clarify: say you have a data set where 30% of the people involved are female, so females are effectively underrepresented. To compensate, you use mathematical formulas to ‘correct’ the results.) This mathematical correction can introduce new biases, especially in more complex data sets, where the corrections could lead to missing certain complexities.
- Measurement Bias has to do with technical flaws that contaminate the data set. Say you want to weigh people and use a set of scales, but the scales are miscalibrated: every measurement in the data set will be systematically off.
- Stereotype Bias. The example given above with Amazon also qualifies as a gender stereotype bias. There are more male engineers than female engineers. That may lead systems to favour male engineers, and/or to preserve the ratio existing in the data set.
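The ‘mathematical correction’ mentioned under the bias-variance trade-off above can be sketched as follows. This is a hedged illustration in Python with invented numbers: inverse-frequency weights that compensate for a 70%/30% split. The groups end up contributing equally in aggregate, but uniformly up-weighting a small group also amplifies any noise or quirks in its few samples, which is how the over-correction can introduce new bias:

```python
from collections import Counter

def inverse_frequency_weights(labels):
    """One weight per sample, inversely proportional to how often
    that sample's group appears in the data set."""
    counts = Counter(labels)
    total = len(labels)
    return [total / (len(counts) * counts[g]) for g in labels]

# Hypothetical data set: 70% male, 30% female, as in the example above.
groups = ["male"] * 7 + ["female"] * 3
weights = inverse_frequency_weights(groups)

# After weighting, both groups contribute the same total weight (5.0 each),
# even though one group has more than twice as many samples as the other.
male_total = sum(w for g, w in zip(groups, weights) if g == "male")
female_total = sum(w for g, w in zip(groups, weights) if g == "female")
```

With only three female samples, each now carries well over twice the weight of a male sample, so any measurement error in those three records is magnified accordingly.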
The good news is that as we are getting better at understanding and identifying the different types of algorithmic bias, we also are getting better at finding solutions to counteract them.