
Legal AI is still biased in 2019

In October 2017, we published an article on how legal Artificial Intelligence systems had turned out to be as biased as we are. One of the cases that had made headlines was the COMPAS system, risk assessment software that is used to predict the likelihood of somebody being a repeat offender. It turned out the system had a double racial bias, one in favour of white defendants, and one against black defendants.

To this day, the problems persist, and by now other cases have come to light. Similar to the problems with the COMPAS system, algorithms used in Kentucky for cash bail applications consistently preferred white defendants. The situation is similar in the UK, where a committee concluded that bias and inaccuracy render artificial intelligence (AI) algorithmic criminal justice tools unsuitable for assessing risk when making decisions on whether to imprison people or release them. Algorithmic bias was also discovered in systems used to rank teachers, and in natural language processing, where researchers found a racial bias in hate speech detection as well as a more general gender bias.

To research and address the problems with Artificial Intelligence, the ‘AI Now Institute’ was created. Bias is one of the four areas they specifically focus on. They found that bias may exist in all sorts of services and products. A key challenge we face in addressing the problems is that “crucial stakeholders, including the companies that develop and apply machine learning systems and government regulators, show little interest in monitoring and limiting algorithmic bias. Financial and technology companies use all sorts of mathematical models and aren’t transparent about how they operate.”

So, what is algorithmic bias? Wikipedia defines it as “systematic and repeatable errors in a computer system that create unfair outcomes, such as privileging one arbitrary group of users over others. Bias can emerge due to many factors, including but not limited to the design of the algorithm or the unintended or unanticipated use or decisions relating to the way data is coded, collected, selected or used to train the algorithm.”

The AI Now Institute clarifies that artificial intelligence systems learn from data sets, and that those data sets reflect the social, historical and political conditions in which they were created. As such, they reflect existing biases.

It may be useful to make a distinction between different types of algorithmic bias. Eight different types have been identified thus far:

  1. Sample bias is the most common form of bias. It occurs when the samples used for the data sets are themselves contaminated with existing biases. The examples given above are all cases of sample bias.
  2. Prejudice bias is one of the causes of sample bias. Prejudice occurs as a result of cultural stereotypes in the people involved in the process. A good example of this is the New York Police Department’s stop and frisk practices. In approximately 83 percent of the cases, the person who was stopped was either African American or Hispanic, while both groups combined make up only just over half of the population. An AI system that learns from a data set like that will inherit the human racial bias that treats people as more likely to be suspicious if they are African American or Hispanic. So, because of prejudice, factors like social class, race, nationality, religion, and gender can creep into the model and completely skew the results.
  3. Confirmation bias is another possible cause of sample bias. Confirmation bias is the tendency to give preference to information that confirms one’s existing beliefs. If AI systems are used to confirm certain hypotheses, the people selecting the data may – even subconsciously – be inclined to select data that supports the hypothesis they’re trying to prove.
  4. Group Attribution Bias is the type of bias where the data set contains an asymmetric view of a certain group. An example of this was Amazon’s AI assistant for the Human Resources department. Because Amazon had far more male engineers working for it than female engineers, the system concluded that male engineers had to be given preference over female engineers.
  5. The Square Peg Bias has to do with selecting a data set that is not representative and is chosen because it just happens to be available. It is also known as the availability bias.
  6. The Bias-variance Trade-off. This is a bias that is introduced into the system by mathematically over-correcting for variance. (An example to clarify: say you have a data set where only 30% of the people involved are female, so females are effectively underrepresented in your data set. To compensate, you use mathematical formulas to ‘correct’ the results; see the sketch after this list.) This mathematical correction can introduce new biases, especially in more complex data sets, where the corrections could lead to missing certain complexities.
  7. Measurement Bias has to do with technical flaws that contaminate the data set. Say you want to weigh people and use scales, but they’re not measuring correctly.
  8. Stereotype Bias. The example given above with Amazon also qualifies as a gender stereotype bias. There are more male engineers than female engineers. That may lead systems to favour male engineers, and/or to preserve the ratio existing in the data set.
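
To make the re-weighting example under point 6 more concrete, here is a minimal Python sketch of one common way such a correction is done: each group gets an inverse-frequency weight so that the underrepresented group counts as much as the majority. The 70/30 split and the group labels are purely illustrative assumptions, not data from any of the systems discussed above.

```python
# Minimal sketch of inverse-frequency re-weighting (illustrative numbers only).
from collections import Counter

samples = ["male"] * 70 + ["female"] * 30   # 70/30 split, as in the example above

counts = Counter(samples)
total = len(samples)

# weight(group) = total / (number_of_groups * count(group))
weights = {group: total / (len(counts) * count) for group, count in counts.items()}
print(weights)  # {'male': 0.71..., 'female': 1.66...}

# Each female sample now counts roughly 1.67x and each male sample roughly 0.71x,
# so both groups contribute equally in aggregate. The caveat from point 6: such a
# blanket correction can itself distort more complex data sets, for instance when
# the minority samples are not representative of the minority group as a whole.
```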

The good news is that as we are getting better at understanding and identifying the different types of algorithmic bias, we also are getting better at finding solutions to counteract them.

 


 

Legal AI and Bias

Justice is blind, but legal AI may be biased.

Like many advanced technologies, artificial intelligence (AI) comes with its advantages and disadvantages. Some of the potentially negative aspects of AI regularly make headlines. There is a fear that humans could be replaced by AI, and that AI might take our jobs. (As pointed out in a previous article, lawyers are less at risk of such a scenario: AI would perform certain tasks, but not take jobs, as only 23% of the work lawyers do can be automated at present). Others, like Elon Musk, predict doomsday scenarios if we start using AI in weapons or warfare. And there could indeed be a problem there: what if armed robotic soldiers are hacked, or have bad code and go rogue? Some predict that superintelligence (where AI systems become vastly more intelligent than human beings) and the singularity (i.e. the moment when AI systems become self-aware) are inevitable. The combination of both would lead to humans being the inferior species, and possibly being wiped out.

John Giannandrea, who leads AI at Google, does not believe these are the real problems with AI. He sees another problem, and it happens to be one that is very relevant to lawyers. He is worried about intelligent systems learning human prejudices. “The real safety question, if you want to call it that, is that if we give these systems biased data, they will be biased,” Giannandrea said.

The case that comes to mind is COMPAS, risk assessment software that is used to predict the likelihood of somebody being a repeat offender. It is often used in criminal cases in the US by judges and parole boards. ProPublica, a Pulitzer Prize-winning non-profit news organization, decided to analyse how accurate COMPAS was in its predictions. They discovered that COMPAS’ algorithms correctly predicted recidivism for black and white defendants at roughly the same rate. But when the algorithms were wrong, they were wrong in different ways for each race. African American defendants were almost twice as likely to be labelled higher risk when they did not actually re-offend. For Caucasian defendants the opposite mistake was made: they were more likely to be labelled lower risk by the software, while in reality they did re-offend. In other words, ProPublica discovered a double bias in COMPAS, one in favour of white defendants and one against black defendants. (Note that the maker of COMPAS disputes those findings and argues the data were misinterpreted.)
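
To illustrate the kind of analysis behind these findings, the sketch below (with entirely made-up numbers, not ProPublica’s data) shows how two groups can have the same overall accuracy while the errors fall in opposite directions: one group is disproportionately labelled high risk when it does not re-offend, the other disproportionately labelled low risk when it does.

```python
# Per-group error analysis with invented data: same overall accuracy (65% for
# both groups here), but very different error profiles.

def error_rates(records):
    """records: list of (predicted_high_risk, reoffended) boolean pairs."""
    fp = sum(1 for pred, actual in records if pred and not actual)
    fn = sum(1 for pred, actual in records if not pred and actual)
    did_not_reoffend = sum(1 for _, actual in records if not actual)
    did_reoffend = sum(1 for _, actual in records if actual)
    # (false positive rate, false negative rate)
    return fp / did_not_reoffend, fn / did_reoffend

# Hypothetical outcomes: group A is more often wrongly labelled high risk,
# group B is more often wrongly labelled low risk.
group_a = [(True, False)] * 40 + [(False, False)] * 60 + [(True, True)] * 70 + [(False, True)] * 30
group_b = [(True, False)] * 20 + [(False, False)] * 80 + [(True, True)] * 50 + [(False, True)] * 50

for name, records in [("group A", group_a), ("group B", group_b)]:
    fpr, fnr = error_rates(records)
    print(f"{name}: wrongly labelled high risk {fpr:.0%}, wrongly labelled low risk {fnr:.0%}")
```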

The problem of bias in AI is real. AI is being used in more and more industries, like housing, education, employment, medicine and law. Some experts are warning that algorithmic bias is already pervasive in many industries, and that almost no one is making an effort to identify or correct it. “It’s important that we be transparent about the training data that we are using, and are looking for hidden biases in it, otherwise we are building biased systems,” Giannandrea added.

Giannandrea correctly points out that the underlying problem is a lack of transparency in the algorithms that are being used. “Many of the most powerful emerging machine-learning techniques are so complex and opaque in their workings that they defy careful examination.”

Apart from all the ethical implications, the fact that it is unclear how the algorithms come to a specific conclusion could have legal implications. The U.S. Supreme Court might soon take up the case of a Wisconsin convict who claims his right to due process was violated when the judge who sentenced him consulted COMPAS. The argument used by the defence is that the workings of the system were opaque to the defendant, making it impossible to know what the defence had to argue against.

To address these problems, a new institute, the AI Now Institute (ainowinstitute.org), was founded. It produces interdisciplinary research on the social implications of artificial intelligence and acts as a hub for the emerging field focused on these issues. Their main mission consists of “Researching the social implications of artificial intelligence now to ensure a more equitable future.” They want to make sure that AI systems are sensitive and responsive to the complex social domains in which they are applied. To that end, we will need to develop new ways to measure, audit, analyse, and improve them.
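
As a very rough illustration of what such measuring and auditing could look like in practice, the sketch below computes the rate of favourable outcomes per group and flags any group that falls well below the best-scoring one. The 0.8 threshold echoes the “four-fifths rule” sometimes used in US employment-discrimination audits; it is an assumption here, not something prescribed by the AI Now Institute, and the decision data is invented.

```python
# Minimal audit sketch: compare favourable-outcome rates per group (invented data).

def selection_rates(decisions_by_group):
    """decisions_by_group: dict mapping group name -> list of 0/1 decisions (1 = favourable)."""
    return {group: sum(d) / len(d) for group, d in decisions_by_group.items()}

def audit(decisions_by_group, threshold=0.8):
    """Return groups whose favourable-outcome rate falls below threshold * the best group's rate."""
    rates = selection_rates(decisions_by_group)
    best = max(rates.values())
    return {group: rate / best for group, rate in rates.items() if rate < threshold * best}

# Hypothetical bail decisions (1 = released on bail).
decisions = {
    "group A": [1, 1, 1, 0, 1, 1, 0, 1],   # 75% favourable
    "group B": [1, 0, 0, 1, 0, 0, 1, 0],   # 37.5% favourable
}
print(audit(decisions))  # {'group B': 0.5} -> group B is flagged for review
```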
