Algorithms Are Fraught with Bias. Is There a Fix?

November 19, 2018

Data Analyst at the Johns Hopkins University Center for Government Excellence

A display shows a facial recognition system for law enforcement during a technology conference in Washington, D.C.

Photo: Saul Loeb/AFP/Getty Images

Today, algorithms are used in a range of fields for a multitude of tasks. Through a series of equations, they predict outcomes, attempt to understand patterns, detect anomalies, and so on. Algorithms recommend programs to us on television, choose which advertisements we see, and select the content that appears in our social media feeds. They are used to make health care decisions for us, they decide whom to hire and whom to reject based on the words in our job applications, and they accept or reject our mortgage applications.

Algorithms are also deployed by our government: They have been used to “optimize” public food assistance, detect faces in the street, predict future criminals, and decide who should remain in prison. These applications are the most pressing, and they call on us to reevaluate what fairness and justice might mean in the era of the algorithm. Government use of AI has affected, and will continue to affect, millions of people, and the number grows each day.

Governments and adjacent organizations naturally want to do more with smaller pools of resources: to save time, work more efficiently, and keep pace with the culture at large. For all of these aims, algorithms may seem like the perfect solution. However, for all the simplicity of implementing an algorithm, what is often omitted is a pause for reflection, skepticism, and iteration.

What if your algorithm doesn’t perform accurately in the real world because the conditions and assumptions underlying its creation are different from the conditions of the world into which it was released? What if there are unforeseen consequences? What if your AI product isn’t perfect? Well, I have news for you: It isn’t.

Your algorithms are not perfect, and that’s the truth. They should be better. And, thankfully, they can be.

All Algorithms Have Bias

I am a member of the team that helped to create the Ethics & Algorithms Toolkit, a plug-and-play tool that helps guide practitioners through conversations on algorithm use. When readers open the tool kit, they are met with our first foundational assumption, on which the rest of the tool is based: All data have bias, all people have bias, and therefore all algorithms have bias.

“Garbage in, garbage out,” goes the old computer science maxim. In the case of algorithms, “bias in, bias out.” What any predictive analysis does is hold a mirror to the past. Analyses that use data touching on race are a particularly illustrative example: A racially unequal past is likely to produce racially unequal outputs. If the thing we want to predict happened more frequently to certain types of people than to others in the past, predictive analysis will project it more frequently for those people in the future.
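To make the mirror idea concrete, here is a minimal sketch in Python. It is not drawn from the tool kit or from any real system: the groups, the counts, and the function name fit_group_rates are all hypothetical, and the "model" is the simplest one imaginable, a predictor that reports each group's historical frequency. Under those assumptions, whatever disparity sits in the records comes straight back out as the prediction.

```python
from collections import defaultdict


def fit_group_rates(records):
    """Return the historical outcome rate for each group.

    records: iterable of (group, outcome) pairs, where outcome is 0 or 1.
    This is a deliberately naive predictor: it projects the past forward.
    """
    counts = defaultdict(lambda: [0, 0])  # group -> [positives, total]
    for group, outcome in records:
        counts[group][0] += outcome
        counts[group][1] += 1
    return {g: pos / total for g, (pos, total) in counts.items()}


# Hypothetical historical records (made-up numbers for illustration only):
# the outcome was recorded for 30 of 100 people in group A
# and for 10 of 100 people in group B.
history = (
    [("A", 1)] * 30 + [("A", 0)] * 70
    + [("B", 1)] * 10 + [("B", 0)] * 90
)

predicted = fit_group_rates(history)
print(predicted)
```

The predictor has no notion of why the historical rates differ. It only reflects them, which is the sense in which bias in the data becomes bias in the output.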

Still, neither distorting the predictive mirror—by using some complex intra-algorithm weighting mechanism, for example—nor tossing it out altogether is the correct path forward. “If the image in the predictive mirror is jarring, the answer is not to just bend it to our liking,” writes Sandra G. Mayson of the University of Georgia School of Law. “To reject algorithms in favor of judicial risk assessment is to discard the precise mirror for the cloudy one. It does not eliminate disparity, it merely turns a blind eye.”

The Data Must Be Accurate

Consider an algorithm built to predict future crime. Unless we know the actual offending rates (meaning all crimes committed, not just the ones that we know about), configuring data or an algorithm to reflect a statistical scenario that we prefer only distorts the prediction further, so that it reflects neither the data nor any demonstrable reality. Limiting input data does not eliminate disparity; it cripples the predictive tool.
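The gap between crimes committed and crimes we know about can be illustrated with a small sketch. Every number and name below is an assumption chosen for illustration, not a real statistic: the sketch supposes that a recorded rate is simply the true rate multiplied by the chance that an offense gets recorded at all.

```python
def observed_rate(true_rate, detection_rate):
    """Share of a group that shows up in the records as having offended.

    Assumes a simple model: recorded rate = true rate x chance of being recorded.
    """
    return true_rate * detection_rate


# Hypothetical scenario: both groups offend at the same true rate,
# but offenses in group A are twice as likely to end up in the records.
true_rate = 0.10
detection = {"A": 0.50, "B": 0.25}

observed = {g: observed_rate(true_rate, d) for g, d in detection.items()}
ratio = observed["A"] / observed["B"]
print(observed, ratio)

# The same recorded rate is also consistent with a very different reality:
# a lower true rate combined with complete recording.
alternative = observed_rate(0.05, 1.0)
print(alternative == observed["A"])
```

In this toy scenario the records show group A at twice the rate of group B even though the true rates are identical, and the recorded figure alone cannot tell us which underlying reality produced it. That is why adjusting the data toward a preferred picture, without knowing the actual rates, amounts to guessing.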

So, what do we do? In the case of the example above, we either use data that fairly represents incidents, or there is no basis for predicting those events at all. But more broadly, we must be active rather than passive and tackle algorithms and their risks early on. Rather than wait for the next harmful outcome to inevitably arise, we should assume that one is coming, and soon, and work around the clock to control for it. If we do this, we will be headed in the right direction.

Above all else, an ethical conversation on algorithms must be inclusive. Fortunately, holding such a conversation might now be easier than ever.

Different Types of Bias

In building the tool kit, our team had some difficult conversations: whether to define older data as being historically biased; whether global benefit outweighs the impact of individual harm; whether using algorithms that are difficult or impossible to interpret is unethical; and whether intentional and unintentional outcomes should be weighted the same.

One of the most pressing of these conversations was that of bias and how to define it. Bias is a strong word—potentially a deterrent. No one wants to admit that their data might be biased. At first, we weren’t even sure if we should use the word.

Ultimately, during the tool kit’s construction, we chose to split bias into two categories: that which is technical, and that which is deeply and historically rooted in social circumstances. Technical bias concerns data accuracy and representativeness; historical bias concerns data that is shaped by or entrenched in issues of discrimination, legacy, unfair policy, and so on. We split historical bias out from technical bias so that readers of the tool kit would be forced to stop and carefully consider the historical nature of their data.

Is it even necessary to fork bias into multiple types? Our answer was yes. We wanted to steer the conversation in that direction; had we packaged bias any differently, readers might not engage with that conversation—and they should.

Adopt a New Norm

We must call on one another to produce and adopt transparent, responsible, and accountable algorithms. To do this, I would encourage you to start by inciting conversation on equity and ethics in algorithms. Open up the processes of creation, maintenance, and implementation for these algorithms so that the conversations are iterative, thoughtful, inclusive, broad, and held by stakeholders of all classes, colors, and motivations.

When working on a potentially life-changing algorithm, err on the side of caution, open the conversation to a broader set of perspectives, and evaluate work as a group. No more shirking responsibility. We are all responsible for this conversation.

Miriam McKinney
