How Bias Influences Outcomes
Algorithmic bias is a popular topic, though bias comes in many forms. Understanding those forms and taking appropriate action helps minimize their impact.
As AI continues to burrow its way deeper into enterprises, there’s concern about algorithmic bias, particularly as it relates to fairness. However, algorithmic bias doesn’t just happen on its own.
“There are [three] questions a business has to ask when they’re building a product or service: Which decisions matter more? Where would we rather be wrong? With whom would we rather be wrong?” says Rayid Ghani, professor of AI at Carnegie Mellon University.
Why not just eliminate bias altogether and be done with it? Good luck, because humans are the root cause of bias.
“Step one is thinking about what is the unwanted bias? Step two is how do I measure it? Step three is audit. Before I fix it, I want to figure out where it came from because if I don’t know where it came from, I don’t know how to fix it,” says Ghani. “It’s not data, it’s not the algorithm, it’s the human process generating data.”
Popular Forms of Bias in Business
Two of the most common types of bias in business are confirmation bias and selection bias.
With confirmation bias, a person wants the data to support or confirm their point of view. Examples include CEOs rejecting analytical outcomes and telling data scientists what the data must say instead, or analysts “torturing the data” to get the outcome they want.
Selection bias is sometimes called “cherry-picking,” though it is not always a conscious act. The broader issue is missing data, which can skew outcomes.
“I spend a lot of time thinking about what’s not measured. I think omitted-variable bias is the perennial problem of data scientists,” says Ryan Sloan, senior data scientist at HR AI platform provider Textio in an email interview. “Often, I'm working with observational data, but even when I’m conducting a proper randomized experiment, I’m doing so within the existing data context. It requires careful consideration about what features are missing. I think a common trap is to look at all the data available, get excited, and jump into exploring a relationship between features, assigning effects to the wrong variables.”
To address the issue, he begins upstream by asking: What real relationships might be captured here? Which of those are confounders? What relationships could be critical but missing?
“This is an activity I like to do with partners and subject-matter experts whenever possible. The process of drawing out a causal diagram can clearly ground the problem in the kinds of questions that can be answered,” says Sloan. “It can help identify controls, which may require more data collection, or instrumental variables. It’s a useful step even when my aim isn't causal inference: It improves our model of the system to be measured."
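Omitted-variable bias is easy to see in a short simulation. The sketch below, which assumes NumPy and scikit-learn and uses invented variable names, shows how leaving a confounder out of a regression inflates the apparent effect of the feature that stays in the model.

```python
# A minimal sketch of omitted-variable bias (invented data; assumes NumPy and
# scikit-learn are installed). A confounder drives both the feature and the
# outcome, so dropping it inflates the feature's apparent effect.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
n = 10_000

confounder = rng.normal(size=n)                        # e.g., seniority
feature = 0.8 * confounder + rng.normal(size=n)        # e.g., training hours
outcome = 1.0 * feature + 2.0 * confounder + rng.normal(size=n)  # true effect of feature = 1.0

# Omitting the confounder: its influence is absorbed by the feature (~2.0).
naive = LinearRegression().fit(feature.reshape(-1, 1), outcome)

# Controlling for the confounder recovers the true coefficient (~1.0).
adjusted = LinearRegression().fit(np.column_stack([feature, confounder]), outcome)

print("coefficient without confounder:", round(naive.coef_[0], 2))
print("coefficient with confounder:   ", round(adjusted.coef_[0], 2))
```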
The problem with missing data is that something is over- or underrepresented.
“As a data scientist, I always have to be a little skeptical of the data on hand. Generative AI has brought this to the forefront for me because we’re now reproducing patterns from training data at consumer scale, and sometimes those patterns reveal the worst, whether it’s in who’s excluded or the stereotypes that are reinforced,” says Sloan. “The main thing I try to hammer home is that we have to put on a skeptic’s hat and ask some critical questions before we charge ahead. Where did the data come from? How well does this source line up with my requirements? Is the model tuned and designed to estimate factual results, or generate plausible ones? How will I know if it’s right?”
Gary Rozal, principal analyst and data scientist at mattress company Saatva, says there are at least a dozen types of bias that can creep in during data collection, analysis, and insight generation, resulting in inferior information for decision-making and unfavorable business outcomes. He approaches bias from an end-to-end perspective, and uses the following as a checklist:
Hypothesis formulation: A hypothesis with poor wording or intentional confirmation bias, regardless of the data, may lead to a single conclusion with no possibility of an opposing viewpoint or contradictory result.
“A test is to write the negation of the hypothesis and find out if the negated form is still acceptable as a hypothesis,” says Rozal in an email interview. “For example, an initial hypothesis -- it will rain tomorrow, and its negation -- it will not rain tomorrow. If both seem acceptable as hypotheses, then they may be unbiased.”
Sample selection: Because it is not always possible to collect data from every member of a population, it is practical to collect data only from a representative sample of the target population.
“It is critical that the sample data be collected from members who are randomly selected from the population,” says Rozal.
Data processing: Bias can be introduced as the investigator begins to analyze the data during the data hygiene process. Some data points may be discarded as “bad” data when, in fact, these counterintuitive data may be the most significant source of information.
Choice of methodology and analyses: Some statistical methods favor specific outcomes. For example, a standard regression model is, in theory (and in layman’s terms), an average.
“[W]hile regression models will provide an unbiased summary or prediction, they fail to predict rare or extreme events like catastrophic failures. In these use cases, introducing informed bias and tacit knowledge may produce better information. At this point, not all bias is wrong, as long as it is known, can be known, and acknowledged,” says Rozal. (A short sketch contrasting a mean-based regression with a quantile model follows this checklist.)
Publication bias: After the data analysis and interpretation have concluded, it is time to share the study’s results. There is a tendency, even in scientific studies, to discard information that does not support the authors’ or authorities’ intended message. This is a form of both confirmation bias and selection bias.
“Simply put, poor information leads to actions that produce undesired outcomes,” says Rozal. “Checks during the analysis process by someone not involved in the data collection and analysis [are important]. At Saatva, we instituted a rule in the analytics department: Every analyst has a checker who is a different person.”
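Rozal’s point that a standard regression behaves like an average can be illustrated with a small sketch. Assuming scikit-learn and an invented, heavy-tailed dataset, the example below contrasts an ordinary least-squares fit with a 95th-percentile quantile model; the data and models are illustrative, not Saatva’s methodology.

```python
# A small sketch of why a mean-based regression under-predicts extremes
# (invented data; assumes NumPy and scikit-learn >= 1.0 for QuantileRegressor).
import numpy as np
from sklearn.linear_model import LinearRegression, QuantileRegressor

rng = np.random.default_rng(1)
n = 5_000
load = rng.uniform(0, 10, size=n).reshape(-1, 1)              # hypothetical machine load
# Heavy right-tailed noise stands in for rare, catastrophic delays.
repair_time = 5 + 2 * load.ravel() + rng.exponential(scale=5, size=n)

mean_model = LinearRegression().fit(load, repair_time)
tail_model = QuantileRegressor(quantile=0.95, alpha=0.0).fit(load, repair_time)

x = np.array([[8.0]])
print("mean prediction at load=8:      ", mean_model.predict(x)[0])
print("95th-percentile prediction at 8:", tail_model.predict(x)[0])
# The mean model summarizes the typical case; the quantile model deliberately
# "biases" the estimate toward the rare, extreme outcomes Rozal describes.
```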
Discrimination Is Considered Bad, But Sometimes It’s Necessary
Fairness requires the knowledge and management of bias. For example, while at the University of California, San Francisco (UCSF), Bob Rogers, a Harvard-trained data scientist and CEO of supply chain optimization company Oii.ai, was developing healthcare diagnostic AI that identified emergent conditions in the chest x-rays of intensive care unit (ICU) patients. When the algorithm was tested on data from other healthcare organizations, variations in patients’ care histories and in the way the x-rays were taken significantly degraded its performance. Ultimately, additional training on a more representative data set produced a reliable, highly performant algorithm.
“Bottom line, under-representation of some patient situations led to blind spots in the AI, which would have prevented dependable performance and FDA clearance of the system,” says Rogers in an email interview. “This is just in the healthcare sector. You can easily expand the scenario to other industries. Bias is directly or indirectly trained into AI when there’s lack of diversity in data.”
To minimize such bias, it’s important to use diverse, quality datasets.
“The biggest challenge is that you don’t always know what your biases are, or where you are under-representing a particular group, so slice and dice the data in as many ways as [possible], and watch for nearly empty buckets of examples,” says Rogers. “There are tools to randomly take out different subsets of the data to see what impact that has on the training results which can help identify weaknesses. Finally, you will always need to hold out some data for final algorithm validation. There is always a temptation to hold out a small set so that you have a lot of data to train with, but you should resist that temptation and hold out as much as you can. This last data set is your x-ray vision into the potential weaknesses of your model.”
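Rogers’ suggestion to slice the data and watch for nearly empty buckets can be turned into a routine check. The sketch below is a minimal version assuming pandas and scikit-learn; the column names, such as site and scanner_type, and the count threshold are hypothetical.

```python
# A minimal sketch of "slice and dice" checks for nearly empty buckets
# (hypothetical column names; assumes pandas and scikit-learn are installed).
import pandas as pd
from sklearn.model_selection import train_test_split

def report_thin_slices(df: pd.DataFrame, group_cols, min_count: int = 50) -> pd.DataFrame:
    """Count examples in every combination of the grouping columns and return
    the combinations that fall below min_count -- candidate blind spots."""
    counts = df.groupby(group_cols, dropna=False).size().rename("n").reset_index()
    return counts[counts["n"] < min_count].sort_values("n")

# Example usage with made-up columns from an imaging dataset:
# thin = report_thin_slices(xray_df, ["site", "scanner_type", "age_band"], min_count=100)
# print(thin)  # any row here is an under-represented slice worth investigating

# Rogers also suggests holding out as much data as you can afford for final
# validation; stratifying on a key attribute keeps the holdout representative.
# train_df, holdout_df = train_test_split(
#     xray_df, test_size=0.3, stratify=xray_df["site"], random_state=0
# )
```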
There are some instances in which bias is necessary, such as deciding which customers will receive a mortgage loan or credit card, but financial institutions must be mindful of unlawful discrimination. Simply excluding features such as sex or race isn’t sufficient because sometimes the same information can be inferred from other features, such as a personal name.
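One way to check for such inferred proxies, sketched below under the assumption of a pandas feature table and scikit-learn, is to see how well the supposedly excluded attribute can be predicted from the features that remain; the column names and threshold are hypothetical.

```python
# A sketch of a proxy check: can the protected attribute be predicted from the
# remaining features? (Hypothetical columns; assumes pandas and scikit-learn.)
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

def proxy_score(df: pd.DataFrame, protected_col: str, feature_cols) -> float:
    """Cross-validated accuracy of predicting the protected attribute from the
    other features. Scores well above chance suggest proxy features remain."""
    X = pd.get_dummies(df[feature_cols])
    y = df[protected_col]
    model = RandomForestClassifier(n_estimators=200, random_state=0)
    return cross_val_score(model, X, y, cv=5).mean()

# Example usage with made-up loan-application columns:
# score = proxy_score(applications, "sex", ["zip_code", "occupation", "first_name_token"])
# if score > 0.7:  # chance level depends on class balance; 0.7 is illustrative
#     print("Protected attribute is recoverable from these features -- investigate proxies.")
```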
“If data is being productized for external customer-facing applications, then wider issues need to be taken into account, such as sample bias, which may favor certain characteristics and customers or users,” says Adam Lieberman, chief AI officer at financial services technology provider Finastra. “In the case of a hypothetical loans decision-making algorithm, training data that reflects middle-class white men will underserve and discriminate against those who do not fall into that societal cross-section. Of course, this example brings in other types of biases, such as demographic bias, cultural bias, confirmation bias and algorithmic bias. The latter is informed by the former examples, but the results can be particularly damaging in the context of automated decisions, which will inevitably lead to unfair outcomes.”
To minimize potential data biases, he recommends:
Ensuring data science teams understand the potential impact unfair outcomes might have on the users and the communities their solutions serve,
Ensuring employees have a good awareness of regulations that protect users from bias and discrimination, such as the UK’s Equality Act and the EU’s GDPR,
Establishing data frameworks and best practices supported by technical and procedural guardrails, and
Making sure data is high-quality, diverse and reflective of the communities a solution will serve.
“Something that inevitably helps with this endeavor is ensuring data science teams are representative of the communities they are building solutions for. If the data is productized through an algorithm, outcomes should be explainable (as much as is possible) with the methodologies for data selection, collation, cleaning, and feature generation rigorously assessed and justified,” says Lieberman. “Testing is also a crucial stage of establishing the viability of AI-powered solutions. If outcomes demonstrate bias, then retraining and better or more diverse datasets will be required.”
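Testing outcomes for bias can start with something as simple as comparing decision rates across groups. Below is a minimal sketch assuming pandas and hypothetical column names; production systems would add fuller fairness metrics and statistical tests.

```python
# A minimal sketch of an outcome-bias test: compare approval rates across groups
# (hypothetical column names; assumes pandas). Large gaps flag unfair outcomes.
import pandas as pd

def selection_rates(df: pd.DataFrame, group_col: str, decision_col: str) -> pd.Series:
    """Share of positive decisions per group, e.g., loan approvals."""
    return df.groupby(group_col)[decision_col].mean().sort_values()

# Example usage with made-up columns:
# rates = selection_rates(results, group_col="demographic_group", decision_col="approved")
# print(rates)
# print("max gap between groups:", rates.max() - rates.min())  # simple demographic-parity check
```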
Perhaps surprisingly to laypeople, data scientists consider a lack of data a major challenge, despite the rapidly expanding universe of it: there simply may not be enough diverse or historical data to avoid biased outcomes.
“A solution to this problem is the use of conditional generative adversarial networks that are able to generate synthetic data that merges selective properties of the original data with more diverse and representative attributes,” says Lieberman. “Other mitigation strategies mostly align with best practices around data gathering. If a dataset is found to be lacking, collecting data on underrepresented groups will be required. Data labelling is also a crucial element and is a process that must align with data frameworks and organizational guidelines. For example, if subjective bias infects the labelling process, this will be reflected in the outcomes produced.”
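A conditional GAN of the kind Lieberman describes pairs a generator and a discriminator that both see the attribute being conditioned on, so extra rows can later be sampled for under-represented groups. The sketch below assumes PyTorch and tabular data with a one-hot condition vector; the architecture, sizes, and training step are illustrative rather than Finastra’s implementation.

```python
# A minimal conditional GAN sketch for tabular data (assumes PyTorch; the
# feature count, class count, and network sizes are invented for illustration).
import torch
import torch.nn as nn

N_FEATURES = 4    # tabular columns to synthesize
N_CLASSES = 3     # e.g., a demographic attribute to condition on (one-hot)
LATENT_DIM = 16

class Generator(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(LATENT_DIM + N_CLASSES, 64), nn.ReLU(),
            nn.Linear(64, N_FEATURES),
        )

    def forward(self, z, cond):
        # Noise plus condition in, synthetic feature row out.
        return self.net(torch.cat([z, cond], dim=1))

class Discriminator(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(N_FEATURES + N_CLASSES, 64), nn.ReLU(),
            nn.Linear(64, 1),
        )

    def forward(self, x, cond):
        # Scores whether a (row, condition) pair looks real.
        return self.net(torch.cat([x, cond], dim=1))

def train_step(gen, disc, real_x, real_cond, opt_g, opt_d, loss_fn):
    """One adversarial step; loss_fn is e.g. nn.BCEWithLogitsLoss()."""
    batch = real_x.size(0)
    # Discriminator update: real rows vs. generated rows, both conditioned.
    fake_x = gen(torch.randn(batch, LATENT_DIM), real_cond).detach()
    d_real, d_fake = disc(real_x, real_cond), disc(fake_x, real_cond)
    d_loss = loss_fn(d_real, torch.ones_like(d_real)) + loss_fn(d_fake, torch.zeros_like(d_fake))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()
    # Generator update: try to fool the discriminator for the same conditions.
    g_out = disc(gen(torch.randn(batch, LATENT_DIM), real_cond), real_cond)
    g_loss = loss_fn(g_out, torch.ones_like(g_out))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
    return d_loss.item(), g_loss.item()

# After training, sampling extra rows for an under-represented group is just a
# matter of fixing the condition and drawing noise, e.g. (gen is the trained generator):
# cond = torch.nn.functional.one_hot(torch.full((1000,), 2), N_CLASSES).float()
# synthetic_rows = gen(torch.randn(1000, LATENT_DIM), cond)
```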
His advice is to ensure that organizational data management frameworks and guidelines addressing the various problems around bias are in place.
“By creating a culture that is conscious of the harm bias may cause customers and the organization, the potential for the various types of bias to creep in will be greatly reduced,” says Lieberman. “Establish internal advocacy groups and knowledge-sharing forums so that everyone is aware of the risks, their role in reducing bias and the potential repercussions of not doing so.”