All You Need to Know About ‘AI Bias’ in the AI Lifecycle

AI bias refers to systematic irregularities in the output of machine learning algorithms. These irregularities can stem from prejudiced assumptions made during the development of an algorithm, or from prejudices embedded in the training data. The problem of bias in artificial intelligence has historical precedent. Back in 1988, the United Kingdom’s Commission for Racial Equality investigated a British medical school and found it guilty of discrimination: the computer program the institution used to select candidates for interview was determined to be biased against women and against applicants with non-European names.

Discrimination in AI and machine learning programs arises predominantly from data. This “algorithmic bias” appears when AI and computing systems do not act with complete objectivity but instead reproduce the prejudices and stereotypes of the humans who formulated, cleaned, and structured their data. There is also technical bias, which arises from technical limitations, whether known or not, including those of the tools and algorithms an AI system frequently relies on. A third form, emergent bias, occurs only in the context of using the system: it appears when new information is introduced, or when there is a mismatch between users and the system’s design.


Biases arise in AI systems for two main reasons:

Cognitive biases: Cognitive biases are unconscious errors in thinking that shape how individuals perceive other people and groups. Psychologists have identified and classified more than 180 distinct human biases, each of which can affect an individual’s judgment. They can be introduced into machine learning algorithms either by designers who unknowingly build them into the model or through a training data set that itself contains those biases.

Lack of complete data: Biases can also occur when data is incomplete. Incomplete data may not be representative of the whole population, and therefore can carry bias. For example, many psychology research studies draw their results from undergraduate students, a specific group that does not represent the population at large.
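One simple illustration of this point is to compare a dataset’s group proportions against known population shares. The sketch below is a minimal, hypothetical check with made-up numbers, not a production fairness audit:

```python
from collections import Counter

def representation_gap(sample_groups, population_shares):
    """Return observed-minus-expected share for each group in the sample."""
    counts = Counter(sample_groups)
    total = sum(counts.values())
    return {group: counts.get(group, 0) / total - expected
            for group, expected in population_shares.items()}

# Hypothetical sample skewed toward undergraduates, as in the example above.
sample = ["undergrad"] * 80 + ["other_adult"] * 20
gaps = representation_gap(sample, {"undergrad": 0.10, "other_adult": 0.90})
print(gaps)  # undergraduates heavily over-represented, other adults under-represented
```

A positive gap flags over-representation, a negative gap under-representation; either signals that models trained on the sample may not generalize to the population.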


● There are numerous examples of human bias appearing on tech platforms. Since data from these platforms is later used to train machine learning models, such biases, in the long run, produce biased machine learning models.

● Until 2019, Facebook allowed its advertisers to deliberately target adverts by gender, race, and religion. For example, job advertisements for nursing and secretarial roles were shown primarily to women, while job ads for janitors and taxi drivers were shown mostly to men, particularly men from minority communities. Facebook later stopped letting employers specify age, gender, or race in its ads, which had enabled targeting of this sort.

● AI bias was found in COMPAS, a risk assessment tool used by courts to predict which defendants were most likely to re-offend. When the news organization ProPublica compared COMPAS risk assessments for some 10,000 people arrested in one Florida county with data showing which of them actually went on to re-offend, it found that when the algorithm’s predictions were correct, its decision-making looked fair. But when the algorithm was wrong, people of color were almost twice as likely to be labeled higher risk and yet not re-offend.
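The core of ProPublica’s finding can be reproduced in miniature: compute, for each group, the rate at which people who did not re-offend were nonetheless labeled high risk. The sketch below uses entirely synthetic records, not the COMPAS data:

```python
def false_positive_rate(records, group):
    """Among people in `group` who did NOT re-offend, the share labeled high risk."""
    non_reoffenders = [r for r in records
                       if r["group"] == group and not r["reoffended"]]
    if not non_reoffenders:
        return 0.0
    flagged = sum(1 for r in non_reoffenders if r["label"] == "high")
    return flagged / len(non_reoffenders)

# Synthetic records for illustration only: group A's false positive rate
# comes out twice as high as group B's, mirroring the disparity described above.
records = (
    [{"group": "A", "label": "high", "reoffended": False}] * 4
    + [{"group": "A", "label": "low", "reoffended": False}] * 6
    + [{"group": "B", "label": "high", "reoffended": False}] * 2
    + [{"group": "B", "label": "low", "reoffended": False}] * 8
)
print(false_positive_rate(records, "A"), false_positive_rate(records, "B"))
```

Comparing this metric across groups is one standard way to surface the kind of disparity ProPublica reported, even when overall accuracy looks similar.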

● The May 2016 accident involving a Tesla Model S and a tractor trailer in Williston, Florida is one instance of technical bias. The Tesla driver, who had Autopilot engaged when the tractor trailer drove across a divided highway perpendicular to the car, died of his injuries. Tesla later stated in an article: “Neither Autopilot nor the driver noticed the white side of the tractor trailer against a brightly lit sky, so the brake was not applied.”

● In 2016, Microsoft launched an AI-based conversational chatbot on Twitter, designed to interact with people through tweets and direct messages. Shortly after its release, however, it began replying with highly offensive and racist messages.

● The chatbot was trained on anonymized public data and had a built-in learning feature, which opened it to a targeted attack: a group of users set out to introduce racist bias into the system by inundating the bot with misogynistic, racist, and antisemitic language.

● On June 30, 2020, the Association for Computing Machinery in New York City called for a suspension of private and government use of facial recognition technologies, citing what it described as a “clear bias based on ethnic, racial, gender and other human characteristics.”

● The ACM said that this bias disproportionately harmed the lives, livelihoods, and fundamental rights of individuals in specific demographic groups. Given how pervasive AI has become, it is crucial to address algorithmic bias in order to make these systems fairer and more inclusive.


● We must maintain accountability when assessing the many ways in which AI can improve on conventional human decision-making. Machine learning systems disregard or discount variables that do not accurately predict outcomes in the data available to them. This is in stark contrast to humans, who may lie about, or not even realize, the factors that led them to act with bias when, for instance, hiring or rejecting a particular job candidate.

● Numerous tools are available these days to assist developers with their data pipelines, but not all are built to the same standard. Some anonymous crowdsourcing models carry inherent risks precisely because they are opaque: with no relationship established with the workers who process the data, there is no way to address the subtle problems that emerge. As a result, bias in important datasets becomes inevitable, rendering such tools inadequate for any business looking to offload enterprise-grade work.

● To mitigate unintended bias, developers have to design systems very attentively, regardless of how their organizations decide to manage data. Accountability is a relationship-driven business, so developers must identify ways to strategically deploy people in the data annotation process.

● Communication and the ability to evolve processes are important for developers, who must ensure that their AI systems consume training data that accurately reflects conditions on the ground. Developers should also be able to raise roadblocks and make the necessary amendments so that potential bias in the data can be eliminated.
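One hypothetical way to implement such a roadblock is a validation gate that halts a training pipeline when any group’s share of the data falls below a minimum threshold. The threshold and group labels here are illustrative assumptions, not a recommendation from the article:

```python
from collections import Counter

def representation_gate(group_labels, min_share=0.10):
    """Raise before training if any group falls below `min_share` of the data."""
    counts = Counter(group_labels)
    total = sum(counts.values())
    for group, n in counts.items():
        share = n / total
        if share < min_share:
            raise ValueError(f"group {group!r} is only {share:.0%} of the data")

# A balanced dataset passes the gate silently; a badly skewed one halts the pipeline.
representation_gate(["a"] * 50 + ["b"] * 50)
```

Raising an exception, rather than merely logging a warning, forces the amendment step to happen before a skewed dataset ever reaches training.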

● Investing more in diversifying the AI field itself is another way to check bias in AI systems. A more diverse AI community would be better equipped to anticipate, review, and spot bias, and to engage the communities it affects. This will require investment in education and opportunities.

● We also need to consider how humans and machines can work together to mitigate bias. “Human-in-the-loop” systems can make recommendations or provide options that humans double-check or choose from. Transparency about such an algorithm’s confidence in its recommendations can help humans understand the gravity of the problems these biases pose for AI systems.
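A human-in-the-loop gate of the kind described above can be sketched as a confidence threshold: confident predictions are applied automatically, while the rest are queued for human review. The threshold value below is an arbitrary placeholder, not a recommended setting:

```python
def route_prediction(label, confidence, threshold=0.85):
    """Apply confident predictions automatically; queue the rest for human review."""
    if confidence >= threshold:
        return ("auto", label)
    return ("needs_human_review", label)

print(route_prediction("approve", 0.95))  # confident enough to act on directly
print(route_prediction("approve", 0.60))  # routed to a human reviewer instead
```

Exposing the confidence value alongside the label is what gives the human reviewer the transparency the bullet above calls for.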

Overall, diversity and proper representation of marginalised communities can address the problem of bias across the AI lifecycle. Many such issues were covered in Discriminating Systems, a major 2019 report from the AI Now Institute, which concluded that diversity and AI bias should not be considered separately because “they are two sides of the same problem”.




