‘Garbage in, Garbage out’ – Why gender-biased algorithms lead to amplified stereotypes

Women are hysterical. Real men don’t cry. Women belong in the kitchen and men in the office. Girls are bad at math and Artificial Intelligence is for boys. To most, these may sound familiar. Gender stereotypes are all around us and often deeply embedded in society. They shape us during our most sensitive years and often resurface later as (sub)conscious biases. To find out why it is important to rethink stereotypical gender roles, continue reading. In a world where data is queen – or was it king? – algorithms amplify harmful stereotypes.

Explaining the gender data gap

In her bestselling book Invisible Women, Caroline Criado Perez explains why we need to close the gender data gap.

To this day, collected data often contains biases towards a specific gender. In the biomedical sciences, this so-called gender data gap has received growing attention. The gap has many causes, such as allocating less funding to studies of female-specific conditions than to male-specific ones, or excluding women from study samples. Many biomedical research fields, neuroscience in particular, test mainly on male animals. Moreover, although the National Institutes of Health (NIH) now requires researchers to include women in clinical trials (thanks to Bernadine Healy, the first female director of the NIH), participating women are often obligated to use contraceptives. This was justified as minimising risks to fetuses, but it also turned out to be related to keeping research costs and time down. The use of contraceptives prevents research from capturing information about people with a natural menstrual cycle. Additionally, even when women are included in research, the data is sometimes not sex-disaggregated during analysis, which can make the results valid for only one sex. Because of this lack of gender-inclusive studies and sex-disaggregated data, combined with stigma, many people are misdiagnosed. Misdiagnosis is a common problem in psychiatry, for example: women with autism and/or ADHD are underdiagnosed more often than men with these disorders. The issue also concerns men. Depression and anxiety often present differently in men than in women, which causes men suffering from these disorders to be underdiagnosed more often than women.

These biases in testing and diagnosing feed into biases in participant recruitment and in the analysis of results, which in turn lead to biases in the collected data. One of the aims of medical AI is to use AI to diagnose patients, but an AI trained on biased data cannot do so reliably.

‘Garbage in, Garbage out’

George Fuechsel

One study examined four algorithms that predict liver disease with roughly 70% accuracy. When the researchers broke the accuracy down by sex, they found that the diagnosis was missed about twice as often in women as in men. Dr Isabel Straw warns that we need to stay alert to AI deepening existing inequalities.

‘When we hear of an algorithm that is more than 90% accurate at identifying disease, we need to ask: accurate for who?’

– Dr Isabel Straw
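To make the question ‘accurate for who?’ concrete, here is a minimal sketch of how one can break a model’s accuracy down by sex after training. The data, feature, and model below are hypothetical placeholders for illustration, not the setup of the liver-disease study.

```python
# Minimal sketch: evaluate a classifier's accuracy per sex.
# All data, features, and thresholds here are made up for illustration.
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 1000
df = pd.DataFrame({
    "sex": rng.choice(["F", "M"], size=n),
    "biomarker": rng.normal(size=n),
})
# Hypothetical labels whose relation to the biomarker differs by sex.
df["disease"] = (((df["biomarker"] > 0.5) & (df["sex"] == "M")) |
                 ((df["biomarker"] > 1.0) & (df["sex"] == "F"))).astype(int)

train, test = train_test_split(df, test_size=0.3, random_state=0)
model = LogisticRegression().fit(train[["biomarker"]], train["disease"])

# A single overall score can hide differences between groups.
print("overall:", accuracy_score(test["disease"], model.predict(test[["biomarker"]])))
for sex, group in test.groupby("sex"):
    acc = accuracy_score(group["disease"], model.predict(group[["biomarker"]]))
    print(sex, round(acc, 2))
```

Reporting accuracy per group in this way costs almost nothing, yet it is exactly the check that turns ‘90% accurate’ into ‘accurate for whom’.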

This problem is not only seen in the biomedical sciences, but in all kinds of data collection. For example, voice recognition software is 70% less accurate at understanding female voices than male voices. This is caused by the datasets it is trained on, since 70% of the voices used are male.

How Gender Bias Can Creep Its Way Into Algorithm Design

Algorithm design can also introduce gender bias. Sometimes, information is commonly known but still overlooked. A simple example of software in which women were forgotten is Apple’s HealthKit. The app was created so users can track their own health, including many niche metrics. However, it initially did not include a way to track menstrual cycles. Only after a shitstorm on social media was it added. In many companies, the majority of development and senior positions are held by men. While it is not impossible for men to consider issues that primarily affect women, they are less likely to do so unprompted. It is therefore important to have a diverse team that looks at problems from different angles.

How Algorithms Amplify Our Biases

These are only a few of the many examples showing that algorithms are only as good as the people who design them or, as Caroline Criado Perez puts it, ‘only as good as the data we feed them’. But we have bad news for you: it gets worse. Algorithms not only reflect our biases but also reinforce them.

‘Algorithms amplify our biases back to us.’

Caroline Criado Perez

We associate the word ‘female’ with ‘assistant’ or ‘housework’. Think of Siri and Alexa. It is by no means a coincidence that digital voice assistants often have female names and female voices. After all, their role is one that has traditionally been assigned to women: they are supposed to help us set an alarm, make a shopping list, or schedule an appointment. But as long as we keep designing these assistants as ‘female’, we will continue to associate assistant jobs with women and to see women as subservient, as a 2019 UNESCO study shows. People even think their Roomba is female, since it performs housework.

Similarly, machine learning algorithms have learnt to make the same – but stronger – word associations. The technique behind this is called word embeddings, and it is increasingly used in applications such as search engines, chatbots, and resume screening. The input is a huge amount of text in the form of articles, encyclopedias, or books, and the task is to identify relationships between words by counting how often they occur together and how close to each other they occur. To test how well such a system does its job, one can ask it to complete analogies, such as ‘He is to King as She is to X’. If it returns ‘Queen’, as any human would, it is deemed successful. Research shows that using these word associations makes search results statistically more relevant, but also extremely biased. One study found that the term ‘computer programmer’ is not only related to the term ‘javascript’, but also has a much stronger association with male names than with female names. After training their system on Google News articles, the researchers asked it to complete the analogy: ‘Man is to Computer Programmer as Woman is to X’. Its answer: ‘Homemaker’. One might say these results are not entirely surprising, since they merely reflect our own worldview and the fact that there are still significantly more men than women in programming and more housewives than househusbands. But they are very problematic, because the algorithm not only reproduces but also amplifies this bias: the few existing female programmers end up even lower in the search results.

‘Man is to Computer Programmer as Woman is to Homemaker.’
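For readers who want to see how such analogy completion works in practice, here is a minimal sketch using pretrained GloVe vectors loaded through gensim. The study described above used word2vec vectors trained on Google News, so the exact answers below will differ; treat the output as illustrative.

```python
# Minimal sketch: analogy completion with pretrained word embeddings.
# Requires gensim; 'glove-wiki-gigaword-50' is downloaded on first use.
import gensim.downloader as api

vectors = api.load("glove-wiki-gigaword-50")

# 'He is to King as She is to X' becomes the vector arithmetic king - man + woman.
print(vectors.most_similar(positive=["king", "woman"], negative=["man"], topn=1))
# Typically returns 'queen'.

# The same arithmetic applied to occupations surfaces the gender associations
# the embedding has absorbed from its training text.
print(vectors.most_similar(positive=["programmer", "woman"], negative=["man"], topn=3))
```

Whatever comes out on top is determined entirely by the text the vectors were trained on: garbage in, garbage out.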

The same problem is particularly persistent in translation software. Google Translate happily turns doctors into men and nurses into women. Try it for yourself: the English terms ‘the doctor’ and ‘the nurse’ become the German ‘der Arzt/Doktor’ (i.e. the male doctor) and ‘die Krankenschwester’ (i.e. the female nurse). But there is some improvement. In Spanish, the same terms are now offered in both forms, ‘el doctor’/‘la doctora’ and ‘el enfermero’/‘la enfermera’. These examples show that it is possible to remove stereotypical biases from algorithms.

How Stereotypes Influence Our Behaviour

A 2021 study on user perception of gender bias in information retrieval systems shows how heavily we are influenced by stereotypes. When users searched for ‘how to easily clean at home’, they perceived a female-biased result, such as ‘Cleaning How-Tos and Home Making Tips | Housewife How-Tos’, as highly relevant (i.e. consistent with their worldview). However, when the result was male-biased (‘Housewife’ replaced by ‘Houseman’), users thought it was highly inaccurate. While it is true that more than 75% of all unpaid care and domestic work is done by women, we should stop and ask ourselves why this is the case and whether this is how we would like the world to be.

When we continue to portray computer programmers as men and homemakers as women, we cannot expect change. This documentary shows that, already at age 7, kids are so influenced by stereotypical gender roles that they are completely unaware of their options. One extremely shocking example is that girls do not even know they can become pilots. How can that be? It is simple: when we keep being told that something does not suit us, we start believing it. This phenomenon is called ‘stereotype threat’, a term coined by Claude Steele and Joshua Aronson in 1995. Together with other colleagues, they studied the stereotypes ‘Girls are bad at math’ and ‘Asian people are good at math’. In both cases, the threat turned into a self-fulfilling prophecy: girls scored considerably worse on a math test than boys when told that boys score much better than them, but equally well otherwise. And white men scored significantly lower when they knew they were being compared to Asian men.

Erin Heaning points out that, on an individual level, stereotype threat may lead to self-doubt, reduced self-confidence, and even self-sabotage. Exposed individuals may experience increased anxiety, belonging uncertainty, or imposter syndrome. This paper explains that feeling stereotype-threatened causes girls to lose interest in STEM fields. On a societal level, stereotype threat can cause stigmatisation, discrimination, gender gaps in achievement, and underrepresentation. It takes little imagination to see that these factors, biased designs, and a persistent gender data gap are mutually reinforcing. It is therefore about time that we fix the systematic underrepresentation of women in image search results for occupations, correct ranking biases in search engines, and work on our relevance bias.

Here is how.

How We Can Contribute To Making Things Better

First, we need to become aware of our own biases and motivate others to do the same. You can check yours with the Implicit Association Test. Second, become aware of the risks these biases pose in a world where data is worth more than oil. By reading this article, you are on the right path. Third, we need to provide so-called counter-stereotypical role models and mentors: make female doctors and data scientists, but also male nurses and homemakers, more visible. Specifically encourage more women to pursue careers in STEM to increase diversity in teams, since a predominantly male team is likely to (unintentionally) forget about women’s needs or women’s representation. Click here for more details on how you can remove barriers that keep women from considering a career in data science.

When data scientists do not bring diverse perspectives to their work, the science and technology they produce suffers.

– Carole Stabile and Maya Rios

Moreover, we need to close the gender data gap by collecting more data on women and by sex-disaggregating data, meaning we collect and analyse it separately for each sex. Learn more about it by reading this book. That way our algorithms can become more gender-inclusive. In the meantime, we can follow the example of the researchers who managed to debias their machine learning algorithm, or make sure that we extensively test algorithms in real-life settings to prevent embarrassing disasters such as Apple’s HealthKit.
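As a small illustration of what sex-disaggregation means in practice, here is a sketch with made-up numbers: the pooled average answers ‘how well does the treatment work?’ with a single figure, while the disaggregated view shows how it works for each sex.

```python
# Minimal sketch: sex-disaggregated analysis of (hypothetical) trial data.
import pandas as pd

trial = pd.DataFrame({
    "sex":       ["F", "M", "F", "M", "F", "M", "F", "M"],
    "responded": [0,   1,   0,   1,   1,   1,   0,   1],  # made-up outcomes
})

# Pooled analysis: one number for everyone.
print("overall response rate:", trial["responded"].mean())

# Sex-disaggregated analysis: the same data, analysed separately per sex.
print(trial.groupby("sex")["responded"].mean())
```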

And remember:

‘Data is no longer simply a reflection of society, but fundamental to its formation.’

Caroline Criado Perez
