Google image searches for certain professions, such as “CEO”, have been found to show almost exclusively men. Google claimed to have fixed this, but the results can still shape searchers’ worldviews
Google Images results are meant to be representative of the people they depict.
For example, a search for a profession such as “truck driver” should return images that reflect the real population of people who drive trucks for a living.
However, researchers at the University of Washington found that image results for a variety of occupations, including “CEO”, significantly underrepresented women, and that these skewed results can change searchers’ worldviews.
By examining how the image search gender bias ratio changed with the number of images analysed, the researchers showed why this is an important problem for search engines to fix: it can quietly add bias to a user’s perceptions of the world, and of women in particular.
Image searching for 10 common occupations yielded similar results
While Google claims to have solved this issue, other research has demonstrated that across four major search engines around the world, including Google, this bias is only partially fixed.
The team investigated image search results for Google as well as for China’s search engine Baidu, South Korea’s Naver and Russia’s Yandex. They conducted an image search for 10 common occupations, such as CEO, biologist, computer programmer and nurse, both with and without an additional search term, such as “United States.”
A search for an occupation such as “CEO” yielded results whose ratio of cis-male to cis-female presenting people roughly matched current employment statistics. However, adding another search term, such as “CEO + United States”, returned even fewer photos of cis-female presenting people.
Senior author Chirag Shah, a UW associate professor in the Information School, said: “My lab has been working on the issue of bias in search results for a while, and we wondered if this CEO image search bias had only been fixed on the surface.
“We wanted to be able to show that this is a problem that can be systematically fixed for all search terms, instead of something that has to be fixed with this kind of ‘whack-a-mole’ approach, one problem at a time.”
Lead author Yunhe Feng said: “This is a common approach to studying machine learning systems. Similar to how people do crash tests on cars to make sure they are safe, privacy and security researchers try to challenge computer systems to see how well they hold up.
“Here, we just changed the search term slightly. We didn’t expect to see such different outputs.”
Systematically adding to gender bias
Collecting the top 200 images, the team used a combination of volunteers and gender detection AI software to identify each face as cis-male or cis-female presenting. Although this approach is limited by its reliance on a gender binary, it allowed the team to compare their findings with data from the U.S. Bureau of Labor Statistics for each occupation.
When the researchers added “+ United States” to the Google image searches, they found that some occupations had larger image search gender bias ratios than others.
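The comparison behind the “gender bias ratio” can be sketched as a small calculation. All numbers and labels below are illustrative placeholders, not the study’s actual data:

```python
# Hypothetical sketch: comparing the gender mix of image-search results
# against labour-force statistics. Data here is made up for illustration.

def bias_ratio(search_female_share, real_female_share):
    """Ratio of the female share seen in search results to the real-world
    share from labour statistics (1.0 = no bias, < 1.0 = women
    underrepresented in the results)."""
    return search_female_share / real_female_share

# Placeholder labels for top results: "F" = cis-female presenting,
# "M" = cis-male presenting (as labelled by volunteers / detection software).
results = ["M", "M", "F", "M", "M", "M", "F", "M", "M", "M"]
search_share = results.count("F") / len(results)  # 2 of 10 results

# Illustrative real-world share for the occupation (placeholder value).
print(round(bias_ratio(search_share, 0.28), 2))
```

A ratio well below 1.0 signals that the search results underrepresent women relative to the occupation’s real workforce.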
While looking at more images sometimes reduced these biases, comparing the gender bias and the disparities across the search engines revealed a bigger picture: the overall trend held, and adding another search term changed the gender ratio.
Feng added: “We know that people spend most of their time on the first page of the search results because they want to find an answer very quickly. But maybe if people did scroll past the first page of search results, they would start to see more diversity in the images.”
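Feng’s point about scrolling deeper can be illustrated by tracking how the gender mix changes with result depth. The labels below are invented for illustration, not taken from the study:

```python
# Illustrative sketch: share of cis-female presenting faces among the
# top-k image results, at increasing depths. Labels are placeholders.
labels = ["M"] * 8 + ["F", "M"] * 6 + ["F"] * 5  # deeper results more mixed

def female_share_at_depth(labels, k):
    """Fraction of 'F' labels among the top-k results."""
    top_k = labels[:k]
    return top_k.count("F") / k

for k in (5, 10, 25):
    print(k, female_share_at_depth(labels, k))
```

In this toy list the first page is heavily male-presenting, while the share of female-presenting faces rises as more results are considered, mirroring the behaviour Feng describes.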
The team designed three algorithms to systematically address the issue
Algorithms can systematically reduce bias across a variety of occupations, but the real test will be whether gender bias actually falls in live searches on Google, Baidu, Naver and Yandex.
The team tested their algorithms on the image datasets collected from the Google, Baidu, Naver and Yandex searches. The first algorithm simply shuffles the results at random.
The other two algorithms add more strategy to the image-shuffling. One weights each image by its “relevance score”, which search engines assign based on how relevant a result is to the search query.
The third requires the search engine to know the statistics bureau data; the algorithm then shuffles the search results so that the top-ranked images follow the real-world gender ratio.
For occupations with a large bias ratio, such as “biologist + United States” or “CEO + United States”, all three algorithms were successful in reducing gender bias in the search results.
For occupations with a smaller bias ratio, such as “truck driver + United States”, only the algorithm with knowledge of the actual statistics was able to reduce the bias, making those searches harder to correct.
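The three shuffling strategies described above can be sketched roughly as follows. This is a hedged toy reconstruction, not the paper’s code; the field names, scores and target ratio are all illustrative:

```python
import random

# Toy ranked results: each has a presented gender label and a relevance
# score of the kind a search engine might assign (placeholder values).
results = [
    {"gender": "M", "relevance": 0.9},
    {"gender": "M", "relevance": 0.8},
    {"gender": "F", "relevance": 0.7},
    {"gender": "M", "relevance": 0.6},
    {"gender": "F", "relevance": 0.5},
]

def shuffle_random(results, seed=0):
    """Strategy 1: reshuffle the ranked results at random."""
    out = results[:]
    random.Random(seed).shuffle(out)
    return out

def shuffle_by_relevance(results, seed=0):
    """Strategy 2: randomised reshuffle weighted by relevance score, so
    highly relevant images still tend to stay near the top."""
    rng = random.Random(seed)
    return sorted(results, key=lambda r: -r["relevance"] * rng.random())

def reorder_to_target(results, target_female_share):
    """Strategy 3: interleave results so the running share of 'F' images
    tracks a target ratio, e.g. taken from labour-bureau statistics."""
    females = [r for r in results if r["gender"] == "F"]
    males = [r for r in results if r["gender"] == "M"]
    out = []
    while females or males:
        share = sum(r["gender"] == "F" for r in out) / max(len(out), 1)
        take_female = bool(females) and (not males or share < target_female_share)
        out.append(females.pop(0) if take_female else males.pop(0))
    return out
```

With a target share of 0.4, the third strategy spaces the two female-presenting results through the toy ranking rather than leaving them clustered lower down, which matches the behaviour only this statistics-aware strategy achieved for low-bias occupations.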
Shah added: “This is not just a Google problem. I don’t want to make it sound like we are playing some kind of favouritism toward other search engines.
“Baidu, Naver and Yandex are all from different countries with different cultures. This problem seems to be rampant. This is a problem for all of them. This one tries to shake things up to keep it from being so homogeneous at the top.”
Feng added: “We can explain why and how our algorithms work. But the AI model behind the search engines is a black box. It may not be the goal of these search engines to present information fairly. They may be more interested in getting their users to engage with the search results.”