
Social media and online services are now huge parts of daily life, to the point that our entire world is being shaped by algorithms. Arcane in their workings, they are responsible for the content we see and the adverts we’re shown. Just as importantly, they decide what is hidden from view as well.

Important: Much of this post discusses the performance of a live website algorithm. Some of the links in this post may not perform as reported if viewed at a later date. 

The initial Zoom problem that brought Twitter’s issues to light.

Recently, [Colin Madland] posted some screenshots of a Zoom meeting to Twitter, pointing out how Zoom’s background detection algorithm had improperly erased the head of a colleague with darker skin. In doing so, [Colin] noticed a strange effect — although the screenshot he submitted shows both of their faces, Twitter would always crop the image to show just his light-skinned face, no matter the image orientation. The Twitter community raced to explore the problem, and the fallout was swift.

Intentions != Results

An example pair of source images posted to Twitter, featuring two faces in alternate orientations.

Twitter users began to iterate on the problem, testing over and over again with different images. Stock photo models were used, as well as newsreaders, and images of Lenny and Carl from the Simpsons. In the majority of cases, Twitter’s algorithm cropped images to focus on the lighter-skinned face in a photo. In perhaps the most ridiculous example, the algorithm cropped to an image of a black comedian pretending to be white over a normal image of the same comedian.

The result – Twitter’s algorithm crops on the white face, regardless of orientation.

Many experiments were undertaken, controlling for factors such as differing backgrounds, clothing, or image sharpness. Regardless, the effect persisted, leading Twitter to speak officially on the issue. A spokesperson for the company stated “Our team did test for bias before shipping the model and did not find evidence of racial or gender bias in our testing. But it’s clear from these examples that we’ve got more analysis to do. We’ll continue to share what we learn, what actions we take, and will open source our analysis so others can review and replicate.”
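
To make those community experiments concrete: the basic test was to stack two portraits on a single tall canvas, separated by enough empty space that the preview crop can only show one of them, and then post the same pair in both orderings. The sketch below shows one way such test images could be composed; the file names are placeholders, not images from the original threads.

```python
# A minimal sketch of the kind of test image the community built by hand:
# two portraits stacked far apart on one tall canvas, generated in both
# orderings, so the preview crop can only show one face at a time.
from PIL import Image

def make_test_pair(face_a_path, face_b_path, gap=2000, out_prefix="crop_test"):
    """Save two tall composites: A above B, and B above A."""
    a, b = Image.open(face_a_path), Image.open(face_b_path)
    width = max(a.width, b.width)

    def stack(top, bottom):
        canvas = Image.new("RGB", (width, top.height + gap + bottom.height), "white")
        canvas.paste(top, ((width - top.width) // 2, 0))
        canvas.paste(bottom, ((width - bottom.width) // 2, top.height + gap))
        return canvas

    stack(a, b).save(f"{out_prefix}_a_on_top.jpg")
    stack(b, a).save(f"{out_prefix}_b_on_top.jpg")

# Placeholder file names; any two portrait photos would do.
make_test_pair("face_one.jpg", "face_two.jpg")
```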

There’s little evidence to suggest that such a bias was purposely coded into the cropping algorithm; certainly, Twitter doesn’t publicly mention any such intent in their blog post on the technology back in 2018. Regardless, the problem does exist, with negative consequences for those impacted. While a simple image crop may not sound like much, it has the effect of reducing the visibility of affected people and excluding them from online spaces. The problem has been highlighted before, too. In this set of images of a group of AI commentators from January 2019, the Twitter image crop focused on men’s faces, and women’s chests. The double standard is particularly damaging in professional contexts, where women and people of color may find themselves seemingly objectified, or cut out entirely, thanks to the machinations of a mysterious algorithm.

The problem remained consistent in many community tests, involving newsreaders, cartoons, and even golden and black labradors.

Former employees, like [Ferenc Huszár], have also spoken on the issue — particularly about the testing process the product went through prior to launch. His account suggests that testing was done to explore this issue with regard to bias on race and gender. Similarly, [Zehan Wang], currently an engineering lead for Twitter, has stated that these issues were investigated as far back as 2017 without any major bias found.

It’s a difficult problem to parse, as the algorithm is, for all intents and purposes, a black box. Twitter users are obviously unable to observe the source code that governs the algorithm’s behaviour, and thus testing on the live site is the only viable way for anyone outside of the company to research the issue. Much of this has been done ad-hoc, with selection bias likely playing a role. Those looking for a problem will be sure to find one, and more likely to ignore evidence that counters this assumption.

Efforts are being made to investigate the issue more scientifically, using many studio-shot sample images to attempt to find a bias. However, even these efforts have come under criticism – namely, that using a source image set designed for machine learning and shot in perfect studio lighting against a white background is not realistically representative of the real images that users post to Twitter.

Some users attempted to put the cause down to issues of contrast, saturation, or similar factors. The evidence that any of these is the true cause remains inconclusive. Regardless, if your algorithm will only recognise people of color if they’re digitally retouched, you have a problem.
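
For what it’s worth, those retouching tests amount to nudging contrast, brightness, or saturation on one face and re-running the crop test with the adjusted copy. A minimal sketch, with arbitrary example adjustment values:

```python
# Sketch of the retouch-and-retest idea: apply simple global adjustments
# to one portrait, then repeat the crop test with the adjusted copy.
from PIL import Image, ImageEnhance

def retouch(path, out_path, contrast=1.3, brightness=1.1, saturation=1.2):
    img = Image.open(path)
    img = ImageEnhance.Contrast(img).enhance(contrast)
    img = ImageEnhance.Brightness(img).enhance(brightness)
    img = ImageEnhance.Color(img).enhance(saturation)  # "Color" is saturation in PIL
    img.save(out_path)

# Placeholder file names for illustration only.
retouch("face_two.jpg", "face_two_retouched.jpg")
```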

Twitter’s algorithm isn’t the first technology to be accused of racial bias; from soap dispensers to healthcare, these problems have been seen before. Fundamentally though, if Twitter is to solve the problem to anyone’s satisfaction, more work is needed. A much wider battery of tests, featuring a broad sampling of real-world images, needs to be undertaken, and the methodology and results shared with the public. Anything less than this, and it’s unlikely that Twitter will be able to convince the wider userbase that its software isn’t failing minorities. Given that there are gains to be made in understanding machine learning systems, we expect research will continue at a rapid pace to solve the issue.

Twitter’s image cropping algorithm prefers younger, slimmer faces with lighter skin, an investigation into algorithmic bias at the company has found.

The finding, while embarrassing for the company, which had previously apologised to users after reports of bias, marks the successful conclusion of Twitter’s first ever “algorithmic bug bounty”.

The company has paid $3,500 to Bogdan Kulynych, a graduate student at Switzerland’s EPFL university, who demonstrated the bias in the algorithm, which is used to focus image previews on the most interesting parts of pictures, as part of a competition at the DEF CON security conference in Las Vegas.

Kulynych proved the bias by first artificially generating faces with varying features, and then running them through Twitter’s cropping algorithm to see which the software focused on.

Since the faces were themselves artificial, it was possible to generate faces that were almost identical, but at different points on spectrums of skin tone, width, gender presentation or age – and so demonstrate that the algorithm focused on younger, slimmer and lighter faces over those that were older, wider or darker.
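
In outline, the comparison boils down to scoring near-identical variants of a face with the cropping model and seeing which one it prefers. The sketch below illustrates that logic only and is not Kulynych’s code; `saliency_score` is a hypothetical stand-in for whatever interface the model actually exposes.

```python
# Illustration of the comparison logic only. Near-identical variants of one
# face (differing along a single attribute such as skin tone or apparent age)
# are scored with the cropping model; the highest-scoring variant is the one
# the crop would favour. `saliency_score` is a hypothetical stand-in.

def rank_variants(variants, saliency_score):
    """variants: dict mapping a label (e.g. 'lighter skin') to an image path.
    Returns labels ordered from most to least favoured by the model."""
    scored = sorted(variants.items(), key=lambda kv: saliency_score(kv[1]), reverse=True)
    return [label for label, _path in scored]

# Example usage with a dummy scorer; a real run would load the actual model.
if __name__ == "__main__":
    dummy_score = lambda path: len(path)  # placeholder, not a real saliency model
    ranking = rank_variants(
        {"lighter skin": "variant_light.png", "darker skin": "variant_dark.png"},
        dummy_score,
    )
    print("Model preference order:", ranking)
```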

“When we think about biases in our models, it’s not just about the academic or the experimental … but how that also works with the way we think in society,” Rumman Chowdhury, the head of Twitter’s AI ethics team, told the conference.

“I use the phrase ‘life imitating art imitating life’. We create these filters because we think that’s what ‘beautiful’ is, and that ends up training our models and driving these unrealistic notions of what it means to be attractive.”

Twitter had come under fire in 2020 for its image cropping algorithm, after users noticed that it seemed to regularly focus on white faces over those of black people – and even on white dogs over black ones. The company initially apologised, saying: “Our team did test for bias before shipping the model and did not find evidence of racial or gender bias in our testing. But it’s clear from these examples that we’ve got more analysis to do. We’ll continue to share what we learn, what actions we take, and will open source our analysis so others can review and replicate.” In a later study, however, Twitter’s own researchers found only a very mild bias in favour of white faces, and of women’s faces.

The dispute prompted the company to launch the algorithmic harms bug bounty, which saw it promise thousands of dollars in prizes for researchers who could demonstrate harmful outcomes of the company’s image cropping algorithm.

Kulynych, the winner of the prize, said he had mixed feelings about the competition. “Algorithmic harms are not only ‘bugs’. Crucially, a lot of harmful tech is harmful not because of accidents, unintended mistakes, but rather by design. This comes from maximisation of engagement and, in general, profit externalising the costs to others. As an example, amplifying gentrification, driving down wages, spreading clickbait and misinformation are not necessarily due to ‘biased’ algorithms.”