What would it be like to browse the web if it only contained women’s voices?

Jul 5, 2021

An experimental plug-in imagines a web without men. It can be, perhaps unsurprisingly, empty.

Mark Wilson wrote in the article that covered this experiment: “We’ve all heard about the gender gap, the fact that men make more money working the same jobs that women do, and that men have more of a voice in politics and in the media. Indeed, in my profession, 60% of bylines belong to men — and that figure jumps to nearly 70% if you look specifically at breaking news. But this phenomenon is mostly invisible.”

It’s hard to look at any publication and instantly understand how much of the content was written by women or men unless you’re in the mood to count bylines. When working for the U.K. data and design studio Normally, we developed the Gendered Web plug-in, a filter for news sites that makes every story written by men disappear, so that only the voices of women remain.

The experiment made it apparent that the contributions written by women make up a much smaller part of the overall writing in the newspaper. Sections now show vast blank areas, once occupied by male authors, peppered with the occasional female contribution.

This is not a quantitative measure — instead, it’s a way to ***see*** gender diversity on a given page at a glance.

We initially tested this idea with the New York Times (NYT), because the publication makes it relatively easy to identify the gender of their authors as their names are tagged directly into the code of the website.

The first time we used the tool and visited the Opinion section, this was what we saw: an almost completely blank page, with only one article written by a woman, entitled “The Strong and Stressed Black Woman.” At the time of putting this text together, on the 25th of November 2019, the only female voice in the Opinion section was one article entitled “I Give Thanks for the Matriarchs.” Is there a pattern in which women are mostly commissioned to write about women?

Subsequently, we expanded the tool to work on other publications where author attribution is more hidden.

A gif of the extension in use while browsing the New York Times. [Image: Normally]

Although the idea was relatively simple, the execution was loaded with social and technical complexity. In building this plug-in, we were faced with a number of unexpected decisions. Each of these decisions carried more influence and meaning than we expected. We outline the challenges and our responses below.

How we try to determine gender Filter

The plug-in analyses each article, extracts the byline (“By …”), and determines whether the author identifies as a woman.

We tried a number of approaches to determine the author’s gender. First, we used a simple list of first names compiled from US data, each marked with a gender. The performance was laughably poor. Many names are used across genders, and the list was composed almost exclusively of traditional US names.

We later found a much more useful list that did include multicultural names — but there were still outliers. There are people with one-of-a-kind names, people who use only their first initial, and of course, names (“Alex”) that are used across genders.
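A minimal sketch of the name-list approach helps show where it breaks. The table `NAME_GENDER` and the function `guessGenderFromByline` are hypothetical names invented for illustration, and the three entries stand in for the much larger lists described above.

```javascript
// Hypothetical, tiny name table; the lists described in the text were far larger.
const NAME_GENDER = new Map([
  ["maria", "woman"],
  ["david", "man"],
  ["alex", "ambiguous"], // used across genders
]);

// Returns "woman", "man", or "unknown" for a byline such as "By Maria Lopez".
function guessGenderFromByline(byline) {
  const name = byline.replace(/^by\s+/i, "").trim();
  const first = name.split(/\s+/)[0].toLowerCase();
  // A bare initial ("J." or "J") carries no usable signal.
  if (/^[a-z]\.?$/.test(first)) return "unknown";
  const guess = NAME_GENDER.get(first);
  if (guess === undefined || guess === "ambiguous") return "unknown";
  return guess;
}
```

Every initial, one-of-a-kind name, or cross-gender name comes back as "unknown", and a filter that hides anything not positively identified as "woman" would remove those authors from the page.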

It became clear to us that a ‘pretty-good-but-not-perfect’ success rate would not be good enough and that we would have to find a better gender identification method. Failing to correctly identify a writer as a woman would delete that woman’s voice from the page, the opposite of what we’re trying to do.

So for this tool, it was important not to get it wrong, even a small amount of the time. For this reason, we ended up manually creating a list of women authors. Every time the tool was run, it would output the authors it was going to hide.

This list was manually checked, one by one, to confirm whether each author identified as she/her by researching them on the wider web. Once we had a clear positive, the author was added to a “don’t hide” list.
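The review loop can be sketched roughly as follows. This is an illustration under assumptions, not the plug-in's actual code: the list entries are placeholders, and `normaliseName`, `partitionAuthors`, and this body for `isInList` are hypothetical.

```javascript
// Hypothetical hand-verified "don't hide" list; the names are placeholders.
const womenWriters = new Set(["Example Author One", "Example Author Two"]);

// Collapse stray whitespace so "  Example  Author One " matches the list entry.
function normaliseName(name) {
  return name.trim().replace(/\s+/g, " ");
}

// Assumed implementation of a list lookup; the original is not shown in the text.
function isInList(name, list) {
  return list.has(normaliseName(name));
}

// Split the authors seen on a page into those shown and those about to be hidden.
// The second group is the queue a person would research by hand.
function partitionAuthors(names, list) {
  const shown = [];
  const toReview = [];
  for (const name of new Set(names.map(normaliseName))) {
    (isInList(name, list) ? shown : toReview).push(name);
  }
  return { shown, toReview };
}
```

Printing `toReview` on each run corresponds to the output of authors the tool was going to hide. Only names confirmed by a person would move into `womenWriters`.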

This approach felt like a reasonable compromise to the problem. However, every time we ran the plug-in, we encountered a whole new list of previously unseen authors. In order for this tool to stay accurate, the list would need to be constantly updated. We had inadvertently created our own mini content-farm — recruiting the Normally team at lunchtimes, extracting new unseen authors, and adding them to a giant list! This is the real work of most AI systems — and part of the reason that we’re not publishing the plugin. It would be out of date in days.

What to do about multiple authors?

Clifford Krauss, David Yaffe-Bellany, and Mariana Simões

There are often multiple authors for an article. We decided that if at least one author was a woman, we would show it. We are aware that this was our decision to make — and that there was not a ‘wrong’ or ‘right’ one. If the article was attributed to an unnamed group (“The Editors”) then we would hide it.
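The rule above could be expressed along these lines. The helper names `splitByline` and `shouldShow` and the list contents are assumptions made for this sketch.

```javascript
// Hypothetical hand-verified list; the entry is a placeholder.
const womenWriters = new Set(["Author B"]);

// Split "A, B, and C" or "A and B" into individual names.
function splitByline(byline) {
  return byline
    .replace(/^by\s+/i, "")
    .split(/\s*,\s*(?:and\s+)?|\s+and\s+/i)
    .map((name) => name.trim())
    .filter((name) => name.length > 0);
}

// Show the article if at least one author is on the list.
// Group bylines such as "The Editors" are never on the list, so they are hidden.
function shouldShow(byline) {
  return splitByline(byline).some((name) => womenWriters.has(name));
}
```

With this list, `shouldShow("By Author A, Author B, and Author C")` is true, while `shouldShow("The Editors")` is false.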

On maintaining a list of women…

Is it weird having a list of women as a file in your codebase? The answer to this question is yes. Although all the data we used was public (i.e. author’s name, their bios, etc.), the difference is that we aggregated it. Any data in aggregate have greater potential impact, including greater potential for misuse. For this reason (and for the maintenance issue mentioned above) we have chosen not to publish the plugin.

How

Checking the coverage — Across a news site, the section pages (World, Politics, Opinion, etc.) often have different underlying page structures with which they display articles. This means that we needed a way to check that our code was covering all the different objects on a given page, and was not accidentally ignoring any we hadn’t seen before. To help with this, we built a special view that would highlight the parts of the page that it had tagged (green = woman), which gave us an at-a-glance view of any missing areas.
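A coverage view of this kind might look something like the sketch below. Only "green = woman" comes from the description above. The red outline for untagged elements, the grey outline for other tagged elements, and the names `tagAndHighlight` and `classify` are assumptions.

```javascript
// Hypothetical coverage check: every article card should end up with a tag.
// `cards` are DOM elements in the browser; any object with `dataset` and `style` works.
function tagAndHighlight(cards, classify) {
  const untagged = [];
  for (const card of cards) {
    const tag = classify(card); // "woman", "other", or null when no byline was found
    if (tag === null) {
      card.style.outline = "3px solid red"; // a page structure the code did not cover
      untagged.push(card);
    } else {
      card.dataset.genderedWebTag = tag;
      card.style.outline = tag === "woman" ? "3px solid green" : "3px solid grey";
    }
  }
  return untagged;
}
```

In a browser this could be called with something like `tagAndHighlight(document.querySelectorAll("article"), classify)`, where the `"article"` selector is itself a guess about the page markup. A non-empty return value points to parts of the page the code had not seen before.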

Re-running on page change — Many of the index pages change dynamically and add content on scroll and click, so we triggered the plugin to re-run on each page change.
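The text does not say how the re-run was triggered. One common way to do it, shown here purely as an assumption, is a debounced `MutationObserver` (a standard browser API). The names `debounce` and `watchForNewContent` and the 250 ms delay are illustrative.

```javascript
// Wrap `run` so that a burst of calls collapses into one call after `delayMs` of quiet.
function debounce(run, delayMs) {
  let timer = null;
  return () => {
    if (timer !== null) clearTimeout(timer);
    timer = setTimeout(() => {
      timer = null;
      run();
    }, delayMs);
  };
}

// Re-run the filter whenever nodes are added or removed under `root`.
// MutationObserver exists in browsers, not in Node.
function watchForNewContent(root, runFilter, delayMs = 250) {
  const observer = new MutationObserver(debounce(runFilter, delayMs));
  observer.observe(root, { childList: true, subtree: true });
  return observer; // call observer.disconnect() to stop watching
}
```

Because only `childList` changes are observed, a filter that hides articles by changing their style does not re-trigger itself. A filter that removes nodes would, so it would need its own guard.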

Requesting each article — Our first version of this plugin simply looked for the byline in the source of each index page. But many articles didn’t have the name in the source, only on the article page itself. So we updated the plugin to request the content of each page behind the scenes, and check the metadata for the byline. Since there might be 40–100 articles on a page, this meant that it would take some time to request and parse each article — but this inadvertently also created a mildly dramatic experience — content is slowly deleted from the browser as you watch!
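A rough sketch of that per-article lookup follows. It assumes the byline sits in a `<meta name="author">` tag, which varies by site, and that a missing byline means the article is hidden. The names `extractBylineFromHtml` and `filterCards`, and the injected `fetchHtml`, `shouldShow`, and `hide` callbacks, are hypothetical.

```javascript
// Pull the content of <meta name="author" ...> out of raw HTML (assumed tag name).
// A regex keeps the sketch runnable anywhere; it is not a full HTML parser.
function extractBylineFromHtml(html) {
  for (const [tag] of html.matchAll(/<meta\b[^>]*>/gi)) {
    const name = /\bname\s*=\s*(["'])(.*?)\1/i.exec(tag);
    const content = /\bcontent\s*=\s*(["'])(.*?)\1/i.exec(tag);
    if (name && content && name[2].toLowerCase() === "author") {
      return content[2];
    }
  }
  return null;
}

// Fetch the articles one at a time and hide each card once its byline is known.
// Cards with no byline are hidden too, which is an assumption of this sketch.
async function filterCards(cards, { fetchHtml, shouldShow, hide }) {
  for (const card of cards) {
    const html = await fetchHtml(card.url);
    const byline = extractBylineFromHtml(html);
    if (byline === null || !shouldShow(byline)) hide(card);
  }
}
```

In a browser, `fetchHtml` could be `(url) => fetch(url).then((response) => response.text())`. Processing the cards sequentially is what makes content vanish gradually rather than all at once.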

Naming the variables — Naming really matters, especially in code. It can make code more or less readable, but can also, in a project like this, construct your mental model of what you’re trying to achieve. These are the underlying primitives you’re using to construct your project and so they will affect your thinking. So should our list be a list of men which we hide? (This would have the effect of retaining non-binary identifying authors which might be desirable). Or should it be a list of women, and we show only those? Should the function be called “isNotMan” or “isWoman”? These subtleties may not matter as much in some projects, but for this one, we felt that the code needed to reflect our true intent. In this case — that was to highlight women’s voices.

So the code says:

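```javascript
// One result per byline author: whether that author is in the womenWriters list.
let authors = byline.map(item => {
  return isInList(item, womenWriters);
});
```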

In summary

Of course, gender inequality is a critical issue, and it is important that designers use their skills to identify and expose it wherever and whenever possible.

However, in our reading of this experiment, we saw that the ability to filter content by its bias is relevant beyond gender.

“In other words, the levers determining what we see are already there, and someone is pulling them whether we like it or not. The solution isn’t to filter out men or to filter out certain viewpoints, but to identify those levers of the manipulation machine and, when they’re in error, break them appropriately.” — Mark Wilson

Written by Alexandra Plesner