
Women of category: analysis of the gender bias in the organisation of knowledge systems

The Xarxa Vives d’Universitats, which represents and coordinates the joint action of 22 universities in the Catalan-speaking territories, asked the scientific team of the Women and Wikipedia project (PID2020116936RA-I00) to produce a report comprising a technical diagnosis, an academic position and a proposal for improving Wikipedia's knowledge organisation system.

The need was raised through the Wikipedia user group Viquidones, in a request addressed to the Equality Unit of Pompeu Fabra University and prompted by concern about gender bias in the contents of the collaboratively written encyclopaedia and in its knowledge organisation system, which prohibits the use of the categories “women” and “non-binary people” to classify contents. This prohibition applies only in the Catalan and Italian Wikipedias. The French Wikipedia, like those in other languages that mark gender grammatically, allows categories for women to be created. As a result, as Viquidones has consistently pointed out, the Catalan version files feminised professions and topics under masculine categories such as “Infermers” (Nurses), “Nadadors de sincronitzada” (Synchronised swimmers) or “Morts per càncer d’úter” (Deaths due to uterine cancer).

In response, the Women and Wikipedia project, funded by the Ministry of Science and Innovation and the Spanish Research Agency and led by principal investigator Núria Ferran, professor at FIMA, produced a report with the direct collaboration of other team members, especially Professor Miquel Centelles. The report has just been published jointly by the Xarxa Vives d’Universitats and the Centre for Research in Information, Communication and Culture of the University of Barcelona.

To prepare the report, the team reviewed the existing literature and then worked along two lines. On the one hand, it analysed Wikipedia, where categories that identify different genders have not been accepted. The arguments used during the community's three deliberations and votes were examined in order to understand the reasoning behind not allowing women to be made visible through categories, a resource for browsing and retrieving information. On the other hand, as a solution, the report proposes using another Wikimedia project, Wikidata, which is a formal representation of the knowledge graph implicit in the content of Wikipedia pages. Both knowledge organisation systems, Wikipedia and Wikidata, were diagnosed through techniques such as user testing, heuristic evaluations, inspection of standards and interviews with experts. The report also studies the alternative that the Wikipedia community adopted after rejecting categories by gender: a search engine implemented on some Wikipedia pages.

After analysing both Wikipedia and Wikidata, the report draws conclusions, from an academic point of view, about the presence or absence of gender bias in each. It proposes that Wikidata serve as Wikipedia's knowledge organisation system, since Wikidata does not increase the gender bias already present in society.

In addition, Wikidata's property for gender identity already labels 81.93% of the entities of type human. There are also successful precedents for linking Wikidata and Wikipedia, such as infoboxes and authority records. Finally, Wikidata provides tools that facilitate data editing and, in turn, promote the incorporation of the gender perspective, because this property accommodates greater gender diversity and makes visible identities such as agender, female, male, intersex, non-binary, transgender female and transgender male, among others.
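As an illustration of how a Wikidata property could stand in for gendered categories, the following minimal sketch builds a SPARQL query that selects people by occupation and gender. It is not taken from the report: the function name `build_gender_query` is hypothetical, and the sketch assumes the standard Wikidata identifiers P31 (instance of), Q5 (human), P106 (occupation) and P21 (sex or gender), along with the `wd:` and `wdt:` prefixes predefined by the Wikidata Query Service. The occupation item must be looked up by the caller.

```python
import re

# Hypothetical helper, not part of the report: builds a SPARQL query that
# retrieves humans by occupation and gender from Wikidata.
_QID = re.compile(r"Q[1-9][0-9]*")


def build_gender_query(occupation_qid: str, gender_qid: str, limit: int = 100) -> str:
    """Return a SPARQL query string for the Wikidata Query Service.

    Assumed identifiers: P31 = instance of, Q5 = human,
    P106 = occupation, P21 = sex or gender.
    """
    # Reject anything that is not a plain item ID, so that arbitrary
    # text cannot be injected into the query.
    for qid in (occupation_qid, gender_qid):
        if not _QID.fullmatch(qid):
            raise ValueError(f"not a Wikidata item ID: {qid!r}")
    if limit < 1:
        raise ValueError("limit must be positive")
    return (
        "SELECT ?person WHERE {\n"
        "  ?person wdt:P31 wd:Q5 ;\n"
        f"          wdt:P106 wd:{occupation_qid} ;\n"
        f"          wdt:P21 wd:{gender_qid} .\n"
        "}\n"
        f"LIMIT {limit}"
    )
```

The resulting string can be pasted into the Wikidata Query Service. Because gender is a property value rather than part of a category name, the same query shape would serve any of the identities listed above by changing only `gender_qid`.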

The report therefore proposes a knowledge organisation system that addresses two fundamental aspects of the problem of gender bias in Wikipedia. The first is technical: the system should make it possible to retrieve content according to users' information needs, which makes different gender identities visible along with any other aspect needed for retrieval. The second is socio-cultural: the academic and technical arguments provided in the report can inform community governance processes and support the evolution of Wikipedian culture towards more inclusive values, facilitating access to, and the visibility of, minoritised contents and groups.

The full report is available in open access at:

Ferran-Ferrer, Núria; Centelles, Miquel; Macià, Yessica; Boté-Vericad, Juan-José; Minguillón, Julià (2024). Dones de categoria: anàlisi del biaix de gènere a les categories de Viquipèdia. Xarxa Vives d’Universitats; Centre de Recerca en Informació, Comunicació i Cultura (CRICC), Universitat de Barcelona. http://hdl.handle.net/2445/208281
