Short description of the project
We are building a diversity corpus (DisKo) as the basis for algorithmic text analysis. DisKo comprises narrative texts written in the last 100 years that feature not only male, female and neutral roles, but also descriptions of non-binary characters. The corpus serves as training material for a classifier that automatically analyses gender roles in German-language literary texts.
Project content
DisKo stands for Diversity Corpus (German: Diversitätskorpus) and is a project in the field of Computational Literary Studies. We use machine learning to train an algorithm that recognises not only female, male and neutral roles in literary texts, but also gender attributions that are less strictly binary. For this, we first need a training corpus that is as diverse as possible and consists of texts in which non-binary gender attributions occur. Following a public humanities approach, we involve as many different groups of readers as possible in compiling the corpus. The more diverse the people who contribute to the corpus, the more diverse the texts in our training corpus will ultimately be. This diversity is essential if our digital humanities approach is to make gender diversity visible.
The central component of the project is a survey in which you can submit your text suggestions. Do you know of literary works in which characters are described not stereotypically, but in a diverse way? Then take part in our survey via the following link, enter your text suggestions and help us build a diversity corpus: https://public.zenkit.com/f/klZHAjPGg/disko?v=xQoeCRIop
