Project DisKo

DisKo: Development of a diversity corpus (DisKo) as a basis for algorithmic text analysis

Subject area

Computational Literary Studies

Project category

Corpus

Project period

04/2022 – ongoing

Short description of the project

We are building a diversity corpus (DisKo) as a basis for algorithmic text analysis. DisKo comprises narrative texts written in the last 100 years that feature not only male, female and neutral roles but also descriptions of non-binary characters. The corpus serves as training material for a classifier that automatically analyses gender roles in German-language literary texts.

Project content

DisKo stands for Diversity Corpus (German: Diversitätskorpus) and is a project in the field of Computational Literary Studies. We use machine learning to train an algorithm that recognises not only female, male and neutral roles in literary texts but also gender attributions that go beyond the binary. To do this, we first need a training corpus that is as diverse as possible and consists of texts in which non-binary gender attributions occur. Following a public humanities approach, we involve as many different groups of readers as possible in compiling the corpus: the more diverse the people who contribute, the more diverse the texts in our training corpus will be. This diversity is essential if our digital humanities approach is to make gender diversity visible.

The central component of the project is a survey in which you can submit text suggestions. Do you know of literary works in which characters are described in a diverse rather than stereotypical way? Then take part in our survey via the link, enter your suggestions and help us build the diversity corpus: https://public.zenkit.com/f/klZHAjPGg/disko?v=xQoeCRIop

Sponsor

German National Library

Contact

Mareike.schumacher@ilw.uni-stuttgart.de

Marie.flueh@uni-hamburg.de

Find out more at
msternchenw.de/

Project team members

Mareike Schumacher (Universität Stuttgart), Marie Flüh (Universität Hamburg)

