CATMA

Subject area

Literary Studies

Project category

Project period

Project start: 11/2008 - Project end: 03/2026

Short description of the project

Annotate, analyze, interpret and visualize: in CATMA, text scholars can work in the way that best suits their research questions, whether qualitative or quantitative, bottom-up and exploratory or descriptive and taxonomy-based, individually or in a team. The web-based open source tool is free of charge and also lets users export their own data, for example to process it further with other tools.
The tool has been under continuous development since 2008. CATMA 7 was developed within the forTEXT Portal project (https://fortext.net), which is funded by the Stiftung Innovation in der Hochschullehre (StIL), and was released in May 2023.

Project content

CATMA is undogmatic, open source, free of charge and web-based. The annotation tool is project-centered and lets you work with texts in all common file formats and in most languages, including languages that are written from right to left. Users can upload their own corpus, share it with a team and work on it collaboratively. Any number of projects can be created.

In addition to the "Project" module, there are three main modules in CATMA: "Annotate", "Tags" and "Analyze":

"Annotate": Annotations are fundamental to all humanities research. They are partly declarative and taxonomy-driven, but often also interpretative or explorative in nature. CATMA does not prescribe anything in this respect and allows texts to be annotated as undogmatically as required: multiple, overlapping, short, extensive, discontinuous or even in contradictory variants - CATMA annotations can take many forms. A comment feature lets users describe text passages or annotations and record remarks on them. It can also be used to communicate directly with the team.
"Tags": CATMA allows you to develop your own tagsets or to import an existing one. Regardless of whether the work is taxonomy-based or categories are to be created spontaneously: CATMA supports both approaches. Your own text interpretation can be developed bottom-up or top-down against the background of a predefined theory and taxonomy.
"Analyze": In this quantitative module, CATMA offers guided support for the creation of text analysis queries as well as a selection of interactive visualizations (KeyWord in Context (KWIC), Word or Tag Cloud, Distribution Graph and Double Tree). Each visualization offers a direct path back to the source text.
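
To illustrate what a KeyWord in Context (KWIC) view shows, here is a minimal Python sketch. It is not CATMA's implementation and does not use CATMA's query language; the function name `kwic`, the simple word tokenization and the sample sentence are assumptions made for this example only.

```python
import re

def kwic(text, keyword, window=3):
    """Return (left context, match, right context) for each keyword hit.

    Hypothetical helper for illustration; tokenization is a plain
    word-character split and matching is case-insensitive.
    """
    tokens = re.findall(r"\w+", text)
    hits = []
    for i, token in enumerate(tokens):
        if token.lower() == keyword.lower():
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            hits.append((left, token, right))
    return hits

# Made-up sample text, not taken from any CATMA corpus.
sample = "The cat sat on the mat. A second cat watched the first cat sleep."

for left, match, right in kwic(sample, "cat", window=2):
    # Right-align the left context so the keyword forms a column.
    print(f"{left:>12} [{match}] {right}")
```

Each hit keeps its position in the token list implicit here; a tool that links results back to the source text would also store the character offsets of every match.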

Texts, annotations, query results and visualizations can be exported in various formats. In addition, access to the "raw" data is guaranteed at all times.

The Python package GitMA, which was first published in 2021 and can be seen as an extension of CATMA, uses the above-mentioned access to the raw data to enable further processing, analysis and manipulation of this data with the help of common Python data science tools. In particular, this includes functions that enable the (visual) comparison of annotations from different annotators, the calculation of inter-annotator agreement scores and the creation of gold annotations.
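
As a rough illustration of what an inter-annotator agreement score measures, the sketch below computes Cohen's kappa for two annotators in plain Python. It does not use GitMA's API, and GitMA may compute agreement differently; the function name `cohens_kappa`, the tag names and the label sequences are hypothetical and assume that both annotators labeled the same sequence of text units.

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two equally long label sequences.

    Hypothetical helper for illustration, not part of GitMA.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(labels_a)
    # Observed agreement: share of units with identical labels.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected chance agreement from each annotator's label frequencies.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        # Both annotators used a single identical label throughout.
        return 1.0
    return (observed - expected) / (1 - expected)

# Made-up tags for six text units, one list per annotator.
annotator_a = ["narrator", "character", "narrator", "none", "character", "narrator"]
annotator_b = ["narrator", "character", "none", "none", "character", "character"]

kappa = cohens_kappa(annotator_a, annotator_b)
print(f"Cohen's kappa: {kappa:.2f}")  # prints "Cohen's kappa: 0.52"
```

Kappa corrects raw agreement (here 4 of 6 units) for the agreement expected by chance, which is why it is lower than the raw share. Annotations that overlap or differ in span boundaries, as CATMA permits, need an additional alignment step before a unit-based score like this can be applied.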

The associated CATMA website contains FAQs, tutorials, a manual, a glossary and technical documentation, as well as contact information for user support.

Sponsor

Stiftung Innovation in der Hochschullehre (Foundation for Innovation in University Teaching), Deutsche Forschungsgemeinschaft (German Research Foundation)

Contact

Malte Meister
Technische Universität Darmstadt
Institut für Sprach- und Literaturwissenschaft
Residenzschloss 1
64283 Darmstadt
E-Mail: team@catma.de

Find out more at

https://catma.de

Project team members

Evelyn Gius, Malte Meister, Mari Akazawa and Dominik Gerstorfer
