Automatic recognition of speech, thought, and writing representation in German narrative texts

  • This article presents the main results of a project that explored ways to automatically recognize and classify speech, thought, and writing representation (ST&WR), a narrative feature, using surface information and methods from computational linguistics. The task was to detect and distinguish four types (direct, free indirect, indirect, and reported ST&WR) in a corpus of manually annotated German narrative texts. Both rule-based and machine-learning methods were tested and compared. The results were best for recognizing direct ST&WR (best F1 score: 0.87), followed by indirect (0.71), reported (0.58), and finally free indirect ST&WR (0.40). The rule-based approach worked best for ST&WR types with clear patterns, such as indirect and marked direct ST&WR, and often gave the most accurate results. Machine learning was most successful for types without clear indicators, such as free indirect ST&WR, and proved more stable. For the percentage of ST&WR in a text, the results of the machine-learning methods always correlated best with the results of manual annotation. Creating a union or intersection of the results of the two approaches did not lead to striking improvements. A stricter definition of ST&WR, which excluded borderline cases, made the task harder and led to worse results for both approaches.
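
As an illustration of the evaluation terms used in the abstract, the following minimal sketch computes an F1 score for two sets of predictions and for their union and intersection. It is not the project's code: the function name `f1_score`, the toy data, and the choice of sentence IDs as the unit of evaluation are all assumptions made for this example.

```python
def f1_score(predicted: set, gold: set) -> float:
    """Return F1, the harmonic mean of precision and recall, for two sets of IDs."""
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data (not from the article): IDs of sentences
# marked as containing one ST&WR type.
gold = {1, 2, 3, 4, 5}                  # manual annotation
rule_based = {1, 2, 3, 9}               # output of a rule-based recognizer
machine_learning = {2, 3, 4, 5, 8, 9}   # output of a trained classifier

# Combining the two approaches: union keeps every hit from either system,
# intersection keeps only the hits on which both systems agree.
combined_union = rule_based | machine_learning
combined_intersection = rule_based & machine_learning

scores = {
    "rule-based": f1_score(rule_based, gold),
    "machine learning": f1_score(machine_learning, gold),
    "union": f1_score(combined_union, gold),
    "intersection": f1_score(combined_intersection, gold),
}

for name, score in scores.items():
    print(f"{name}: F1 = {score:.2f}")
```

A union tends to raise recall at the cost of precision, and an intersection does the opposite, so whether either combination improves F1 depends on how the errors of the two systems overlap. The toy numbers here say nothing about the article's actual results.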

Author:Annelen Brunner
Parent Title (English):Literary and Linguistic Computing
Document Type:Article
Year of first Publication:2013
Date of Publication (online):2015/07/30
Tag:Direct speech; Automatic recognition of speech; German; Indirect speech; Prose
GND Keyword:Automatische Spracherkennung; Deutsch; Direkte Rede; Indirekte Rede; Prosa
First Page:563
Last Page:575
This publication is freely accessible with the permission of the rights owner under an Alliance licence or a national licence funded by the DFG (German Research Foundation).
DDC classes:400 Language / 430 German
Open Access?:yes
Leibniz-Classification:Language, Linguistics
Licence (German):Protected by copyright