Evaluating Workflows for Creating Orthographic Transcripts for Oral Corpora by Transcribing from Scratch or Correcting ASR-Output

Research projects that incorporate spoken data either select from existing speech corpora or plan to record new data. In both cases, recordings need to be transcribed to make them accessible for analysis. Underestimating the effort of transcribing can be risky. Automatic Speech Recognition (ASR) holds the promise of considerably reducing transcription effort. However, few studies have so far attempted to evaluate this potential. The present paper compares the effort of manual transcription with that of correcting ASR output. We took recordings from corpora of varying settings (interview, colloquial talk, dialectal, historic) and (i) compared two methods for creating orthographic transcripts, transcribing from scratch vs. correcting automatically created transcripts, and (ii) evaluated the influence of the corpus characteristics on correction efficiency. Results suggest that for the selected data and transcription conventions, transcribing and correcting still take equally long, at 7 times real-time on average. The more complex the primary data, the more time has to be spent on corrections. Despite the impressive latest developments in speech technology, to be a real help for conversation analysts or dialectologists, ASR systems seem to require even more improvement, or we need sufficient and appropriate data for training such systems.
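The figure "7 times real-time" can be read as a ratio of working time to recording duration: one minute of audio costs about seven minutes of transcription or correction work. The following is a minimal sketch of that arithmetic, not the authors' evaluation code. The function names and the session values are hypothetical and chosen only for illustration, and the unweighted mean over sessions is an assumption, since the abstract does not state how the average was formed.

```python
from typing import Iterable, Tuple


def real_time_factor(work_seconds: float, audio_seconds: float) -> float:
    """Return working time as a multiple of the recording duration."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return work_seconds / audio_seconds


def mean_real_time_factor(sessions: Iterable[Tuple[float, float]]) -> float:
    """Unweighted mean of per-session factors (an assumed averaging scheme)."""
    factors = [real_time_factor(work, audio) for work, audio in sessions]
    if not factors:
        raise ValueError("at least one session is required")
    return sum(factors) / len(factors)


# Hypothetical (work_seconds, audio_seconds) pairs, NOT data from the paper.
sessions = [(4200.0, 600.0), (3000.0, 500.0), (7200.0, 900.0)]

if __name__ == "__main__":
    for work, audio in sessions:
        print(f"{audio:.0f} s audio -> factor {real_time_factor(work, audio):.1f}")
    print(f"mean factor: {mean_real_time_factor(sessions):.1f}")
```

Computing the same factor separately for the from-scratch and the correction workflow, on the same recordings, would give the kind of comparison the abstract describes.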

Author: Jan Gorisch, Thomas Schmidt
Parent Title (English): Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
Publisher: ELRA Language Resource Association
Place of publication: Paris
Editor: Nicoletta Calzolari, Min-Yen Kan, Veronique Hoste, Alessandro Lenci, Sakriani Sakti, Nianwen Xue
Document Type: Conference Proceeding
Year of first Publication: 2024
Date of Publication (online): 2024/06/04
Publishing Institution: Leibniz-Institut für Deutsche Sprache (IDS)
Tag: ASR-correction; automatic transcription; corpus curation; oral corpora; spoken German
GND Keyword: Automatic speech recognition; German; Spoken language; Corpus <Linguistics>
First Page: 6564
Last Page: 6574
DDC classes: 400 Language / 400 Language, Linguistics
Open Access?: yes
BDSL-Classification: Lexicography, Dictionaries
Leibniz-Classification: Language, Linguistics
Program areas: Pragmatics
Licence (German): Creative Commons - CC BY-NC - Attribution - NonCommercial 4.0 International