Von der Tonbandaufnahme zur integrierten Text-Ton-Datenbank. Instrumente für die Arbeit mit Gesprächskorpora
- Developing tools for the computer-assisted transcription and analysis of large speech corpora is a main focus at the Institute of German Language (IDS) and the Institute of Natural Language Processing (IMS). Corpora of natural spoken dialogue have been transcribed, and the analogue recordings of these conversations have been digitized. An automatic segmentation system based on Hidden Markov Models is employed: the orthographic representation of the speech signal is first transformed into a phonetic transcription, the phonetic transcription is transformed into a system-internal representation, and the text is then time-aligned with the speech signal. The article also describes the retrieval software Cosmas II and its special features for searching discourse transcripts and playing back time-aligned passages.
Author: | Franck Bodmer Mory, Marcus L. Fach, Rudolf Schmidt, Wilfried Schütte |
---|---|
URN: | urn:nbn:de:bsz:mh39-43256 |
ISBN: | 3-8233-5436-1 |
Parent Title (German): | Romanistische Korpuslinguistik: Korpora und gesprochene Sprache |
Series (Serial Number): | ScriptOralia (126) |
Publisher: | Narr |
Place of publication: | Tübingen |
Editor: | Claus D. Pusch, Wolfgang Raible |
Document Type: | Part of a Book |
Language: | German |
Year of first Publication: | 2002 |
Date of Publication (online): | 2015/11/02 |
Publication state: | Published version |
Review state: | Publisher's editorial review |
GND Keyword: | Automatische Spracherkennung; Konversationsanalyse; Korpus <Linguistik>; Transkription |
First Page: | 209 |
Last Page: | 243 |
DDC classes: | 400 Language / 410 Linguistics |
Open Access?: | yes |
BDSL-Classification: | Language in the 20th century. Contemporary language |
Leibniz-Classification: | Language, linguistics |
Linguistics-Classification: | Conversation research / Spoken language |
Licence: | Protected by copyright |