Towards a new level of annotation detail of multilingual speech corpora

The aim of this paper is to highlight the current need for corpora that are annotated on the basis of acoustic information. This acoustic information should be coded as features or properties and is needed to inform further processing systems, that is, to provide a basis for a speech recognition system that uses linguistic information. Feature annotation of existing corpora, combined with segmental annotation, can provide powerful training material for speech recognition systems, but it will also pose challenges for the further processing of features into segments and syllables. We present the theoretical preliminaries for the multilingual feature extraction system that we are currently developing.

Metadata
Author:	Anja Geumann
URN:	urn:nbn:de:bsz:mh39-57020
URL:	http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.385.5197&rep=rep1&type=pdf
Parent Title (English):	Proceedings of the 8th International Conference on Spoken Language Processing, Interspeech, Jeju, South Korea, 2004
Document Type:	Conference Proceeding
Language:	English
Year of first Publication:	2004
Date of Publication (online):	2016/12/13
GND Keyword:	Annotation; Automatic speech recognition; Spoken language; Corpus <Linguistics>; Phonetics
First Page:	1096
Last Page:	1099
DDC classes:	400 Language / 400 Language, Linguistics
Open Access?:	yes
BDSL-Classification:	Language in the 20th century. Contemporary language
Linguistics-Classification:	Corpus linguistics
Linguistics-Classification:	Phonetics / Phonology
Licence (German):	Protected by copyright
