Sprache im 20. Jahrhundert. Gegenwartssprache
This paper is concerned with a novel methodology for generating phonetic questions used in tree-based state tying for speech recognition. In order to implement a speech recognition system, language-dependent knowledge going beyond the annotated material is usually required. The approach presented here generates phonetic questions for decision trees based on a feature table that summarizes the articulatory characteristics of each sound. On the one hand, this method allows better language-specific triphone models to be defined given only a feature table as linguistic input. On the other hand, the feature-table approach facilitates the efficient definition of triphone models for other languages, since again only a feature table for that language is required. The approach is exemplified with speech recognition systems for English and Thai.
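The core idea of the abstract above can be sketched in a few lines: given an articulatory feature table, every feature/value pair yields one phonetic question, i.e. the set of phones sharing that property. The phone inventory and feature names below are invented for illustration, not the paper's actual data.

```python
# Toy articulatory feature table: phone -> {feature: value}.
# Phones and features are illustrative placeholders.
FEATURE_TABLE = {
    "p": {"place": "bilabial", "manner": "stop", "voiced": False},
    "b": {"place": "bilabial", "manner": "stop", "voiced": True},
    "t": {"place": "alveolar", "manner": "stop", "voiced": False},
    "d": {"place": "alveolar", "manner": "stop", "voiced": True},
    "s": {"place": "alveolar", "manner": "fricative", "voiced": False},
    "m": {"place": "bilabial", "manner": "nasal", "voiced": True},
}

def generate_questions(table):
    """Derive one phonetic question (a phone set) per feature/value pair."""
    questions = {}
    for phone, feats in table.items():
        for feat, value in feats.items():
            questions.setdefault((feat, value), set()).add(phone)
    return questions

questions = generate_questions(FEATURE_TABLE)
# questions[("manner", "stop")] groups all stops into one question,
# usable as a split candidate in decision-tree state tying.
```

Porting the system to a new language would then only require supplying a new feature table, as the abstract notes.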
This paper outlines the generation process of a specific computational linguistic representation termed the Multilingual Time Map, conceptually a multi-tape finite state transducer encoding linguistic data at different levels of granularity. The first component acquires phonological data from syllable-labeled speech data, the second component defines feature profiles, the third component generates feature hierarchies and augments the acquired data with the defined feature profiles, and the fourth component displays the Multilingual Time Map as a graph.
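The four-stage pipeline described above can be illustrated with a minimal sketch; the syllable data, feature profiles, and edge-list graph encoding are invented here for illustration and stand in for the paper's actual transducer representation.

```python
# 1) "Acquired" syllable-labelled phonological data (toy example).
syllables = [("ban", ["b", "a", "n"])]

# 2) Feature profiles per phone (illustrative placeholders).
profiles = {
    "b": {"voiced": True, "manner": "stop"},
    "a": {"voiced": True, "manner": "vowel"},
    "n": {"voiced": True, "manner": "nasal"},
}

# 3) Augment the acquired data with the profiles, building a simple
#    hierarchy syllable -> phone -> feature as graph edges.
def build_graph(syllables, profiles):
    edges = []
    for syl, phones in syllables:
        for ph in phones:
            edges.append((syl, ph))
            for feat, val in profiles.get(ph, {}).items():
                edges.append((ph, f"{feat}={val}"))
    return edges

# 4) "Display" the map as a graph: here simply its edge list.
graph = build_graph(syllables, profiles)
```

Each tape of the real transducer would correspond to one level of this hierarchy (syllable, phone, feature), time-aligned rather than merely nested.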
As can be shown for English data, the assimilation of the alveolar stop can result from an increased gestural overlap with the following oral closure gesture. Our experiment with German synthetic speech showed similar results. Furthermore, it suggests that it is necessary to complete the gestural specification of the glottal state: a voiced stop should be represented not only by an oral gesture, but by a glottal one as well.
The aim of this paper is to highlight the actual need for corpora that have been annotated on the basis of acoustic information. This acoustic information should be coded in features or properties and is needed to inform further processing systems, e.g. to provide a basis for a speech recognition system using linguistic information. Feature annotation of existing corpora, in combination with segmental annotation, can provide powerful training material for speech recognition systems, but will also challenge the further processing of features into segments and syllables. We present here the theoretical preliminaries for the multilingual feature extraction system we are currently working on.
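The combination of segmental and feature annotation described above can be sketched as two time-aligned tiers, where each segment collects the acoustic features whose spans overlap it. All times, labels, and feature names below are invented for illustration.

```python
# Segmental tier: (start_s, end_s, phone_label).
segments = [
    (0.00, 0.08, "d"),
    (0.08, 0.20, "a"),
]

# Feature tier: (start_s, end_s, feature, value). Feature spans need not
# coincide with segment boundaries -- that mismatch is exactly the
# "features to segments" challenge the abstract mentions.
features = [
    (0.00, 0.08, "voiced", True),
    (0.00, 0.06, "manner", "stop"),
    (0.08, 0.20, "manner", "vowel"),
]

def features_for_segment(seg, feats):
    """Collect all features whose time span overlaps the segment's span."""
    s0, s1, _label = seg
    return {f: v for (f0, f1, f, v) in feats if f0 < s1 and f1 > s0}

d_feats = features_for_segment(segments[0], features)
```

Training material of this shape pairs every segment with its acoustically grounded feature bundle, which is the kind of corpus the paper argues for.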