To determine priorities for improving timing in synthetic speech, this study examines the roles of segmental duration prediction and of phonological symbolic representation in the perceptual quality of a text-to-speech system. In perception experiments using German speech synthesis, two standard duration models (Klatt rules and CART) were tested. The input to these models was a symbolic representation derived either from a database or from a text-to-speech system. The results show that different duration models can be distinguished only when the symbolic representation is appropriate. Given the relative importance of the symbolic representation, post-lexical segmental rules were also investigated, with the outcome that listeners differ in their preferences regarding the degree of segmental reduction. In conclusion, to improve timing in synthetic speech, it is important to derive an appropriate phonological symbolic representation before fine-tuning the duration prediction.
If more than one articulator is involved in the execution of a phonetic task, then the individual articulators have to be temporally coordinated with each other in a lawful manner. The present study aims at analyzing tongue-jaw cohesion in the temporal domain for the German coronal consonants /s, ʃ, t, d, n, l/, i.e., consonants produced with the same set of articulators (the tongue blade and the jaw) but differing in manner of articulation. The stability of the obtained interaction patterns is evaluated by varying the degree of vocal effort: comfortable and loud. Tongue and jaw movements of five speakers of German were recorded by means of electromagnetic midsagittal articulography (EMMA) during /aCa/ sequences. The results indicate that (1) tongue-jaw coordination varies with manner of articulation, i.e., a later onset and offset of the jaw target for the stops compared to the fricatives, the nasal and the lateral; (2) the obtained patterns are stable across vocal effort conditions; (3) the sibilants are produced with smaller standard deviations for latencies and target positions; and (4) adjustments to the lower jaw positions during the surrounding vowels in loud speech occur during the closing and opening movement intervals and not the consonantal target phases.