This study presents the results of a large-scale comparison of various measures of pitch range and pitch variation in two Slavic (Bulgarian and Polish) and two Germanic (German and British English) languages. The productions of twenty-two speakers per language (eleven male and eleven female) in two different tasks (read passages and number sets) are compared. Significant differences between the language groups are found: German and English speakers use lower pitch maxima, a narrower pitch span, and generally less variable pitch than Bulgarian and Polish speakers. These findings support the hypothesis that linguistic communities tend to be characterized by particular pitch profiles.
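The abstract above does not say how its pitch measures were computed. As a minimal sketch, assuming a per-frame f0 track in Hz with unvoiced frames coded as 0, commonly used measures such as pitch maximum, pitch span in semitones, and pitch variability could be derived as follows. The function names and the choice of a max-to-min span are illustrative assumptions, not the study's method.

```python
import math

def semitones(f_high_hz, f_low_hz):
    """Interval between two frequencies in semitones (12 per octave)."""
    return 12.0 * math.log2(f_high_hz / f_low_hz)

def pitch_profile(f0_hz):
    """Summarise an f0 track (Hz); frames with f0 <= 0 are treated as unvoiced.

    Hypothetical helper: returns the pitch maximum and minimum in Hz, the
    max-to-min span in semitones, and the standard deviation in semitones
    relative to the speaker's mean f0.
    """
    voiced = sorted(f for f in f0_hz if f > 0)
    if len(voiced) < 2:
        raise ValueError("need at least two voiced frames")
    mean_hz = sum(voiced) / len(voiced)
    # Express every frame in semitones relative to the mean before taking the SD.
    st = [semitones(f, mean_hz) for f in voiced]
    st_mean = sum(st) / len(st)
    sd_st = math.sqrt(sum((s - st_mean) ** 2 for s in st) / (len(st) - 1))
    return {
        "max_hz": voiced[-1],
        "min_hz": voiced[0],
        "span_st": semitones(voiced[-1], voiced[0]),
        "sd_st": sd_st,
    }
```

Expressing span and variability in semitones rather than Hz makes values from male and female speakers easier to compare, which is one reason such a scale is often preferred for cross-speaker comparisons.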
Wolfgang von Kempelen's book "The Mechanism of Human Speech" from 1791 is a famous milestone in the history of speech communication research. It is of enormous relevance for the phonetic sciences and marks an important turning point in the development of (mechanical) speech synthesis. Until now, no English version of this work has been available, which has excluded many interested researchers. Access to the original versions in German and French is restricted for various reasons. For example, the blackletter script of the German version is troublesome for most of today's readers. We report here on a new edition of Kempelen's book which unites a more readable German version and its English translation. It will now also be in a searchable electronic format and has been enriched with many commentaries, which aid in the understanding of details of the late 18th century that are little known or unknown to many researchers today.
This article is devoted to several striking individual historical achievements in the field of mechanical speech synthesis, which remain fascinating today but are for the most part known only in broad outline. The selection presented here demonstrates both the captivating power of a concept of vocal excitation, once it had been recognised as feasible in principle, and the resulting originality of ever new approaches intended to help this synthesis principle achieve a technological breakthrough.
In mechanical speech synthesis from the 18th up to the 20th century, reed pipes were mainly used for the generation of the voice and the organ stop vox humana was central in this process. This has been described in different historical documents which report that the vox humana in some organs sounded like human vowels. In this study, tones of four different voces humanae were recorded to investigate their similarity to human vowels. The acoustical and perceptual analysis revealed that some, though not all, tones show a high similarity to selected vowels.
The paper reports on experiments with acoustic recordings of a self-built replica of the historic speaking machine of Wolfgang von Kempelen. Several options for the reed as the glottal excitation mechanism were tested. Perception tests with naïve listeners revealed that the machine-generated words 'mama' and 'papa' were partially recognised as an authentic child voice, as was also the case in von Kempelen's demonstrations in the late 18th century.
In mechanical speech synthesis, reed pipes were mainly used for the generation of the voice. The organ stop "vox humana" played a central role in this concept. Historical documents report that the "vox humana" sounded like human vowels. In this study, tones of four different "voces humanae" were recorded to investigate their similarity to human vowels. The acoustical and perceptual analysis revealed that some, though not all, tones show a high similarity to selected vowels.
In order to determine priorities for the improvement of timing in synthetic speech, this study looks at the role of segmental duration prediction and the role of phonological symbolic representation in the perceptual quality of a text-to-speech system. In perception experiments using German speech synthesis, two standard duration models (Klatt rules and CART) were tested. The input to these models consisted of a symbolic representation which was derived either from a database or from a text-to-speech system. Results of the perception experiments show that different duration models can only be distinguished when the symbolic representation is appropriate. Considering the relative importance of the symbolic representation, post-lexical segmental rules were investigated, with the outcome that listeners differ in their preferences regarding the degree of segmental reduction. In conclusion, before fine-tuning the duration prediction, it is important to derive an appropriate phonological symbolic representation in order to improve timing in synthetic speech.
In order to determine priorities for the improvement of timing in synthetic speech, this study looks at the role of segmental duration prediction and the role of phonological symbolic representation in listeners' preferences. In perception experiments using German speech synthesis, two standard duration models (Klatt rules and CART) were tested. The input to these models consisted of symbolic strings which were derived either from a database or from a text-to-speech system. Results of the perception experiments show that different duration models can only be distinguished when the symbolic string is appropriate. Considering the relative importance of the symbolic representation, "post-lexical" segmental rules were investigated, with the outcome that listeners differ in their preferences regarding the degree of segmental reduction. In conclusion, before fine-tuning the duration prediction, it is important to calculate an appropriate phonological symbolic representation in order to improve timing in synthetic speech.
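For readers unfamiliar with the Klatt rules named in the abstract above, the core of Klatt's rule-based duration model can be sketched as follows. Each segment has an inherent and a minimum duration, and context rules successively scale a percentage that shortens or lengthens the compressible part. The function name and all numeric values here are hypothetical illustrations. They are not the rule set or the parameters used in the study.

```python
def klatt_duration(inherent_ms, minimum_ms, rule_percents):
    """Segment duration following the general form of Klatt's model:

        DUR = MINDUR + (INHDUR - MINDUR) * PRCNT / 100

    rule_percents holds the percentage contributed by each applicable rule
    (illustrative values only); the rules combine multiplicatively.
    """
    prcnt = 100.0
    for p in rule_percents:
        prcnt = prcnt * p / 100.0
    return minimum_ms + (inherent_ms - minimum_ms) * prcnt / 100.0

# Hypothetical vowel: inherent 200 ms, minimum 100 ms.
# One lengthening rule (140 %) and one shortening rule (50 %) apply.
duration = klatt_duration(200, 100, [140, 50])
```

Because only the portion above the minimum duration is scaled, a segment can never be compressed below its minimum, however many shortening rules apply.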
Designing a Bilingual Speech Corpus for French and German Language Learners: a Two-Step Process
(2014)
We present the design of a corpus of native and non-native speech for the language pair French-German, with a special emphasis on phonetic and prosodic aspects. To our knowledge, there is no suitable corpus, in terms of size and coverage, currently available for the target language pair. To select the target L1-L2 interference phenomena, we prepared a small preliminary corpus (corpus1), which was analyzed for coverage and cross-checked jointly by French and German experts. Based on this analysis, target phenomena on the phonetic and phonological level were selected on the basis of the expected degree of deviation from native performance and the frequency of occurrence. 14 speakers performed both L2 (either French or German) and L1 material (either German or French). This allowed us to test the recording duration, the recording material, and the performance of our automatic aligner software. We then built corpus2, taking into account what we had learned from corpus1. The aims are the same, but we adapted the speech material to avoid overly long recording sessions. 100 speakers will be recorded. The corpus (corpus1 and corpus2) will be prepared as a searchable database, available to the scientific community after completion of the project.
The Perceptual Effect of L1 Prosody Transplantation on L2 Speech: The Case of French Accented German
(2016)
Research has shown that language learners are not only challenged by segmental differences between their native language (L1) and the second language (L2). They also have problems with the correct production of suprasegmental structures, such as phone and syllable duration and the realization of pitch. These difficulties often lead to a perceptible foreign accent. This study investigates the influence of prosody transplantation on foreign accent ratings. Syllable duration and pitch contour were transferred from utterances of a male and a female German native speaker to utterances of ten French native speakers speaking German. Acoustic measurements show that the French learners spoke at a significantly lower speaking rate. As expected, results of a perception experiment judging the accentedness of 1) German native utterances, 2) unmanipulated and 3) manipulated utterances of French learners of German suggest that the transplantation of the prosodic features syllable duration and pitch leads to a decrease in accentedness ratings. These findings confirm results found in similar studies investigating prosody transplantation with different L1s and L2s and provide a beneficial technique for (computer-assisted) pronunciation training.
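The abstract above does not describe the signal processing behind the duration transfer. As a minimal sketch, assuming both utterances have already been aligned syllable by syllable, the time-scaling factors for transplanting native durations onto a learner's utterance could be computed as below. The function name and the example durations are hypothetical. Applying the factors to audio would require a separate resynthesis tool, which is not shown.

```python
def duration_factors(learner_ms, native_ms):
    """Per-syllable time-scaling factors for duration transplantation.

    learner_ms and native_ms are syllable durations (ms) of the same sentence,
    aligned one-to-one. A factor below 1 means the learner's syllable must be
    shortened to match the native speaker, and above 1 means lengthened.
    """
    if len(learner_ms) != len(native_ms):
        raise ValueError("utterances must have the same number of syllables")
    if any(d <= 0 for d in learner_ms):
        raise ValueError("learner durations must be positive")
    return [n / l for l, n in zip(learner_ms, native_ms)]

# Hypothetical three-syllable utterance: the learner speaks more slowly.
factors = duration_factors([240, 300, 400], [180, 150, 300])
```

Factors below 1 throughout, as in this made-up example, would correspond to a learner whose speaking rate is lower than the native speaker's.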