Scientific interest in von Kempelen's 'speaking machine' stems mainly from a general interest in the history of science. This study, however, is devoted to the question of what relevance the 'speaking machine' has today. Apart from discussing why it fascinates researchers and non-researchers alike, we describe the potential of replicas as an instrument for demonstration and for researching speech generation.
The paper reports on experiments with acoustic recordings of a self-built replica of the historic speaking machine of Wolfgang von Kempelen. Several variants of the reed as the glottal excitation mechanism were tested. Perception tests with naïve listeners revealed that the machine-generated words 'mama' and 'papa' were partially recognised as an authentic child voice, as was also the case in von Kempelen's demonstrations in the late 18th century.
This article presents preliminary results indicating that speakers use a different pitch range when they speak a foreign language than when they speak their native language. To this end, a learner corpus with French and German speakers was analyzed. Results suggest that speakers indeed produce a smaller pitch range in their respective L2. This is true for both groups of native speakers. A possible explanation for this finding is that speakers are less confident in their productions; they therefore concentrate more on segments and words and consequently refrain from realizing a more native-like pitch range. For language teaching, the results suggest that learners should be trained extensively on the more pronounced use of pitch in the foreign language.
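As a rough illustration of how a pitch range could be compared across L1 and L2 recordings, the sketch below expresses the span between the lowest and highest voiced F0 value in semitones. The abstract does not state which measure the study used; the function name, the max/min definition, and the convention that unvoiced frames are coded as 0 Hz are all assumptions made for this example.

```python
import math


def pitch_range_semitones(f0_values_hz):
    """Return the span between lowest and highest voiced F0 in semitones.

    Assumption: unvoiced frames are coded as 0 (or negative) and are skipped.
    """
    voiced = [f for f in f0_values_hz if f > 0]
    if len(voiced) < 2:
        raise ValueError("need at least two voiced F0 values")
    # One octave (a doubling of frequency) corresponds to 12 semitones.
    return 12 * math.log2(max(voiced) / min(voiced))


# Hypothetical F0 tracks in Hz (not data from the study).
l1_track = [0, 100, 150, 200, 0]
l2_track = [0, 100, 120, 0]
print(pitch_range_semitones(l1_track))
print(pitch_range_semitones(l2_track))
```

A semitone scale is used here because it makes ranges comparable across speakers with different baseline pitch; a robust variant might use percentiles instead of the raw minimum and maximum.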
In order to determine priorities for the improvement of timing in synthetic speech, this study looks at the role of segmental duration prediction and the role of phonological symbolic representation in the perceptual quality of a text-to-speech system. In perception experiments using German speech synthesis, two standard duration models (Klatt rules and CART) were tested. The input to these models consisted of a symbolic representation which was derived either from a database or from a text-to-speech system. Results of the perception experiments show that different duration models can only be distinguished when the symbolic representation is appropriate. Considering the relative importance of the symbolic representation, post-lexical segmental rules were investigated with the outcome that listeners differ in their preferences regarding the degree of segmental reduction. In conclusion, before fine-tuning the duration prediction, it is important to derive an appropriate phonological symbolic representation in order to improve timing in synthetic speech.
Scientific interest in von Kempelen's 'speaking machine' stems mainly from a general interest in the history of science. This study, however, is devoted to the question of what relevance the 'speaking machine' has today. Apart from discussing why it fascinates researchers and non-researchers alike, we describe the construction of a replica and its potential as an instrument for demonstration and for researching speech generation.
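For readers unfamiliar with rule-based duration models, the following minimal sketch shows the general form of a Klatt-style duration rule, in which context rules successively scale the compressible part of a segment's inherent duration. The segment values and rule percentages are hypothetical placeholders, not the German rule set or parameters used in the study.

```python
def klatt_duration(inherent_ms, minimum_ms, rule_percents):
    """Klatt-style duration: DUR = MINDUR + (INHDUR - MINDUR) * PRCNT / 100.

    rule_percents holds the percentage factor of each applicable context
    rule; the factors are combined multiplicatively, starting from 100.
    """
    prcnt = 100.0
    for p in rule_percents:
        prcnt *= p / 100.0
    # Only the part above the minimum duration is stretched or compressed.
    return minimum_ms + (inherent_ms - minimum_ms) * prcnt / 100.0


# Hypothetical vowel: inherent 200 ms, minimum 100 ms, one shortening rule (50%).
print(klatt_duration(200, 100, [50]))
```

Because the rules only act on the portion above the minimum duration, a segment can never be shortened below that floor, however many shortening rules apply.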
The Perceptual Effect of L1 Prosody Transplantation on L2 Speech: The Case of French Accented German
(2016)
Research has shown that language learners are not only challenged by segmental differences between their native language (L1) and the second language (L2). They also have problems with the correct production of suprasegmental structures, like phone/syllable duration and the realization of pitch. These difficulties often lead to a perceptible foreign accent. This study investigates the influence of prosody transplantation on foreign accent ratings. Syllable duration and pitch contour were transferred from utterances of a male and female German native speaker to utterances of ten French native speakers speaking German. Acoustic measurements show that French learners spoke with a significantly lower speaking rate. As expected, results of a perception experiment judging the accentedness of 1) German native utterances, 2) unmanipulated and 3) manipulated utterances of French learners of German suggest that the transplantation of the prosodic features syllable duration and pitch leads to a decrease in accentedness rating. These findings confirm results found in similar studies investigating prosody transplantation with different L1 and L2 and provide a beneficial technique for (computer-assisted) pronunciation training.
The IFCASL corpus is a French-German bilingual phonetic learner corpus designed, recorded and annotated in a project on individualized feedback in computer-assisted spoken language learning. The motivation for setting up this corpus was that there is no phonetically annotated and segmented corpus for this language pair of comparable size and coverage. In contrast to most learner corpora, the IFCASL corpus incorporates data for a language pair in both directions, i.e. in our case French learners of German, and German learners of French. In addition, the corpus is complemented by two sub-corpora of native speech by the same speakers. The corpus provides spoken data by about 100 speakers with comparable productions, annotated and segmented on the word and the phone level, with more than 50% manually corrected data. The paper reports on inter-annotator agreement and the optimization of the acoustic models for forced speech-text alignment in exercises for computer-assisted pronunciation training. Example studies based on the corpus data with a phonetic focus include topics such as the realization of /h/ and glottal stop, final devoicing of obstruents, vowel quantity and quality, pitch range, and tempo.
In mechanical speech synthesis from the 18th up to the 20th century, reed pipes were mainly used for the generation of the voice and the organ stop vox humana was central in this process. This has been described in different historical documents which report that the vox humana in some organs sounded like human vowels. In this study, tones of four different voces humanae were recorded to investigate their similarity to human vowels. The acoustical and perceptual analysis revealed that some, though not all, tones show a high similarity to selected vowels.
In order to determine priorities for the improvement of timing in synthetic speech, this study looks at the role of segmental duration prediction and the role of phonological symbolic representation in listeners' preferences. In perception experiments using German speech synthesis, two standard duration models (Klatt rules and CART) were tested. The input to these models consisted of symbolic strings which were derived either from a database or from a text-to-speech system. Results of the perception experiments show that different duration models can only be distinguished when the symbolic string is appropriate. Considering the relative importance of the symbolic representation, "post-lexical" segmental rules were investigated with the outcome that listeners differ in their preferences regarding the degree of segmental reduction. In conclusion, before fine-tuning the duration prediction, it is important to calculate an appropriate phonological symbolic representation in order to improve timing in synthetic speech.
The aim of this study is to select and formulate criteria for the assessment of tools and exercises that use computer-assisted pronunciation training (CAPT). We examined ten different CAPT tools selected on the basis of an informal questionnaire among 10 colleagues working in a German-French CAPT project. Although the applied assessment must still be regarded as informal, and although the selected CAPT tools might not be an optimal sample for representing the state of the art, the results clearly show that there is much to improve regarding the clarity of instruction, the quality of exercises, the robustness of the diagnosis, the clarity and appropriateness of scoring, the diversity of feedback methods, the assumed benefit for various types of users, and the use of ASR. Despite various good approaches regarding graphics and game-like exercises, links are clearly missing between the pedagogical expertise in phonetic training on the one hand and software development, including usability engineering, on the other.