Overlap, that is, the simultaneous talk of at least two speakers, is an omnipresent phenomenon in conversation. Set within the theoretical framework of Conversation Analysis and interactional linguistics, our work examines simultaneous talk as a systematic and orderly phenomenon that belongs to the routine practices of turn-taking. Our analyses are based on transcripts of video recordings of naturally occurring interactional data: ordinary conversations in French and German. We do not look at overlap solely as an audible phenomenon, but conceive of it as an embodied practice in interaction that is also implemented through visible resources. The sequential analysis is therefore supplemented by a multimodal analysis, which allows us to take account of the dynamic participation constellations at play during overlap. The analytic work focuses on three specific phenomena in which simultaneous talk plays a significant role: first, self-repetition following overlap; second, a speaker's abandonment of a turn during simultaneous talk; and third, delayed completion, the delayed continuation of a turn in overlap with an interlocutor's contribution. This thesis contributes to a deeper understanding of these three phenomena and demonstrates that the organisation of simultaneous talk is closely tied to the management of complex action trajectories and dynamic participation frameworks.
Manual development of deep linguistic resources is time-consuming and costly, and is therefore often described as a bottleneck for traditional rule-based NLP. In my PhD thesis I present a treebank-based method for the automatic acquisition of LFG resources for German. The method automatically creates deep and rich linguistic representations from labelled data (treebanks) and can be applied to large data sets. My research is based on, and substantially extends, previous work on automatically acquiring wide-coverage, deep, constraint-based grammatical resources from the English Penn-II treebank (Cahill et al., 2002; Burke et al., 2004; Cahill, 2004). Best results for English show a dependency f-score of 82.73% (Cahill et al., 2008) against the PARC 700 dependency bank, outperforming the best hand-crafted grammar of Kaplan et al. (2004). Preliminary work has been carried out to test the approach on languages other than English, providing proof of concept for the applicability of the method (Cahill et al., 2003; Cahill, 2004; Cahill et al., 2005). While first results have been promising, a number of important research questions have been raised. The original approach, first presented in Cahill et al. (2002), is strongly tailored to English and the data structures provided by the Penn-II treebank (Marcus et al., 1993). English is configurational and rather poor in inflectional forms. German, by contrast, features semi-free word order and a much richer morphology. Furthermore, treebanks for German differ considerably from the Penn-II treebank as regards the data structures and encoding schemes underlying the grammar acquisition task. In my thesis I examine the impact of language-specific properties of German, as well as linguistically motivated treebank design decisions, on PCFG parsing and LFG grammar acquisition.
I present experiments investigating the influence of treebank design on PCFG parsing and show which types of representation are useful for the PCFG and LFG grammar acquisition tasks. Furthermore, I present a novel approach to cross-treebank comparison, measuring the effect of controlled error insertion on treebank trees and parser output from different treebanks. I complement the cross-treebank comparison by providing a human evaluation using TePaCoC, a new test suite for testing parser performance on complex grammatical constructions. Manual evaluation on TePaCoC data provides new insights into the impact of flat vs. hierarchical annotation schemes on data-driven parsing. I present treebank-based LFG acquisition methodologies for two German treebanks. An extensive evaluation along different dimensions complements the investigation and provides valuable insights for the future development of treebanks.
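
For readers unfamiliar with the dependency f-score cited in the abstract above, the following is a minimal sketch of how such a score is commonly computed: precision and recall over sets of dependency triples, combined as their harmonic mean. The triple format `(relation, head, dependent)` and the function name `dependency_f_score` are assumptions made for illustration; this is not the evaluation software used in the thesis or in the cited work.

```python
from typing import Iterable, Tuple

# Hypothetical triple format: (relation, head, dependent).
Triple = Tuple[str, str, str]


def dependency_f_score(
    gold: Iterable[Triple], predicted: Iterable[Triple]
) -> Tuple[float, float, float]:
    """Return (precision, recall, f_score) over dependency triples."""
    gold_set = set(gold)
    pred_set = set(predicted)
    matched = len(gold_set & pred_set)
    # With no matches (or an empty side) all three scores are zero.
    if matched == 0:
        return 0.0, 0.0, 0.0
    precision = matched / len(pred_set)
    recall = matched / len(gold_set)
    # Balanced f-score: harmonic mean of precision and recall.
    f_score = 2 * precision * recall / (precision + recall)
    return precision, recall, f_score


# Illustrative data only: four gold triples, two of which are predicted.
gold = [
    ("subj", "sees", "Anna"),
    ("obj", "sees", "dog"),
    ("det", "dog", "the"),
    ("adjunct", "sees", "today"),
]
predicted = [
    ("subj", "sees", "Anna"),
    ("obj", "sees", "dog"),
]
precision, recall, f_score = dependency_f_score(gold, predicted)
```

In this toy case every predicted triple is correct but only half of the gold triples are recovered, so precision is high, recall is low, and the f-score falls between the two.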