Refine
Document Type
- Part of a Book (1)
- Conference Proceeding (1)
Language
- English (2)
Has Fulltext
- yes (2)
Keywords
- Korpus <Linguistik> (2)
- Annotation (1)
- Computerlinguistik (1)
- Conversation analysis (1)
- Corpus linguistics (1)
- Feedback (1)
- Französisch (1)
- French (1)
- Gesprochene Sprache (1)
- Konversationsanalyse (1)
Publicationstate
Reviewstate
- Peer-Review (1)
- Peer-review (1)
Publisher
- Association for Computational Linguistics (2) (remove)
Feedback utterances are among the most frequent in dialogue. Feedback is also a crucial aspect of linguistic theories that take social interaction, involving language, into account. This paper introduces the corpora and datasets of a project scrutinizing this kind of feedback utterances in French. We present the genesis of the corpora (for a total of about 16 hours of transcribed and phone force-aligned speech) involved in the project. We introduce the resulting datasets and discuss how they are being used in on-going work with focus on the form-function relationship of conversational feedback. All the corpora created and the datasets produced in the framework of this project will be made available for research purposes.
A syntax-based scheme for the annotation and segmentation of German spoken language interactions
(2018)
Unlike corpora of written language where segmentation can mainly be derived from orthographic punctuation marks, the basis for segmenting spoken language corpora is not predetermined by the primary data, but rather has to be established by the corpus compilers. This impedes consistent querying and visualization of such data. Several ways of segmenting have been proposed,
some of which are based on syntax. In this study, we developed and evaluated annotation and segmentation guidelines in reference to the topological field model for German. We can show that these guidelines are used consistently across annotators. We also investigated the influence of various interactional settings with a rather simple measure, the word-count per segment and unit-type. We observed that the word count and the distribution of each unit type differ in varying interactional settings and that our developed segmentation and annotation guidelines are used consistently across annotators. In conclusion, our syntax-based segmentations reflect interactional properties that are intrinsic to the social interactions that participants are involved in. This can be used for further analysis of social interaction and opens the possibility for automatic segmentation of transcripts.