Feedback utterances are among the most frequent in dialogue. Feedback is also a crucial aspect of all linguistic theories that take social interaction involving language into account. However, determining communicative functions is a notoriously difficult task for both human interpreters and systems. It involves an interpretative process that integrates various sources of information. Existing work on communicative function classification either comes from dialogue act tagging, where the treatment of feedback phenomena is generally coarse-grained, or is token-based and does not address the variety of forms that feedback utterances can take. This paper introduces an annotation framework, the dataset and the related annotation campaign (in which 7 raters annotated nearly 6000 utterances). We present its evaluation not merely in terms of inter-rater agreement but also in terms of the usability of the resulting reference dataset, both from a linguistic research perspective and from a more applied viewpoint.
Prosodic constructions used to compete for the speaking turn in conversation have been widely studied (French & Local 1983; Kurtić et al. 2013). Usually, turn competition arises in overlapping talk between at least two speakers. Coordination between participants in their prosodic design of talk (Szczepek-Reed 2006) and social action (Gorisch et al. 2012), as well as entrainment in more general terms (Levitan et al. 2011), is well established in the literature. Nevertheless, previous studies on turn competition and overlap do not investigate the prosodic design of turn-competitive incomings in reference to the orientation of the speakers to each other. Rather, they assume that prosodic constructions are used for turn competition regardless of the co-participants’ design of the turn. In this paper, we ask whether the prosodic design of turn-competitive talk is co-constructed between two participants talking in overlap. More specifically, we investigate whether the prosodic design of one participant’s in-overlap talk is developed with respect to the interlocutor’s prosodic features during the same portion of overlapped talk, and whether this prosodic matching can discriminate between the overlaps that are competitive and those that are not. Our analyses are based on two-speaker overlaps drawn from a corpus of multi-party face-to-face conversation between four friends recorded in British English (Kurtić et al. 2012). 3407 instances of two-speaker overlaps have been extracted from 4 hours of talk. Two independent conversation analysts categorised all of these two-speaker overlap instances as competitive or non-competitive and achieved good agreement of alpha=0.807 (Krippendorff 2004), as measured on a subset of 808 overlaps selected for our initial analysis.
For the analysis of prosodic features we focus on F0-related features: mean, slope, span and contour, all of which have previously been shown to be used by each overlapping speaker separately for turn competition (Kurtić et al. 2009; Oertel et al. 2012). We investigate the similarity in F0 mean, slope and span by correlating these features across the two participants. For F0 contour, a similarity coefficient is computed using the dynamic programming method described in Gorisch et al. (2012). We consider the difference in F0 contour similarity between competitive and non-competitive overlaps as an indication of intonational matching being a turn-competitive resource. We conduct these analyses for overlaps that are clearly competitive or non-competitive, as indicated by inter-annotator agreement. In addition, we qualitatively explore the cases on which annotators disagree in order to investigate whether they reveal further important interactional or prosodic features of in-overlap talk. Our preliminary results suggest that conversational participants attend and adapt to the interlocutor during overlap depending on whether they return competition or not. We explain our findings in relation to previous work on turn competition in overlap, discuss the quantitative method employed and also address the possible consequences of our results for the study of the prosodic realization of other social actions in conversation.
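The correlation step for F0 mean, slope and span can be illustrated with a minimal sketch. The abstract does not say which correlation coefficient was used, so Pearson's r is an assumption here, and the function name and the sample values are hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation between two equally long lists of numbers.

    Assumption: xs[i] and ys[i] hold the same F0 feature (e.g. mean F0 in Hz)
    for the two speakers in the i-th overlap.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sums of products of deviations from the means
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical mean-F0 values (Hz) for speaker A and speaker B per overlap
speaker_a = [180.0, 210.0, 195.0, 230.0]
speaker_b = [120.0, 150.0, 131.0, 160.0]
r = pearson_r(speaker_a, speaker_b)
```

Under this reading, a higher r within the competitive overlaps than within the non-competitive ones would point towards prosodic matching as a turn-competitive resource.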
Precise multimodal studies require precise synchronisation between audio and video signals. However, raw audio and audio from video recordings can be out of sync for several reasons. In order to re-synchronise them, a dynamic programming (DP) approach is presented here. Traditionally, DP is performed on the rectangular distance matrix comparing each value in signal A with each value in signal B. Previous work limited the search space using, for example, the Sakoe-Chiba band (Sakoe and Chiba, 1978). However, the overall size of the distance matrix remains the same. Here, a tunnel matrix and the corresponding DP algorithm are presented. The matrix contains only the distances between the two signals that fall within a pre-specified bandwidth, and the computational cost is reduced accordingly. An example implementation demonstrates the functionality on artificial data and on data from real audio and video recordings.
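The idea of storing only the band can be sketched as follows. This is a minimal illustration and not the paper's implementation: the function name, the absolute-difference local distance and the three-neighbour step pattern are assumptions. Each row i keeps only the 2 * bandwidth + 1 cells with |i - j| <= bandwidth, so memory grows with n * (2 * bandwidth + 1) rather than n * m.

```python
import math

def banded_dtw_cost(a, b, bandwidth):
    """Accumulated alignment cost of a and b using only a diagonal band.

    Assumptions: absolute difference as local distance and the standard
    three-neighbour step pattern.
    """
    n, m = len(a), len(b)
    if n == 0 or m == 0:
        raise ValueError("sequences must be non-empty")
    if abs(n - m) > bandwidth:
        raise ValueError("bandwidth too small to reach the final cell")
    width = 2 * bandwidth + 1
    # acc[i][k] is the accumulated cost of cell (i, j) with j = i + k - bandwidth
    acc = [[math.inf] * width for _ in range(n)]

    def cell(i, j):
        """Accumulated cost at (i, j), or infinity outside the band."""
        k = j - i + bandwidth
        if i < 0 or j < 0 or j >= m or k < 0 or k >= width:
            return math.inf
        return acc[i][k]

    for i in range(n):
        for k in range(width):
            j = i + k - bandwidth
            if j < 0 or j >= m:
                continue  # this band cell lies outside the signals
            dist = abs(a[i] - b[j])
            if i == 0 and j == 0:
                acc[i][k] = dist
            else:
                acc[i][k] = dist + min(cell(i - 1, j), cell(i, j - 1), cell(i - 1, j - 1))
    return cell(n - 1, m - 1)

# Artificial data: b is a delayed by one sample
a = [0, 0, 1, 2, 3]
b = [0, 1, 2, 3, 3]
cost_diagonal_only = banded_dtw_cost(a, b, 0)
cost_band_one = banded_dtw_cost(a, b, 1)
```

With a bandwidth of 0 only the main diagonal is available, so the one-sample offset is paid for in every mismatching cell. A bandwidth of 1 is enough to absorb it.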
Feedback utterances are among the most frequent in dialogue. Feedback is also a crucial aspect of linguistic theories that take social interaction involving language into account. This paper introduces the corpora and datasets of a project scrutinizing feedback utterances of this kind in French. We present the genesis of the corpora involved in the project (about 16 hours of transcribed and phone force-aligned speech in total). We introduce the resulting datasets and discuss how they are being used in ongoing work focusing on the form-function relationship of conversational feedback. All the corpora created and the datasets produced in the framework of this project will be made available for research purposes.
There have been several previous attempts to assign communicative functions to utterances of verbal feedback in English. Here, we suggest an annotation scheme for verbal and non-verbal feedback utterances in French, comprising the categories base, attitude, previous and visual. The data comprise conversations, map tasks and negotiations, from which we extracted ca. 13,000 candidate feedback utterances and gestures. 12 students were recruited for the annotation campaign, which covered ca. 9,500 instances. Each instance was annotated by between 2 and 7 raters. The evaluation of the annotation agreement resulted in an average best-pair kappa of 0.6. While the base category, with the values acknowledgement, evaluation, answer, elicit and other, achieves good agreement, this is not the case for the other main categories. The datasets, which also include automatic extractions of lexical, positional and acoustic features, are freely available and will further be used in machine learning classification experiments to analyse the form-function relationship of feedback.
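A pairwise kappa of the kind reported above can be sketched as follows. The abstract does not define "best-pair kappa", so taking the highest Cohen's kappa over all rater pairs is only one plausible reading. The function names, rater names and labels are hypothetical, and the sketch assumes every rater labelled the same items in the same order.

```python
from collections import Counter
from itertools import combinations

def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two raters who labelled the same items."""
    n = len(labels_a)
    if n == 0 or n != len(labels_b):
        raise ValueError("need two non-empty label lists of equal length")
    # Observed agreement: share of items with identical labels
    observed = sum(x == y for x, y in zip(labels_a, labels_b)) / n
    # Chance agreement from each rater's own label distribution
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1.0 - expected)

def best_pair_kappa(ratings):
    """Highest pairwise kappa among raters (assumed reading of 'best pair').

    ratings maps a rater name to that rater's list of labels.
    """
    return max(
        cohen_kappa(ratings[r1], ratings[r2])
        for r1, r2 in combinations(sorted(ratings), 2)
    )

# Hypothetical labels for the base category on four items
ratings = {
    "rater1": ["ack", "ack", "eval", "answer"],
    "rater2": ["ack", "eval", "eval", "answer"],
    "rater3": ["ack", "ack", "eval", "answer"],
}
best = best_pair_kappa(ratings)
```

Here rater1 and rater3 agree on every item, so the best pair reaches a kappa of 1.0, while rater1 and rater2 reach 7/11.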
This paper introduces the Aix Map Task corpus, a corpus of audio and video recordings of task-oriented dialogues. It was modelled after the original HCRC Map Task corpus. Lexical material was designed for the analysis of speech and prosody, as described in Astésano et al. (2007). The design of the lexical material, the protocol and some basic quantitative features of the existing corpus are presented. The corpus was collected under two communicative conditions, an audio-only condition and a face-to-face condition. The recordings took place in a studio and a sound-attenuated booth respectively, with head-set microphones (and, in the face-to-face condition, with two video cameras). The recordings have been segmented into Inter-Pausal Units and transcribed using transcription conventions that capture both the actual productions and the canonical forms of what was said. The corpus is publicly available online.