Annotating Discourse Relations in Spoken Language: A Comparison of the PDTB and CCR Frameworks

  • A variety of frameworks are currently used for discourse relation annotation, and most of them have been developed for and applied mainly to written data. This raises a number of questions regarding the interoperability of discourse relation annotation schemes, as well as regarding differences in discourse annotation for written vs. spoken domains. In this paper, we describe our experiments on annotating two spoken domains from the SPICE Ireland corpus (telephone conversations and broadcast interviews) according to two different discourse annotation schemes, PDTB 3.0 and CCR. We show that annotations in the two schemes can largely be mapped onto one another, and discuss differences in operationalisations of discourse relation schemes which present a challenge to automatic mapping. We also observe systematic differences in the prevalence of implicit discourse relations in spoken data compared to written texts, and find that there are also differences in the types of causal relations between the domains. Finally, we find that PDTB 3.0 addresses many shortcomings of PDTB 2.0 with respect to the annotation of spoken discourse, and suggest further extensions. The new corpus has roughly the size of the CoNLL 2015 Shared Task test set, and we hence hope that it will be a valuable resource for the evaluation of automatic discourse relation labellers.
Metadata
Author:Ines Rehbein, Merel Scholman, Vera Demberg
URN:urn:nbn:de:bsz:mh39-56068
URL:http://www.lrec-conf.org/proceedings/lrec2016/index.html
ISBN:978-2-9517408-9-1
Parent Title (English):Proceedings of the 10th International Conference on Language Resources and Evaluation (LREC 2016). May 23-28, 2016. Portorož, Slovenia
Publisher:European Language Resources Association
Place of publication:Paris
Editor:Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Sara Goggi, Marko Grobelnik, Bente Maegaard, Joseph Mariani, Hélène Mazo, Asunción Moreno, Jan Odijk, Stelios Piperidis
Document Type:Conference Proceeding
Language:English
Year of first Publication:2016
Date of Publication (online):2016/11/21
Publication state:Published version
Review state:(Publisher's) editorial review
Tag:Annotation of discourse relations (DRs); DRs in spoken and written genres; Interoperability of annotation schemes
GND Keyword:Annotation; Spoken language; Irish; Corpus <Linguistics>
First Page:23
Last Page:28
Dewey Decimal Classification:400 Language / 400 Language, Linguistics
Leibniz-Classification:Language, Linguistics
Linguistics-Classification:Computational linguistics
Open Access?:Yes
Licence (English):Creative Commons - Attribution-NonCommercial 4.0 International