Speaker Attribution in Cabinet Protocols

Historical cabinet protocols are a useful resource that enables historians to identify the opinions expressed by politicians on different subjects and at different points in time. While cabinet protocols are often available in digitized form, so far the only way to access their content is keyword-based search, which often returns sub-optimal results. We present a method for enriching German cabinet protocols with information about the originators of statements, which requires automatic speaker attribution. To avoid costly manual annotation of training data, we design a rule-based system that exploits morpho-syntactic cues. Unlike many other approaches, our method can also deal with cases in which the speaker is not explicitly identified in the sentence itself. This is an important capability, as 45% of all sentences in the data constitute reported speech whose speakers are not explicitly marked. Our system detects implicit speakers by taking into account signals of speaker continuity. We show that such a system obtains good results, especially with respect to recall, which is particularly important for information access.
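The abstract does not spell out the rules themselves, so the following is only a minimal sketch of what rule-based attribution with speaker continuity could look like. It rests on assumptions that are not taken from the paper: the reporting-verb list, the use of German subjunctive forms (Konjunktiv I) as the continuity signal, and the names `REPORTING_VERBS`, `SUBJUNCTIVE_FORMS`, and `attribute_speakers` are all hypothetical and chosen for illustration.

```python
import re
from typing import List, Optional, Tuple

# Hypothetical cue lists for illustration; NOT the rules used in the paper.
REPORTING_VERBS = ("erklärt", "betont", "berichtet")
SUBJUNCTIVE_FORMS = {"sei", "seien", "habe", "werde", "könne", "müsse", "solle"}

# Explicit cue (assumed): a comma-free phrase at the start of the sentence,
# followed directly by one of the reporting verbs.
EXPLICIT = re.compile(
    r"^(?P<speaker>[^,]+?)\s+(?:" + "|".join(REPORTING_VERBS) + r")\b"
)


def attribute_speakers(sentences: List[str]) -> List[Tuple[str, Optional[str]]]:
    """Pair each sentence with a speaker, or None if no rule fires."""
    current: Optional[str] = None  # most recent explicitly named speaker
    result: List[Tuple[str, Optional[str]]] = []
    for sentence in sentences:
        match = EXPLICIT.match(sentence)
        if match:
            # Explicit attribution: remember the speaker for later sentences.
            current = match.group("speaker")
            result.append((sentence, current))
            continue
        tokens = set(re.findall(r"\w+", sentence.lower()))
        if current is not None and tokens & SUBJUNCTIVE_FORMS:
            # Continuity (assumed signal): a subjunctive form marks reported
            # speech, so the sentence is assigned to the previous speaker.
            result.append((sentence, current))
        else:
            # No cue found: leave the sentence unattributed.
            result.append((sentence, None))
    return result


protocol = [
    "Der Bundeskanzler erklärt, die Lage sei ernst.",
    "Man müsse rasch handeln.",
    "Das Kabinett stimmt zu.",
]
for sentence, speaker in attribute_speakers(protocol):
    print(speaker, "|", sentence)
```

In this toy example the second sentence names no speaker, yet it is assigned to the one introduced in the first sentence because it contains a subjunctive form. A rule of this kind is biased toward recall: it will sometimes carry a speaker too far, which matches the trade-off the abstract describes only in spirit, not in detail.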

Metadata
Author:Josef Ruppenhofer, Caroline Sporleder, Fabian Shirokov
URN:urn:nbn:de:bsz:mh39-52960
URL:http://lexitron.nectec.or.th/public/LREC-2010_Malta/summaries/434.html
ISBN:2-9517408-6-7
Parent Title (English):Proceedings of the Seventh conference on International Language Resources and Evaluation (LREC'10)
Publisher:European Language Resources Association
Document Type:Conference Proceeding
Language:English
Year of first Publication:2010
Date of Publication (online):2016/09/22
Publication state:Published version
Tag:Digital Library; Information Extraction; Information Retrieval; Metadata
First Page:2510
Last Page:2515
Dewey Decimal Classification:400 Language / 410 Linguistics
Linguistics-Classification:Computational linguistics
Open Access?:Yes
Licence (German):German copyright law (UrhG) applies