Multi-dimensional annotation of linguistic corpora for investigating information structure
- We present the annotation of information structure in the MULI project. To learn more about the information structuring means in prosody, syntax and discourse, theory-independent features were defined for each level. We describe the features and illustrate them on an example sentence. To investigate the interplay of features, the representation has to allow for inspecting all three layers at the same time. This is realised by a stand-off XML mark-up with the word as the basic unit. The theory-neutral XML stand-off annotation allows integrating this resource with other linguistic resources such as the TIGER Treebank for German or the Penn Treebank for English.
Author: | Stefan Baumann, Caren Brinckmann, Silvia Hansen-Schirra, Geert-Jan Kruijff, Ivana Kruijff-Korbayová, Stella Neumann, Elke Teich |
---|---|
URN: | urn:nbn:de:bsz:mh39-68647 |
URL: | http://www.aclweb.org/anthology/W04-2707 |
Parent Title (English): | Proceedings of the Workshop Frontiers in Corpus Annotation at HLT-NAACL 2004. Boston, USA |
Publisher: | The Association for Computational Linguistics |
Place of publication: | Stroudsburg, PA |
Document Type: | Conference Proceeding |
Language: | English |
Year of first Publication: | 2004 |
Date of Publication (online): | 2017/12/20 |
Publication state: | Published version |
Review state: | Review status unknown |
GND Keyword: | Annotation; Automatic language analysis; Spoken language; Corpus <Linguistics> |
Page Number: | 8 |
DDC classes: | 400 Language / 430 German |
Open Access?: | yes |
BDSL-Classification: | Language in the 20th century. Contemporary language |
Linguistics-Classification: | Corpus linguistics |