Multi-dimensional annotation of linguistic corpora for investigating information structure

We present the annotation of information structure in the MULI project. To learn more about the information structuring means in prosody, syntax and discourse, theory-independent features were defined for each level. We describe the features and illustrate them on an example sentence. To investigate the interplay of features, the representation has to allow for inspecting all three layers at the same time. This is realised by a stand-off XML mark-up with the word as the basic unit. The theory-neutral XML stand-off annotation allows integrating this resource with other linguistic resources such as the Tiger Treebank for German or the Penn Treebank for English.
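As a rough illustration of the stand-off idea described in the abstract, the sketch below keeps the words in one base layer and lets three separate layers (prosody, syntax, discourse) point at word ids, then merges them so all layers can be inspected per word. The element names, attribute names, feature values and the function `merge_layers` are hypothetical and chosen only for this example; they are not the MULI annotation scheme.

```python
import xml.etree.ElementTree as ET

# Hypothetical base layer: the word is the basic unit and carries the id
# that every other layer refers to.
WORDS_XML = """<words>
  <word id="w1">Peter</word>
  <word id="w2">sleeps</word>
</words>"""

# Hypothetical stand-off layers. Each mark lists the word ids it covers in
# "refs" (space-separated); names and values are invented for illustration.
LAYERS_XML = {
    "prosody": '<layer><mark refs="w1" feature="accent" value="yes"/></layer>',
    "syntax": (
        '<layer>'
        '<mark refs="w1" feature="function" value="subject"/>'
        '<mark refs="w1 w2" feature="unit" value="clause"/>'
        '</layer>'
    ),
    "discourse": '<layer><mark refs="w1" feature="status" value="given"/></layer>',
}


def merge_layers(words_xml, layers_xml):
    """Return one record per word id holding the features of every layer."""
    words = ET.fromstring(words_xml)
    table = {w.get("id"): {"text": w.text} for w in words.iter("word")}
    for layer_name, xml_text in layers_xml.items():
        for mark in ET.fromstring(xml_text).iter("mark"):
            # A mark may span several words, so copy its feature to each one.
            for word_id in mark.get("refs").split():
                key = f"{layer_name}.{mark.get('feature')}"
                table[word_id][key] = mark.get("value")
    return table


merged = merge_layers(WORDS_XML, LAYERS_XML)
for word_id, features in merged.items():
    print(word_id, features)
```

Because the layers only reference word ids, a further layer, for instance one derived from an existing treebank, could be added to the dictionary without touching the base text.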

Metadata
Author:	Stefan Baumann, Caren Brinckmann, Silvia Hansen-Schirra, Geert-Jan Kruijff, Ivana Kruijff-Korbayová, Stella Neumann, Elke Teich
URN:	urn:nbn:de:bsz:mh39-68647
URL:	http://www.aclweb.org/anthology/W04-2707
Parent Title (English):	Proceedings of the Workshop Frontiers in Corpus Annotation at HLT-NAACL 2004. Boston, USA
Publisher:	The Association for Computational Linguistics
Place of publication:	Stroudsburg, PA
Document Type:	Conference Proceeding
Language:	English
Year of first Publication:	2004
Date of Publication (online):	2017/12/20
Publication state:	Published version
Review state:	Review status unknown
GND Keyword:	Annotation; Automatic language analysis; Spoken language; Corpus <Linguistics>
Page Number:	8
DDC classes:	400 Language / 430 German
Open Access?:	yes
BDSL-Classification:	Language in the 20th century. Contemporary language
Linguistics-Classification:	Corpus linguistics
Licence (English):	Creative Commons - Attribution 3.0 Unported
