Discourse Relations and Document Structure
This chapter addresses the requirements and linguistic foundations of automatic relational discourse analysis of complex text types such as scientific journal articles. It is argued that besides lexical and grammatical discourse markers, which have traditionally been employed in discourse parsing, cues derived from the logical and generic document structure and the thematic structure of a text must be taken into account. An approach to modelling such types of linguistic information in terms of XML-based multi-layer annotations and to a text-technological representation of additional knowledge sources is presented. By means of quantitative and qualitative corpus analyses, cues and constraints for automatic discourse analysis can be derived. Furthermore, the proposed representations are used as the input sources for discourse parsing. A short overview of the projected parsing architecture is given.
Author: | Harald Lüngen, Maja Bärenfänger, Mirco Hilbert, Henning Lobin, Csilla Puskás |
---|---|
URN: | urn:nbn:de:bsz:mh39-48005 |
ISBN: | 978-90-481-3330-7 |
Parent Title (English): | Linguistic Modeling of Information and Markup Languages. Contributions to Language Technology |
Series (Serial Number): | Text, Speech and Language Technology (41) |
Publisher: | Springer |
Place of publication: | Dordrecht |
Editor: | Andreas Witt, Dieter Metzing |
Document Type: | Part of a Book |
Language: | English |
Year of first Publication: | 2010 |
Date of Publication (online): | 2016/04/25 |
Publication state: | Postprint |
Tag: | Discourse parsing; Discourse relations; Document structure; Linguistic annotations; Text technology; XML |
First Page: | 97 |
Last Page: | 123 |
Note: | The final publication is available at Springer via https://dx.doi.org/10.1007/978-90-481-3331-4 |
DDC classes: | 400 Language / 410 Linguistics |
Open Access?: | yes |
Linguistics-Classification: | Computational linguistics |
Licence (German): | Protected by copyright |