Decision tree-based evaluation of genitive classification. An empirical study on CMC and text corpora
- Contemporary studies on the characteristics of natural language benefit enormously from the increasing amount of linguistic corpora. Aside from text and speech corpora, corpora of computer-mediated communication (CMC) position themselves between orality and literacy, and beyond that provide insight into the impact of “new”, mainly internet-based media on language behaviour. In this paper, we present an empirical attempt to work with annotated CMC corpora for the explanation of linguistic phenomena. In concrete terms, we implement machine learning algorithms to produce decision trees that reveal rules and tendencies about the use of genitive markers in German.
Author: | Sandra Hansen, Roman Schneider |
---|---|
DOI: | https://doi.org/10.1007/978-3-642-40722-2_8 |
ISBN: | 978-3-642-40721-5 |
Parent Title (English): | Language processing and knowledge in the web. 25th international conference, GSCL 2013, Darmstadt, Germany, September 25 - 27, 2013. Proceedings |
Series (Serial Number): | Lecture Notes in Computer Science (8105) |
Publisher: | Springer |
Place of publication: | Berlin [u.a.] |
Editor: | Iryna Gurevych, Chris Biemann, Torsten Zesch |
Document Type: | Part of a Book |
Language: | English |
Year of first Publication: | 2013 |
Date of Publication (online): | 2015/08/12 |
Tag: | Computer-Mediated Communication; Corpus Linguistics; Decision Trees; Genitive Classification; Grammar; Machine Learning |
Volume: | 2013 |
First Page: | 83 |
Last Page: | 88 |
Note: | This contribution is not freely accessible for copyright reasons. |
Open Access?: | no |
Licence: | Protected by copyright |