GenitivDB - a corpus-generated database for German genitive classification

We present a novel NLP resource for the explanation of linguistic phenomena, built and evaluated by exploring very large annotated language corpora. For the compilation, we use the German Reference Corpus (DeReKo), which contains more than 5 billion word forms and is the largest linguistic resource worldwide for the study of contemporary written German. The result is a comprehensive database of German genitive formations, enriched with a broad range of intra- and extralinguistic metadata. It can be used for the notoriously controversial classification and prediction of genitive endings (short endings, long endings, zero-marker). We also evaluate the main factors influencing the use of specific endings. To get a general idea of a factor's influence and its side effects, we calculate chi-square tests and visualize the residuals with an association plot. The results are evaluated against a gold standard by implementing tree-based machine learning algorithms. For the statistical analysis, we applied the supervised Logistic Model Trees (LMT) algorithm, using the WEKA software. We intend to use this gold standard to evaluate GenitivDB, as well as to explore methodologies for a predictive genitive model.
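
The chi-square step mentioned in the abstract can be illustrated with a minimal sketch. It is not the author's implementation: the factor (syllable count), the function name `pearson_residuals`, and all counts are hypothetical and are not taken from GenitivDB. The sketch computes the chi-square statistic and the Pearson residuals, which are the values an association plot draws as bars above or below the baseline.

```python
from math import sqrt


def pearson_residuals(table):
    """Return (chi2, residuals) for a contingency table given as a list of rows.

    Each residual is (observed - expected) / sqrt(expected), where expected
    is the count predicted under independence of rows and columns.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)

    chi2 = 0.0
    residuals = []
    for i, row in enumerate(table):
        res_row = []
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            r = (observed - expected) / sqrt(expected)
            res_row.append(r)
            chi2 += r * r  # chi-square is the sum of squared Pearson residuals
        residuals.append(res_row)
    return chi2, residuals


# Hypothetical counts, invented for illustration only.
# Rows: a made-up factor (monosyllabic vs. polysyllabic nouns).
# Columns: long ending, short ending, zero-marker.
counts = [
    [30, 10, 10],
    [10, 30, 10],
]

chi2, residuals = pearson_residuals(counts)
print(f"chi-square = {chi2:.2f}")
for row in residuals:
    print([round(r, 2) for r in row])
```

A positive residual marks a cell that is more frequent than independence would predict, a negative one a cell that is less frequent. The degrees of freedom are (rows - 1) × (columns - 1); turning the statistic into a p-value is left to a statistics library.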

Metadata
Author:Roman Schneider
Url of the author's Homepage:http://www1.ids-mannheim.de/gra/personal/schneider.html
URN:urn:nbn:de:bsz:mh39-32774
URL:http://www.lrec-conf.org/proceedings/lrec2014/index.html
ISBN:978-2-9517408-8-4
Parent Title (German):LREC 2014, ninth international conference on language resources and evaluation. May 26-31, 2014, Reykjavik, Iceland
Publisher:European Language Resources Association (ELRA)
Editor:Nicoletta Calzolari
Document Type:Conference Proceeding
Language:English
Year of first Publication:2014
Date of Publication (online):2014/11/20
Contributing Corporation:European Language Resources Association
Tag:Grammar; MLP; Metadata
GND Keyword:Deutsch; Genitiv; Korpus <Linguistik>
First Page:988
Last Page:994
DDC classes:400 Language / 430 German
Open Access?:yes
BDSL-Classification:Grammar
Leibniz-Classification:Language, Linguistics
Linguistics-Classification:Grammar research
Licence (German):Protected by copyright