In this paper we present the results of an automatic classification of Russian texts into three levels of difficulty. Our aim is to build a study corpus of Russian in which an L2 student can select texts of a desired complexity. We build on a pilot study in which we classified Russian texts into two levels of difficulty. In the current paper, we apply the classification to an extended corpus of 577 labelled texts. The best-performing combination of features achieves an accuracy of 0.74 within at most one level difference.
This paper gives insight into the basic concepts behind a corpus-based lexical resource of spoken German, which is being developed by the project "The Lexicon of Spoken German" (Lexik des gesprochenen Deutsch, LeGeDe) at the "Institute for the German Language" (Institut für Deutsche Sprache, IDS) in Mannheim. The paper focuses on initial ideas for semi-automatic and automatic resources that assist the quantitative analysis of the corpus data for the creation of dictionary content. The work is based on the "Research and Teaching Corpus of Spoken German" (Forschungs- und Lehrkorpus Gesprochenes Deutsch, FOLK).