Automatic classification of Russian texts for didactic purposes
- In this paper we present the results of an automatic classification of Russian texts into three levels of difficulty. Our aim is to build a study corpus of Russian in which an L2 student can select texts of a desired complexity. We build on a pilot study in which we classified Russian texts into two levels of difficulty. In the current paper, we apply the classification to an extended corpus of 577 labelled texts. The best-performing combination of features achieves an accuracy of 0.74 within at most one level difference.
Author: | Dolores Batinić, Sandra Birzer, Heike Zinsmeister |
---|---|
URN: | urn:nbn:de:bsz:mh39-66003 |
ISSN: | 2412-9623 |
Parent Title (Russian): | Trudy meždunarodnoj konferencii „Korpusnaja lingvistika - 2017“. 27-30 ijunja 2017 g., Sankt-Peterburg |
Publisher: | Izdatel´stvo Sankt-Peterburgskogo gosudarstvennogo universiteta |
Place of publication: | Sankt-Peterburg |
Document Type: | Part of a Book |
Language: | English |
Year of first Publication: | 2017 |
Date of Publication (online): | 2017/10/25 |
Tag: | L2 Russian; didactic corpus; text classification; text complexity |
GND Keyword: | Automatische Sprachanalyse; Fremdsprachenlernen; Korpus <Linguistik>; Russisch |
First Page: | 9 |
Last Page: | 15 |
DDC classes: | 400 Language / 400 Language, Linguistics |
Open Access?: | yes |
Leibniz-Classification: | Language, Linguistics |
Linguistics-Classification: | Corpus linguistics |
Program areas: | Lexik |
Licence (German): | Protected by copyright |