Automatic classification of Russian texts for didactic purposes

  • In this paper we present the results of an automatic classification of Russian texts into three levels of difficulty. Our aim is to build a study corpus of Russian in which an L2 student can select texts of a desired complexity. We build on a pilot study in which we classified Russian texts into two levels of difficulty. In the current paper, we apply the classification to an extended corpus of 577 labelled texts. The best-performing combination of features achieves an accuracy of 0.74 within at most one level difference.
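
The abstract does not define how accuracy "within at most one level difference" is computed. One common reading is adjacent accuracy: a prediction counts as correct if it is no more than one level away from the gold label. The sketch below illustrates that reading only. The function name `accuracy_within`, the integer coding of the levels, and the sample labels are assumptions made for illustration and are not taken from the paper.

```python
def accuracy_within(gold, predicted, tolerance=1):
    """Share of texts whose predicted level is at most `tolerance` levels from the gold level."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    if not gold:
        raise ValueError("need at least one labelled text")
    hits = sum(abs(g - p) <= tolerance for g, p in zip(gold, predicted))
    return hits / len(gold)


# Hypothetical labels, coded 0 (easiest) to 2 (hardest); not data from the paper.
gold = [0, 0, 1, 2, 2]
predicted = [0, 2, 1, 1, 0]

exact = accuracy_within(gold, predicted, tolerance=0)     # only exact matches count
adjacent = accuracy_within(gold, predicted, tolerance=1)  # off-by-one also counts
print(exact, adjacent)
```

With three levels, the tolerant score only penalises confusions between the lowest and the highest level, so it is always at least as high as exact accuracy.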

Author: Dolores Batinić, Sandra Birzer, Heike Zinsmeister
Parent Title (Russian): Trudy meždunarodnoj konferencii „Korpusnaja lingvistika - 2017“. 27-30 ijunja 2017 g., Sankt-Peterburg
Publisher: Izdatel´stvo Sankt-Peterburgskogo gosudarstvennogo universiteta
Place of publication: Sankt-Peterburg
Document Type: Part of a Book
Year of first Publication: 2017
Date of Publication (online): 2017/10/25
Tag: L2 Russian; didactic corpus; text classification; text complexity
GND Keyword: Automatic language analysis; Foreign language learning; Corpus <Linguistics>; Russian
First Page: 9
Last Page: 15
Dewey Decimal Classification: 400 Language / 400 Language, Linguistics
Leibniz-Classification: Language, Linguistics
Open Access?: Yes
Licence (German): German copyright law (UrhG) applies