TY  - CHAP
U1  - Book chapter
A1  - Brunner, Annelen
A1  - Engelberg, Stefan
A1  - Jannidis, Fotis
A1  - Tu, Ngoc Duyen Tanja
A1  - Weimer, Lukas
ED  - Calzolari, Nicoletta
ED  - Béchet, Frédéric
ED  - Blache, Philippe
ED  - Choukri, Khalid
ED  - Cieri, Christopher
ED  - Declerck, Thierry
ED  - Goggi, Sara
ED  - Isahara, Hitoshi
ED  - Maegaard, Bente
ED  - Mariani, Joseph
ED  - Mazo, Hélène
ED  - Moreno, Asuncion
ED  - Odijk, Jan
ED  - Piperidis, Stelios
T1  - Corpus REDEWIEDERGABE
T2  - Proceedings of the 12th International Conference on Language Resources and Evaluation (LREC), May 11-16, 2020, Palais du Pharo, Marseille, France
N2  - This article presents the corpus REDEWIEDERGABE, a German-language historical corpus with detailed annotations for speech, thought and writing representation (ST&WR). With approximately 490,000 tokens, it is the largest resource of its kind. It can be used to answer literary and linguistic research questions and serve as training material for machine learning. This paper describes the composition of the corpus and the annotation structure, discusses some methodological decisions and gives basic statistics about the forms of ST&WR found in this corpus.
KW  - corpus
KW  - annotation
KW  - speech thought writing representation
KW  - machine learning
KW  - Annotation
KW  - Korpus
KW  - Maschinelles Lernen
KW  - Redeerwähnung
KW  - Methodik
Y1  - 2020
U6  - https://nbn-resolving.org/urn:nbn:de:bsz:mh39-98963
UN  - https://nbn-resolving.org/urn:nbn:de:bsz:mh39-98963
UR  - http://www.lrec-conf.org/proceedings/lrec2020/index.html#803
SN  - 979-10-95546-34-4
SB  - 979-10-95546-34-4
N1  - Funded by the Open Access Monograph Fund of the Leibniz Association
SP  - 803
EP  - 812
PB  - European Language Resources Association
CY  - Paris
ER  - 