TY  - CHAP
U1  - Book chapter
A1  - Brunner, Annelen
A1  - Engelberg, Stefan
A1  - Jannidis, Fotis
A1  - Tu, Ngoc Duyen Tanja
A1  - Weimer, Lukas
ED  - Calzolari, Nicoletta
ED  - Béchet, Frédéric
ED  - Blache, Philippe
ED  - Choukri, Khalid
ED  - Cieri, Christopher
ED  - Declerck, Thierry
ED  - Goggi, Sara
ED  - Isahara, Hitoshi
ED  - Maegaard, Bente
ED  - Mariani, Joseph
ED  - Mazo, Hélène
ED  - Moreno, Asuncion
ED  - Odijk, Jan
ED  - Piperidis, Stelios
T1  - Corpus REDEWIEDERGABE
T2  - Proceedings of the 12th International Conference on Language Resources and Evaluation (LREC), May 11-16, 2020, Palais du Pharo, Marseille, France
N2  - This article presents the corpus REDEWIEDERGABE, a German-language historical corpus with detailed annotations for speech, thought and writing representation (ST&WR). With approximately 490,000 tokens, it is the largest resource of its kind. It can be used to answer literary and linguistic research questions and serve as training material for machine learning. This paper describes the composition of the corpus and the annotation structure, discusses some methodological decisions and gives basic statistics about the forms of ST&WR found in this corpus.
KW  - corpus
KW  - annotation
KW  - speech thought writing representation
KW  - machine learning
KW  - corpus linguistics
KW  - reported speech
KW  - methodology
Y1  - 2020
U6  - https://nbn-resolving.org/urn:nbn:de:bsz:mh39-98963
UN  - https://nbn-resolving.org/urn:nbn:de:bsz:mh39-98963
UR  - http://www.lrec-conf.org/proceedings/lrec2020/index.html#803
SN  - 979-10-95546-34-4
SB  - 979-10-95546-34-4
N1  - Funded by the Open Access Monograph Fund of the Leibniz Association
SP  - 803
EP  - 812
PB  - European Language Resources Association
CY  - Paris
ER  -