TY  - JOUR
U1  - Journal article, scholarly - peer-reviewed (reviewed)
A1  - Bubenhofer, Noah
A1  - Haupt, Stefanie
A1  - Schwinn, Horst
T1  - A comparable Wikipedia corpus: from wiki syntax to POS tagged XML
JF  - [Arbeiten zur Mehrsprachigkeit / B] Arbeiten zur Mehrsprachigkeit = Working papers in multilingualism / Sonderforschungsbereich 538 Mehrsprachigkeit 538, Universität Hamburg
N2  - To build a comparable Wikipedia corpus of German, French, Italian, Norwegian, Polish and Hungarian for contrastive grammar research, we used a set of XSLT stylesheets to transform the MediaWiki annotations to XML. Furthermore, the data has been annotated with word class information using different taggers. The outcome is a corpus with rich metadata and linguistic annotation that can be used for multilingual research in various linguistic topics.
KW  - Corpus
KW  - Wikipedia
KW  - Contrastive Grammar
KW  - Comparable Corpus
KW  - Multilingual Corpus
KW  - POS-Tagging
KW  - XSLT
Y1  - 2011
U6  - https://nbn-resolving.org/urn:nbn:de:bsz:mh39-51897
UN  - https://nbn-resolving.org/urn:nbn:de:bsz:mh39-51897
SN  - 0176-599X
SS  - 0176-599X
IS  - 96
SP  - 141
EP  - 144
PB  - Universität Hamburg
CY  - Hamburg
ER  -