Year of publication
- 2011 (1)
Document Type
- Article (1)
Language
- English (1)
Has Fulltext
- yes (1)
Is part of the Bibliography
- no (1)
Keywords
- Comparable Corpus (1)
- Kontrastive Grammatik (1)
- Korpus <Linguistik> (1)
- Multilingual Corpus (1)
- POS-Tagging (1)
- Wikipedia (1)
- XSLT (1)
To build a comparable Wikipedia corpus of German, French, Italian, Norwegian, Polish and Hungarian for contrastive grammar research, we used a set of XSLT stylesheets to transform the MediaWiki annotations to XML. Furthermore, the data has been annotated with word-class (part-of-speech) information using different taggers. The outcome is a corpus with rich metadata and linguistic annotation that can be used for multilingual research on various linguistic topics.
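
As a rough illustration of what turning MediaWiki markup into XML involves, the following minimal sketch handles a very small subset of the markup (level-2 headings, bold/italic quotes, and internal links). It is written in Python rather than XSLT and is not the stylesheet set described above; the function name `wikitext_to_xml`, the element names `article`, `section` and `p`, and the sample text are all assumptions made for this example.

```python
import re
import xml.etree.ElementTree as ET

# Matches a level-2 MediaWiki heading such as "== Geschichte ==".
HEADING = re.compile(r"^==\s*(.+?)\s*==$")


def wikitext_to_xml(title, lang, wikitext):
    """Convert a tiny subset of MediaWiki markup into an XML tree.

    Hypothetical schema: <article> holds <section> elements, each of
    which holds <p> elements. Only level-2 headings, quote markup and
    [[internal links]] are handled.
    """
    article = ET.Element("article", {"title": title, "lang": lang})
    # Text before the first heading goes into an unnamed lead section.
    section = ET.SubElement(article, "section")
    for line in wikitext.splitlines():
        line = line.strip()
        if not line:
            continue
        match = HEADING.match(line)
        if match:
            # A heading opens a new section carrying the heading text.
            section = ET.SubElement(
                article, "section", {"heading": match.group(1)}
            )
            continue
        # Drop bold/italic quote markup ('' and ''').
        text = re.sub(r"'{2,}", "", line)
        # Unwrap [[target]] and [[target|label]] links, keeping the label.
        text = re.sub(r"\[\[(?:[^|\]]*\|)?([^\]]+)\]\]", r"\1", text)
        ET.SubElement(section, "p").text = text
    return article


# Made-up sample input, not taken from the corpus.
sample = (
    "'''Berlin''' ist die Hauptstadt von [[Deutschland]].\n"
    "\n"
    "== Geschichte ==\n"
    "Erstmals im [[13. Jahrhundert|13. Jh.]] erwähnt."
)
article = wikitext_to_xml("Berlin", "de", sample)
print(ET.tostring(article, encoding="unicode"))
```

Keeping the language code and title as attributes on the root element is one simple way to carry metadata along with the text; a part-of-speech tagger could then be run over the `p` elements and its output stored as further annotation.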