Year of publication: 2016
Document Type: Part of a Book
Language: English
Has Fulltext: yes
Is part of the Bibliography: no
Reviewstate: Peer-Review
Publisher: Sprachwissenschaftliches Institut, Ruhr-Universität Bochum
We introduce our pipeline for integrating corpora of computer-mediated communication (CMC) and social media (SM) into the CLARIN-D corpus infrastructure. The pipeline was developed by transforming an existing CMC corpus, the Dortmund Chat Corpus, into a resource conforming to current technical and legal standards. We describe how the resource has been prepared and restructured in terms of TEI encoding, linguistic annotations, and anonymisation. The output is a CLARIN-conformant resource integrated into the CLARIN-D research infrastructure.