TY  - CHAP
U1  - Buchbeitrag
A1  - Lüngen, Harald
A1  - Beißwenger, Michael
A1  - Ehrhardt, Eric
A1  - Herold, Axel
A1  - Storrer, Angelika
ED  - Dipper, Stefanie
ED  - Neubarth, Friedrich
ED  - Zinsmeister, Heike
T1  - Integrating corpora of computer-mediated communication in CLARIN-D: Results from the curation project ChatCorpus2CLARIN
T2  - Proceedings of the 13th Conference on Natural Language Processing (KONVENS)
N2  - We introduce our pipeline to integrate CMC and SM corpora into the CLARIN-D corpus infrastructure. The pipeline was developed by transforming an existing CMC corpus, the Dortmund Chat Corpus, into a resource conforming to current technical and legal standards. We describe how the resource has been prepared and restructured in terms of TEI encoding, linguistic annotations, and anonymisation. The output is a CLARIN-conformant resource integrated in the CLARIN-D research infrastructure.
T3  - Bochumer Linguistische Arbeitsberichte - 16
KW  - Deutsch
KW  - Chatten
KW  - Korpus
KW  - Text Encoding Initiative (TEI)
Y1  - 2016
U6  - https://nbn-resolving.org/urn:nbn:de:bsz:mh39-55743
UN  - https://nbn-resolving.org/urn:nbn:de:bsz:mh39-55743
UR  - https://www.linguistics.ruhr-uni-bochum.de/bla/
SN  - 2190-0949
SS  - 2190-0949
SP  - 156
EP  - 164
PB  - Sprachwissenschaftliches Institut, Ruhr-Universität Bochum
CY  - Bochum
ER  - 