(Best) Practices for Annotating and Representing CMC and Social Media Corpora in CLARIN-D
The paper reports the results of the curation project ChatCorpus2CLARIN. The goal of the project was to develop a workflow and resources for the integration of an existing chat corpus into the CLARIN-D research infrastructure for language resources and tools in the Humanities and the Social Sciences (http://clarin-d.de). The paper presents an overview of the resources and practices developed in the project, describes the added value of the resource after its integration and discusses, as an outlook, to what extent these practices can be considered best practices which may be useful for the annotation and representation of other CMC and social media corpora.
Author: | Michael Beißwenger, Eric Ehrhardt, Axel Herold, Harald Lüngen, Angelika Storrer |
---|---|
URN: | urn:nbn:de:bsz:mh39-55810 |
URL: | http://nl.ijs.si/janes/wp-content/uploads/2016/09/CMC-conference-proceedings-2016.pdf |
ISBN: | 978-961-237-859-2 |
Parent Title (English): | Proceedings of the 4th Conference on CMC and Social Media Corpora for the Humanities |
Publisher: | Academic Publishing Division of the Faculty of Arts of the University of Ljubljana |
Place of publication: | Ljubljana |
Editor: | Darja Fišer, Michael Beißwenger |
Document Type: | Conference Proceeding |
Language: | English |
Year of first Publication: | 2016 |
Date of Publication (online): | 2016/11/16 |
Publication state: | Published version |
Review state: | Peer review |
Research resource: | http://hdl.handle.net/10932/00-03B0-14FA-A8D0-0F01-F |
Tag: | CMC corpora; TEI encoding; corpus infrastructures; legal issues; tagging |
GND Keyword: | Chatten <Kommunikation>; Deutsch; Korpus <Linguistik>; Text Encoding Initiative (TEI) |
First Page: | 7 |
Last Page: | 11 |
DDC classes: | 400 Language / 400 Language, Linguistics |
Open Access?: | yes |
Leibniz-Classification: | Language, Linguistics |
Linguistics-Classification: | Corpus linguistics |
Licence (English): | Creative Commons - Attribution-ShareAlike 4.0 International |