Refine
Document Type
- Part of a Book (1)
- Conference Proceeding (1)
Language
- English (2)
Has Fulltext
- yes (2)
Is part of the Bibliography
- no (2) (remove)
Keywords
- Computerunterstützte Kommunikation (2)
- Korpus <Linguistik> (2)
- CMC Corpora (1)
- Computerlinguistik (1)
- Datenmanagement (1)
- FAIR Index (1)
- Forschungsdaten (1)
- computer-mediated communication (CMC) (1)
- data dissemination (1)
- data provision (1)
Publicationstate
- Veröffentlichungsversion (2) (remove)
Reviewstate
- Peer-Review (2)
Publisher
- CLARIN (1)
- L'Harmattan (1)
The paper presents best practices and results from projects in four countries dedicated to the creation of corpora of computer-mediated communication and social media interactions (CMC). Even though there are still many open issues related to building and annotating corpora of that type, there already exists a range of accessible solutions which have been tested in projects and which may serve as a starting point for a more precise discussion of how future standards for CMC corpora may (and should) be shaped like.
In this article, we examine the current situation of data dissemination and provision for CMC corpora. By that we aim to give a guiding grid for future projects that will improve the transparency and replicability of research results as well as the reusability of the created resources. Based on the FAIR guiding principles for research data management, we evaluate the 20 European CMC corpora listed in the CLARIN CMC Resource family, individuate successful strategies among the existing corpora and establish best practices for future projects. We give an overview of existing approaches to data referencing, dissemination and provision in European CMC corpora, and discuss the methods, formats and strategies used. Furthermore, we discuss the need for community standards and offer recommendations for best practices when creating a new CMC corpus.