How to connect language resources, infrastructures, and communities
This chapter will present lessons learned from CLARIN-D, the German CLARIN national consortium. Members of the CLARIN-D communities and of the CLARIN-D consortium have been engaged in innovative, data-driven, and community-based research, using language resources and tools in the humanities and neighbouring disciplines. We will present different use cases and users’ stories that demonstrate the innovative research potential of large digital corpora and lexical resources for the study of language change and variation, for language documentation, for literary studies, and for the social sciences. We will emphasize the added value of making language resources and tools available in the CLARIN distributed research infrastructure and will discuss legal and ethical issues that need to be addressed in the use of such an infrastructure. Innovative technical solutions for accessing digital materials still under copyright and for data mining such materials will be presented. We will outline the need for close interaction with communities of interest in the areas of curriculum development, data management, and training the next generation of digital humanities scholars. The importance of community-supported standards for encoding language resources and the practice of community-based quality control for digital research data will be presented as a crucial step toward the provisioning of high-quality research data. The chapter will conclude with a discussion of important directions for innovative research and for supporting infrastructure development over the next decade and beyond.
Author: | Christoph Draxler, Alexander Geyken, Erhard Hinrichs, Annette Klosa-Kückelhaus, Elke Teich, Thorsten Trippel |
---|---|
URN: | urn:nbn:de:bsz:mh39-112872 |
DOI: | https://doi.org/10.1515/9783110767377-011 |
ISBN: | 978-3-11-076737-7 |
ISSN: | 2751-1286 |
Parent Title (English): | CLARIN. The Infrastructure for language resources |
Series (Serial Number): | Digital Linguistics (1) |
Publisher: | de Gruyter |
Place of publication: | Berlin/Boston |
Editor: | Darja Fišer, Andreas Witt |
Document Type: | Part of a Book |
Language: | English |
Year of first Publication: | 2022 |
Date of Publication (online): | 2022/10/17 |
Publishing Institution: | Leibniz-Institut für Deutsche Sprache (IDS) |
Publication state: | Published version |
Review state: | Publisher's editorial review |
Tag: | CLARIN-D; humanities; research infrastructure; use cases; user communities |
GND Keyword: | Digitalisierung; Forschungsinfrastruktur; Literaturwissenschaft; Sozialwissenschaften; Sprachdaten; Sprachvariation; Technische Infrastruktur; Technischer Fortschritt |
First Page: | 275 |
Last Page: | 306 |
DDC classes: | 400 Language / 400 Language, Linguistics |
Open Access?: | yes |
Leibniz-Classification: | Language, Linguistics |
Linguistics-Classification: | Computational linguistics |
Program areas: | L1: Lexikographie und Sprachdokumentation |
Program areas: | S2: Forschungskoordination und –infrastrukturen |
Licence (English): | Creative Commons - Attribution 4.0 International |