400 Sprache, Linguistik
Refine
Document Type
- Part of a Book (3)
- Conference Proceeding (3)
- Article (2)
- Other (2)
Has Fulltext
- yes (10)
Keywords
- research infrastructure (10) (remove)
Publicationstate
Reviewstate
- Peer-Review (5)
- (Verlags)-Lektorat (3)
Publisher
This poster summarizes the results of the CLARIAH-DE Work Package 5 - Community Engagement: Outreach/Dissemination and Liaison.
Work package 5 engages with the community through dissemination activities, outreach and liaison. The work package set itself the following sub goals:
- Combining the existing dissemination and outreach activities of CLARIN-D and DARIAH-DE in a meaningful way and elaborating on them. In some cases this meant continuity, in other cases a new appearance for resources.
- Providing a web portal as a gateway to the CLARIAH-DE project.
- Creating a common identity and corporate identity and maintaining the established level of trust users already put into CLARIN-D and DARIAH-DE.
- Providing a social media presence as well as a physical presence at workshops, conferences and other meetings in the Digital Humanities.
This poster summarizes the results of the CLARIAH-DE Work Package 3: Skills Training and Promotion of Junior Researchers.
For a research field that is characterised by rapid technical development, CLARIAH-DE has to include the promotion of data literacy necessary for the efficient use of this digital research infrastructure as part of its objective. To develop, consolidate and refine a common programme in this area, work package 3 set itself the following sub goals:
- Consolidation of the activities from the previous projects into a joint service
- Cataloguing and reflecting on the methods and tools used in the research field, with the aim of identifying remaining gaps
- Skills training of, individual support for and the promotion of junior researchers
This chapter will present lessons learned from CLARIN-D, the German CLARIN national consortium. Members of the CLARIN-D communities and of the CLARIN-D consortium have been engaged in innovative, data-driven, and community-based research, using language resources and tools in the humanities and neigh-bouring disciplines. We will present different use cases and users’ stories that demonstrate the innovative research potential of large digital corpora and lexical resources for the study of language change and variation, for language documentation, for literary studies, and for the social sciences. We will emphasize the added value of making language resources and tools available in the CLARIN distributed research infrastructure and will discuss legal and ethical issues that need to be addressed in the use of such an infrastructure. Innovative technical solutions for accessing digital materials still under copyright and for data mining such materials will be presented. We will outline the need for close interaction with communities of interest in the areas of curriculum development, data management, and training the next generation of digital humanities scholars. The importance of community-supported standards for encoding language resources and the practice of community-based quality control for digital research data will be presented as a crucial step toward the provisioning of high quality research data. The chapter will conclude with a discussion of impor-tant directions for innovative research and for supporting infrastructure development over the next decade and beyond.
Preface
(2022)
CLARIN stands for “Common Language Resources and Technology Infrastructure”. In 2012 CLARIN ERIC was established as a legal entity with the mission to create and maintain a digital infrastructure to support the sharing, use, and sustainability of language data (in written, spoken, or multimodal form) available through repositories from all over Europe, in support of research in the humanities and social sciences and beyond. Since 2016 CLARIN has had the status of Landmark research infrastructure and currently it provides easy and sustainable access to digital language data and also offers advanced tools to discover, explore, exploit, annotate, analyse, or combine such datasets, wherever they are located. This is enabled through a networked federation of centres: language data repositories, service centres, and knowledge centres with single sign-on access for all members of the academic community in all participating countries. In addition, CLARIN offers open access facilities for other interested communities of use, both inside and outside of academia. Tools and data from different centres are interoperable, so that data collections can be combined and tools from different sources can be chained to perform operations at different levels of complexity. The strategic agenda adopted by CLARIN and the activities undertaken are rooted in a strong commitment to the Open Science paradigm and the FAIR data principles. This also enables CLARIN to express its added value for the European Research Area and to act as a key driver of innovation and contributor to the increasing number of industry programmes running on data-driven processes and the digitalization of society at large.
Für die sprachbasierte Forschung in den Geistes- und Sozialwissenschaften stellt CLARIN eine Forschungsinfrastruktur bereit, die auf die hochgradig heterogenen Forschungsdaten in diesen Wissenschaftsbereichen angepasst ist. Mit Werkzeugen zum Auffinden, zur standardkonformen Aufbereitung und zur nachhaltigen Aufbewahrung von Daten sowie mit der Bereitstellung von virtuellen Forschungsumgebungen zur kollaborativen Erstellung und Auswertung von Forschungsdaten unterstützt CLARIN alle wesentlichen Aspekte des Datenmanagements und der Datenarchivierung. Diese CLARIN-Angebote werden durch Beratungs- und Schulungsmaßnahmen begleitet.
The paper’s purpose is to give an overview of the work on the Component Metadata Infrastructure (CMDI) that was implemented in the CLARIN research infrastructure. It explains, the underlying schema, the accompanying tools and services. It also describes the status and impact of the CMDI developments done within the CLARIN project and past and future collaborations with other projects.
In this article, we describe a user support solution for the digital humanities. As a case study, we show the development of the CLARIN-D Helpdesk from 2013 into the current support solution that has been extended for several other CLARIN-related software and projects and the DARIAH-ERIC. Furthermore, we describe a way towards a common support platform for CLARIAH-DE, which is currently in the final phase. We hope to further expand the help desk in the following years in order to act as a hub for user support and a central knowledge resource for the digital humanities not only in the German, but also in the European area and perhaps at some point worldwide.
Interoperability in an Infrastructure Enabling Multidisciplinary Research: The case of CLARIN
(2020)
CLARIN is a European Research Infrastructure providing access to language resources and technologies for researchers in the humanities and social sciences. It supports the use and study of language data in general and aims to increase the potential for comparative research of cultural and societal phenomena across the boundaries of languages and disciplines, all in line with the European agenda for Open Science. Data infrastructures such as CLARIN have recently embarked on the emerging frameworks for the federation of infrastructural services, such as the European Open Science Cloud and the integration of services resulting from multidisciplinary collaboration in federated services for the wider domain of the social sciences and humanities (SSH). In this paper we describe the interoperability requirements that arise through the existing ambitions and the emerging frameworks. The interoperability theme will be addressed at several levels, including organisation and ecosystem, design of workflow services, data curation, performance measurement and collaboration. For each level, some concrete outcomes are described.
The DRuKoLA project
(2019)
DRuKoLA, the accompanying project in the making of the Corpus of Romanian Language, is a cooperation between German and Romanian computer scientists, corpus linguists and linguists, aiming at linking reference corpora of European languages under one corpus analysis tool able to manage big data. KorAP, the analysis tool developed at the Leibniz Institute for the German Language (Mannheim), is being tailored for the Romanian language in a first attempt to reunite reference corpora under the EuReCo initiative, detailed in this paper. The paper describes the necessary steps of harmonization within KorAP and the corpus of Romanian language and discusses, as one important goal of this project, criteria and ways to build virtual comparable corpora to be used for contrastive linguistic analyses.