"Reproducibility crisis" and "empirical turn" are only two keywords when it comes to providing reasons for research data management. Research data is omnipresent and with the more and more automatic data processing procedures, they become even more important. However, just because new methods require data and produce data, this does not mean that data are easily accessible, reusable or even make a difference in the CV of a researcher, even if a large portion of research goes into data creation, acquisition, preparation, and analysis. In this talk I will present where we find data in the research process, where we may find appropriate support for data management and advocate for a procedure for including it in research publications and resumes.
This presentation relies on work within the BMBF-funded project CLARIN-D. It also builds on work within the German National Research Data Infrastructure (NFDI) consortium Text+, DFG project number 460033370.
The Data Governance Act was proposed in late 2020 as part of the European Strategy for Data, and adopted on 30 May 2022 (as Regulation 2022/868). It will enter into application on 24 September 2023. The Data Governance Act is a major development in the legal framework affecting CLARIN and the whole language community. With its new rules on the re-use of data held by public sector bodies and on the provision of data sharing services, and especially its encouragement of data altruism, the Data Governance Act creates new opportunities and new challenges for CLARIN ERIC. This paper analyses the provisions of the Data Governance Act and aims to initiate the debate on how they will impact CLARIN and the whole language community.
CLARIAH-DE cross-service search - prospects and benefits of merging subject-specific services
(2021)
CLARIAH-DE combines the services and offerings of CLARIN-D and DARIAH-DE. These include various search applications that are made directly available to researchers. This working paper presents these search applications in terms of their main characteristics and compares them with a focus on possible harmonization. Opportunities and risks of different forms of technical integration are highlighted. The challenges identified can be explained in particular by the differing organizational and technical frameworks and by highly specific, discipline-dependent requirements. The paper also discusses the integration work already carried out and the experience gained, with a view to future work and the possible integration of further applications. The experience gained in CLARIAH-DE may be of particular interest to other projects in the field of digital research infrastructures.
Preface
(2022)
CLARIN stands for “Common Language Resources and Technology Infrastructure”. In 2012 CLARIN ERIC was established as a legal entity with the mission to create and maintain a digital infrastructure to support the sharing, use, and sustainability of language data (in written, spoken, or multimodal form) available through repositories from all over Europe, in support of research in the humanities and social sciences and beyond. Since 2016 CLARIN has had the status of Landmark research infrastructure and currently it provides easy and sustainable access to digital language data and also offers advanced tools to discover, explore, exploit, annotate, analyse, or combine such datasets, wherever they are located. This is enabled through a networked federation of centres: language data repositories, service centres, and knowledge centres with single sign-on access for all members of the academic community in all participating countries. In addition, CLARIN offers open access facilities for other interested communities of use, both inside and outside of academia. Tools and data from different centres are interoperable, so that data collections can be combined and tools from different sources can be chained to perform operations at different levels of complexity. The strategic agenda adopted by CLARIN and the activities undertaken are rooted in a strong commitment to the Open Science paradigm and the FAIR data principles. This also enables CLARIN to express its added value for the European Research Area and to act as a key driver of innovation and contributor to the increasing number of industry programmes running on data-driven processes and the digitalization of society at large.
CLARIN, the "Common Language Resources and Technology Infrastructure", has established itself as a major player in the field of research infrastructures for the humanities. This volume provides a comprehensive overview of the organization, its members, its goals and its functioning, as well as of the tools and resources hosted by the infrastructure. The many contributors representing various fields, from computer science to law to psychology, analyse a wide range of topics, such as the technology behind the CLARIN infrastructure, the use of CLARIN resources in diverse research projects, the achievements of selected national CLARIN consortia, and the challenges that CLARIN has faced and will face in the future.
The book will be published in 2022, 10 years after the establishment of CLARIN as a European Research Infrastructure Consortium by the European Commission (Decision 2012/136/EU).
Towards comprehensive definitions of data quality for audiovisual annotated language resources
(2021)
Though digital infrastructures such as CLARIN have been successfully established and now provide large collections of digital resources, the lack of widely accepted standards for data quality and documentation still makes re-use of research data a difficult endeavour, especially for more complex resource types. The article gives a detailed overview of relevant characteristics of audiovisual annotated language resources and reviews possible approaches to data quality in terms of their suitability for the current context. In conclusion, various strategies are suggested in order to arrive at comprehensive and adequate definitions of data quality for this specific resource type and possibly for digital language resources in general.
The article focuses on determining the responsible parties and the division of potential liability arising from sharing language data (LD) containing personal data (PD). A key issue here is to identify who has to ensure GDPR compliance. The authors aim to answer 1) whether an individual researcher is a controller and 2) whether sharing LD results in joint controllership or separate controllership (whether the data's transferee becomes the controller, the joint controller or the processor). The article also analyses the legal relations of the parties involved in data sharing and their potential liability. The final section outlines data sharing in the CLARIN context. The analysis serves as a preliminary analytical background for redesigning the CLARIN contractual framework for sharing data.
Language resources in digital form are available for an ever wider range of individual languages. Linguistically annotated corpora make it possible to search specifically for linguistic patterns at the word, phrase, and sentence level and to evaluate them quantitatively and qualitatively. In this contribution I use selected examples to illustrate the added value that annotated text corpora can offer for linguistic research. Many of the language resources presented are made sustainably available through the CLARIN infrastructure. The corpora can either be searched via search portals or are provided for download.
This technology watch report discusses digital repository solutions in the context of the research infrastructure projects CLARIAH-DE, CLARIN, and DARIAH. It provides an overview of different repository systems, comparing them and discussing their respective applicability from the perspectives of the project partners at the time of writing.