Evaluating and Assuring Research Data Quality for Audiovisual Annotated Language Data
- This paper presents the QUEST project and describes the concepts and tools being developed within its framework. The goal of the project is to establish quality and curation criteria for annotated audiovisual language data. Building on resources previously developed by the participating institutions, QUEST develops tools that facilitate and verify adherence to these criteria. An important focus of the project is making these tools accessible to researchers without a substantial technical background and helping them produce high-quality data. The main tools we intend to provide are the depositors’ questionnaire and automatic quality assurance, both developed as web applications. They are accompanied by a knowledge base, which will contain recommendations and descriptions of best practices established in the course of the project. Conceptually, we split linguistic data into three resource classes (data deposits, collections and corpora). The class of a resource determines how strict the quality assurance it undergoes should be. This division is introduced so that overly strict quality criteria do not prevent researchers from depositing their data.
Author: | Timofey Arkhangelskiy, Hanna Hedeland, Aleksandr Riaposov |
---|---|
URN: | urn:nbn:de:bsz:mh39-100750 |
URL: | https://office.clarin.eu/v/CE-2020-1738-CLARIN2020_ConferenceProceedings.pdf |
Parent Title (English): | Proceedings of CLARIN Annual Conference 2020. 05 – 07 October 2020, Online Edition |
Publisher: | CLARIN |
Place of publication: | Utrecht |
Editor: | Costanza Navarretta, Maria Eskevich |
Document Type: | Conference Proceeding |
Language: | English |
Year of first Publication: | 2020 |
Date of Publication (online): | 2020/09/24 |
Publication state: | Published version |
Review state: | Peer review |
GND Keyword: | Audiovisuelles Material; Datenmanagement; Datenqualität; Forschungsdaten; Korpus <Linguistik> |
First Page: | 131 |
Last Page: | 135 |
DDC classes: | 400 Language / 400 Language, Linguistics |
Open Access?: | yes |
Leibniz-Classification: | Language, Linguistics |
Linguistics-Classification: | Computational linguistics |
Linguistics-Classification: | Corpus linguistics |
Program areas: | P2: Oral Corpora |