Managing Access to Language Resources in a Corpus Analysis Platform

Corpus query tools are crucial to CLARIN’s mission of facilitating the sharing and use of language data for research. Managing user access rights is a major challenge for online corpus platforms, because large corpora come with complex licenses and heterogeneous restrictions on access methods and purposes. This paper presents an approach that maximizes user access to corpus data while protecting rights holders’ legitimate interests. Query rewriting techniques and authorization procedures make it possible to model license terms in detail, which enables a broader range of applications. This offers an alternative to methods that model only the greatest common denominator of the licenses and thereby limit how the data can be used. Our approach constitutes a flexible and extensible component for corpus license and user rights management that is applicable to other language research environments.

Metadata
Author:Eliza Margaretha Illig, Nils Diewald, Paweł Kamocki, Marc Kupietz
URN:urn:nbn:de:bsz:mh39-134105
URL:https://lirias.kuleuven.be/4254504&lang=en
ISBN:978-91-8075-740-9
ISSN:1650-3740
Parent Title (English):Proceedings of: Selected papers from the CLARIN Annual Conference 2024. Barcelona, Spain, 15–17 October 2024 (= Linköping Electronic Conference Proceedings 216).
Series (Serial Number):Linköping Electronic Conference Proceedings (216)
Publisher:Linköping University Electronic Press
Place of publication:Linköping
Editor:Thalassia Kontino, Vincent Vandeghinste
Document Type:Part of a Book
Language:English
Year of first Publication:2025
Date of Publication (online):2025/08/27
Publishing Institution:Leibniz-Institut für Deutsche Sprache (IDS)
Publicationstate:Published version
Reviewstate:Peer-Review
Tag:Authorization procedures; CLARIN; Query rewriting techniques; Corpus Analysis; Corpus query tools; Language data
First Page:101
Last Page:112
DDC classes:400 Language / 400 Language, Linguistics
Open Access?:yes
Linguistics-Classification:Corpus linguistics
Program areas:Digital Linguistics
Licence (English):Creative Commons - Attribution 4.0 International