Refine
Year of publication
Document Type
- Conference Proceeding (16)
- Part of a Book (12)
- Article (6)
- Doctoral Thesis (1)
- Report (1)
- Working Paper (1)
Has Fulltext
- yes (37)
Keywords
- Urheberrecht (12)
- Forschungsdaten (11)
- Korpus <Linguistik> (11)
- Recht (10)
- Datenschutz (6)
- Datenschutz-Grundverordnung (6)
- Personenbezogene Daten (6)
- Sprachdaten (6)
- Deutsch (4)
- Digital Humanities (4)
Publicationstate
- Veröffentlichungsversion (26)
- Zweitveröffentlichung (5)
- Postprint (4)
Reviewstate
- Peer-Review (22)
- (Verlags)-Lektorat (6)
- Peer-review (1)
Publisher
- European Language Resources Association (ELRA) (6)
- CLARIN (4)
- Linköping University Electronic Press (3)
- De Gruyter (2)
- European Language Resources Association (2)
- Routledge, Taylor & Francis Group (2)
- Springer (2)
- Technische Informationsbibliothek (2)
- Association Française pour la diffusion du RIDA (1)
- BDÜ, Weiterbildungs- und Fachverlagsgesellschaft mbh (1)
This paper addresses long-term archival for large corpora. Three aspects specific to language resources are focused, namely (1) the removal of resources for legal reasons, (2) versioning of (unchanged) objects in constantly growing resources, especially where objects can be part of multiple releases but also part of different collections, and (3) the conversion of data to new formats for digital preservation. It is motivated why language resources may have to be changed, and why formats may need to be converted. As a solution, the use of an intermediate proxy object called a signpost is suggested. The approach will be exemplified with respect to the corpora of the Leibniz Institute for the German Language in Mannheim, namely the German Reference Corpus (DeReKo) and the Archive for Spoken German (AGD).
CoMParS is a resource under construction in the context of the long-term project German Grammar in European Comparison (GDE) at the IDS Mannheim. The principal goal of GDE is to create a novel contrastive grammar of German against the background of other European languages. Alongside German, which is the central focus, the core languages for comparison are English, French, Hungarian and Polish, representing different typological classes. Unlike traditional contrastive grammars available for German, which usually cover language pairs and are based on formal grammatical categories, the new GDE grammar is developed in the spirit of functionalist typology. This implies that, instead of formal criteria, cognitively motivated functional domains in terms of Givón (1984) are used as tertia comparationis. The purpose of CoMParS is to document the empirical basis of the theoretical assumptions of GDE-V and to illustrate the otherwise rather abstract content of grammar books by as many as possible naturally occurring and adequately presented multilingual examples, including information on their use in specific contexts and registers. These examples come from existing parallel corpora, and our presentation will focus on the legal aspects and consequences of this choice of language data.
The proposed contribution will shed light on current and future challenges on legal and ethical questions in research data infrastructures. The authors of the proposal will present the work of NFDI’s section on Ethical, Legal and Social Aspects (hereinafter: ELSA), whose aim is to facilitate cross-disciplinary cooperation between the NFDI consortia in the relevant areas of management and re-use of research data.
The CLARIN infrastructure as an interoperable language technology platform for SSH and beyond
(2023)
CLARIN is a European Research Infrastructure Consortium developing and providing a federated and interoperable platform to support scientists in the field of the Social Sciences and Humanities in carrying-out language-related research. This contribution provides an overview of the entire infrastructure with a particular focus on tool interoperability, ease of access to research data, tools and services, the importance of sharing knowledge within and across (national) communities, and community building. By taking into account FAIR principles from the very beginning, CLARIN succeeded in becoming a successful example of a research infrastructure that is actively used by its members. The benefits CLARIN members reap from their infrastructure secure a future for their common good that is both sustainable and attractive to partners beyond the original target groups.
Une e-Université est une université qui utilise les nouvelles technologies de l'information et de la communication (NTIC) pour remplir ses missions traditionnelles : la production, la préservation et la transmission du savoir. Ses activités consistent donc à collecter et analyser les données de recherche, à diffuser les écrits scientifiques et à fournir des ressources pédagogiques numériques. Or ces biens immatériels font souvent l'objet de droits de propriété littéraire et artistique, notamment le droit d'auteur et le droit sui generis des producteurs de bases de données. Ceci oblige les e-Universités soit à obtenir des autorisations nécessaires des titulaires des monopoles, soit à avoir recours aux exceptions légales. La recherche et l'enseignement font l'objet d'exceptions légales (cf. art. L. 122-5, 3°, e) du Code de la propriété intellectuelle (CPI) et dans les art. 52a et 53 de la Urheberrechtsgesetz (UrhG)). Toutefois, celles-ci s'avèrent manifestement insuffisantes pour accommoder les activités des e-Universités. Ainsi, les législateurs nationaux ont très récemment introduit de nouvelles exceptions visant plus spécifiquement l'utilisation des NTIC dans la recherche et l'enseignement (art. L. 122-5, 10° et art. L. 342-3, 5° du CPI et les futurs art. 60a-60h de la UrhG). Une réforme en ce sens a également été proposée par la Commission Européenne (art. 3 et 4 de la proposition de la Directive sur le droit d'auteur dans le marche unique numérique). Dans ce contexte, il est souhaitable de mener le débat sur l'introduction d'une norme ouverte (de type fair use) en droit européen. Malgré cette incertitude juridique qui entoure la matière, les e-Universités n'ont pas cessé de remplir leurs missions. En effet, la communauté académique a depuis un certain temps entrepris des efforts d'autorégulation (private ordering). Le concept d'Open Science, inspiré des valeurs traditionnelles de l'éthique scientifique, a donc émergé pour promouvoir le libre partage des données de recherche (Open Research Data), des écrits scientifiques (Open Access) et des ressources pédagogiques (Open Educational Resources). Le savoir est donc perçu comme un commun (commons), dont la préservation et le développement durable sont garantis par des standards acceptés par la communauté académique. Ces standards se traduisent en langage juridique grâce aux licences publiques, telles que les Creative Commons. Ces dernières années les universités, mais aussi les organismes finançant la recherche et même les législateurs nationaux se sont activement engagés dans la promotion des communs du savoir. Ceci s'exprime à travers des "mandats" Open Access et l'instauration d'un nouveau droit de publication secondaire, d'abord en droit allemand (art. 38(4) de la UrhG) et récemment aussi en droit français (art. L. 533-4, I du Code de la recherche).
N-grams are of utmost importance for modern linguistics and language theory. The legal status of n-grams, however, raises many practical questions. Traditionally, text snippets are considered copyrightable if they meet the originality criterion, but no clear indicators as to the minimum length of original snippets exist; moreover, the solutions adopted in some EU Member States (the paper cites German and French law as examples) are considerably different. Furthermore, recent developments in EU law (the CJEU's Pelham decision and the new right of newspaper publishers) also provide interesting arguments in this debate. The proposed paper presents the existing approaches to the legal protection of n-grams and tries to formulate some clear guidelines as to the length of n-grams that can be freely used and shared.
Hosting Providers play an essential role in the development of Internet services such as e-Research Infrastructures. In order to promote the development of such services, legislators on both sides of the Atlantic Ocean introduced “safe harbour” provisions to protect Service Providers (a category which includes Hosting Providers) from legal claims (e.g. of copyright infringement). Relevant provisions can be found in § 512 of the United States Copyright Act and in art. 14 of the Directive 2000/31/EC (and its national implementations). The cornerstone of this framework is the passive role of the Hosting Provider through which he has no knowledge of the content that he hosts. With the arrival of Web 2.0, however, the role of Hosting Providers on the Internet changed; this change has been reflected in court decisions that have reached varying conclusions in the last few years. The purpose of this article is to present the existing framework (including recent case law from the US, Germany and France).