Refine
Document Type
- Conference Proceeding (32)
- Part of a Book (6)
- Book (3)
- Other (3)
- Article (2)
- Working Paper (1)
Has Fulltext
- yes (47)
Keywords
- Forschungsdaten (47) (remove)
Publicationstate
- Veröffentlichungsversion (47) (remove)
Reviewstate
- Peer-Review (47) (remove)
Publisher
- CLARIN (9)
- Zenodo (9)
- Linköping University Electronic Press (8)
- European Language Resources Association (4)
- European Language Resources Association (ELRA) (3)
- Leibniz-Institut für Deutsche Sprache (2)
- Technische Informationsbibliothek (2)
- Association for Computational Linguistics (1)
- Klostermann (1)
- L'Harmattan (1)
Poster des Text+ Partners Leibniz-Institut für Deutsche Sprache Mannheim präsentiert beim Workshop "Wohin damit? Storing and reusing my language data" am 22. Juni 2023 in Mannheim. Das Poster wurde im Kontext der Arbeit des Vereins Nationale Forschungsdateninfrastruktur (NFDI) e.V. verfasst. NFDI wird von der Bundesrepublik Deutschland und den 16 Bundesländern finanziert, und das Konsortium Text+ wird gefördert durch die Deutsche Forschungsgemeinschaft (DFG) – Projektnummer 460033370. Die Autor:innen bedanken sich für die Förderung sowie Unterstützung. Ein Dank geht außerdem an alle Einrichtungen und Akteur:innen, die sich für den Verein und dessen Ziele engagieren.
Implicitly abusive language – What does it actually look like and why are we not getting there?
(2021)
Abusive language detection is an emerging field in natural language processing which has received a large amount of attention recently. Still the success of automatic detection is limited. Particularly, the detection of implicitly abusive language, i.e. abusive language that is not conveyed by abusive words (e.g. dumbass or scum), is not working well. In this position paper, we explain why existing datasets make learning implicit abuse difficult and what needs to be changed in the design of such datasets. Arguing for a divide-and-conquer strategy, we present a list of subtypes of implicitly abusive language and formulate research tasks and questions for future research.
Um eine bessere Erreichbarkeit und Zugänglichkeit zu bestehenden sowie neuen Angeboten von Lehr- und Schulungsmaterialien im Bereich der Digital Humanities zu ermöglichen, sollten diese in einem zentralen Verzeichnis zur Verfügung gestellt werden. Im Rahmen des CLARIAH-DE Projekts wurde – zunächst für die Umsetzung eines Projektmeilensteins – eine Lösung gesucht, die eine übergreifende Suche in frei zugänglichen und nachnutzbaren Lehr- und Schulungsmaterialien zu Forschungsmethoden, Verfahren sowie Werkzeugen im Bereich der Digital Humanities in unterschiedlichen Plattformen und Repositorien bietet.
Data Management is one of the core activities of all CLARIN centres providing data and services for the academia. In PARTHENOS, European initiatives and projects in the area of the humanities and social sciences assembled to compare policies and procedures. One of the areas of interest is data management. The data management landscape shows a lot of proliferation, for which an abstraction level is introduced to help centres, such as CLARIN centres, in the process of providing the best possible services to users with data management needs.
The transfer of research data management from one institution to another infrastructural partner is all but trivial, but can be required,for instance, when an institution faces reorganisation or closure. In a case study, we describe the migration of all research data, identify the challenges we encountered, and discuss how we addressed them. It shows that the moving of research data management to another institution is a feasible, but potentially costly enterprise. Being able to demonstrate the feasibility of research data migration supports the stance of data archives that users can expect high levels of trust and reliability when it comes to data safety and sustainability.
To optimize the sharing and reuse of existing data, many funding organizations now require researchers to specify a management plan for research data. In such a plan, researchers are supposed to describe the entire life cycle of the research data they are going to produce, from data creation to formatting, interpretation, documentation, short-term storage, long-term archiving and data re-use. To support researchers with this task, we built DMPTY, a wizard that guides researchers through the essential aspects of managing data, elicits information from them, and finally, generates a document that can be further edited and linked to the original research proposal.
The Component Metadata Infrastructure (CMDI) in a project on sustainable linguistic resources
(2012)
The sustainable archiving of research data for predefined time spans has become increasingly important to researchers and is stipulated by funding organizations with the obligatory task of being observed by researchers. An important aspect in view of such a sustainable archiving of language resources is the creation of metadata, which can be used for describing, finding and citing resources. In the present paper, these aspects are dealt with from the perspectives of two projects: the German project for Sustainability of Linguistic Data at the University of Tubingen (NaLiDa, cf. http://www.sfs.uni-tuebingen.de/nalida) and the Dutch-Flemish HLT Agency hosted at the Institute for Dutch Lexicology (TST-Centrale, cf.http://www.inl.nl/tst-centrale). Both projects unfold their approaches to the creation of components and profiles using the Component Metadata Infrastructure (CMDI) as underlying metadata schema for resource descriptions, highlighting their experiences as well as advantages and disadvantages in using CMDI.
"Reproducibility crisis" and "empirical turn" are only two keywords when it comes to providing reasons for research data management. Research data is omnipresent and with the more and more automatic data processing procedures, they become even more important. However, just because new methods require data and produce data, this does not mean that data are easily accessible, reusable or even make a difference in the CV of a researcher, even if a large portion of research goes into data creation, acquisition, preparation, and analysis. In this talk I will present where we find data in the research process, where we may find appropriate support for data management and advocate for a procedure for including it in research publications and resumes.
This presentation relies on work within the BMBF-funded project CLARIN-D. It also builds on work within the German National Research Data Infrastructure (NFDI) consortium Text+, DFG project number 460033370.
Song lyrics can be considered as a text genre that has features of both written and spoken discourse, and potentially provides extensive linguistic and cultural information to scientists from various disciplines. However, pop songs play a rather subordinate role in empirical language research so far - most likely due to the absence of scientifically valid and sustainable resources. The present paper introduces a multiply annotated corpus of German lyrics as a publicly available basis for multidisciplinary research. The resource contains three types of data for the investigation and evaluation of quite distinct phenomena: TEI-compliant song lyrics as primary data, linguistically and literary motivated annotations, and extralinguistic metadata. It promotes empirically/statistically grounded analyses of genre-specific features, systemic-structural correlations and tendencies in the texts of contemporary pop music. The corpus has been stratified into thematic and author-specific archives; the paper presents some basic descriptive statistics, as well as the public online frontend with its built-in evaluation forms and live visualisations.
In this article, we describe a user support solution for the digital humanities. As a case study, we show the development of the CLARIN-D Helpdesk from 2013 into the current support solution that has been extended for several other CLARIN-related software and projects and the DARIAH-ERIC. Furthermore, we describe a way towards a common support platform for CLARIAH-DE, which is currently in the final phase. We hope to further expand the help desk in the following years in order to act as a hub for user support and a central knowledge resource for the digital humanities not only in the German, but also in the European area and perhaps at some point worldwide.