CLARIN, the "Common Language Resources and Technology Infrastructure", has established itself as a major player in the field of research infrastructures for the humanities. This volume provides a comprehensive overview of the organization, its members, its goals and its functioning, as well as of the tools and resources hosted by the infrastructure. The many contributors representing various fields, from computer science to law to psychology, analyse a wide range of topics, such as the technology behind the CLARIN infrastructure, the use of CLARIN resources in diverse research projects, the achievements of selected national CLARIN consortia, and the challenges that CLARIN has faced and will face in the future.
The book will be published in 2022, 10 years after the establishment of CLARIN as a European Research Infrastructure Consortium by the European Commission (Decision 2012/136/EU).
Beyond Citations: Corpus-based Methods for Detecting the Impact of Research Outcomes on Society
(2020)
This paper proposes, implements and evaluates a novel, corpus-based approach for identifying categories indicative of the impact of research via a deductive (top-down, from theory to data) and an inductive (bottom-up, from data to theory) approach. The resulting categorization schemes differ in substance. Research outcomes are typically assessed by using bibliometric methods, such as citation counts and patterns, or alternative metrics, such as references to research in the media. These methods fall short in two ways: bibliometrics cannot identify the impact of research beyond academia, and altmetrics do not consider text-based impact indicators beyond those that capture attention. We address these limitations by leveraging a mixed-methods approach for eliciting impact categories from experts and project personnel (deductive) and from texts (inductive). Using these categories, we label a corpus of project reports per category schema, and apply supervised machine learning to infer these categories from project reports. The classification results show that we can predict deductively and inductively derived impact categories with 76.39% and 78.81% accuracy (F1-score), respectively. Our approach can complement solutions from bibliometrics and scientometrics for assessing the impact of research and studying the scope and types of advancements transferred from academia to society.
The user interfaces for corpus analysis platforms must provide a high degree of accessibility for ordinary users and at the same time provide the possibility to answer complex research questions. In this paper, we present the design concepts behind the user interface of KorAP, a corpus analysis platform that has evolved into the main gateway to CoRoLa, the Reference Corpus of Contemporary Romanian Language. Based on established principles of user interface design, we show how KorAP addresses the challenge of providing a user-friendly interface for heterogeneous corpus data to a wide range of users with different research questions.
The article shows how the topic of dictionaries can be dealt with in German language teaching and how this subject has the potential to acquaint learners with a descriptive and data-driven perspective on language. The project Denkwerk, realized as a cooperation among the Institute for German Language, the University of Mannheim and two regional secondary schools, fostered the students’ intellectual curiosity and their interest in discovering linguistic details. Using empirical methods like corpus analysis, pupils learned both how to write wiki-based dictionary articles on their own and how to publish them in the Denktionary, the dictionary of the project. Our contribution describes the didactic and organisational framework of the project, its aims and contents, its schedule of events, as well as the structure of dictionary articles in the Denktionary, and the observed advantages of such a wiki-based system.
The actual or anticipated impact of research projects can be documented in scientific publications and project reports. While project reports are available at varying levels of accessibility, they may rarely be used or shared outside of academia. Moreover, the connection between the outcomes of an actual research project and their potential secondary use might not be made explicit in a project report. This paper outlines two methods for classifying and extracting the impact of publicly funded research projects. The first method relies on subject matter experts to identify impact categories and assign them to research projects and, by extension, to their reports, without considering the content of the reports. This process resulted in a classification schema that we describe in this paper. With the second method, which is still work in progress, impact categories are extracted from the actual text data.