Detecting impact relevant sections in scientific research

  • Impact assessment is an evolving area of research that aims at measuring and predicting the potential effects of projects or programs on a variety of stakeholders. While measuring the impact of scientific research is a vibrant subdomain of impact assessment, a recurring obstacle in this specific area is the lack of an efficient framework that facilitates labeling and analysis of lengthy reports. To address this issue, we propose, implement, and evaluate a framework for automatically assessing the impact of scientific research projects by identifying pertinent sections in research reports that indicate potential impact. We leverage a mixed-method approach that combines manual annotation with supervised machine learning to extract these passages from project reports. We experiment with different machine learning algorithms, including traditional statistical models as well as pre-trained transformer language models. Our results show that our proposed method achieves accuracy scores up to 0.81, and that our method is generalizable to scientific research from different domains and different languages.

Author:Maria Becker, Kanyao Han, Antonina Werthmann, Rezvaneh Rezapour, Haejin Lee, Jana Diesner, Andreas Witt
Parent Title (English):The 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
Publisher:European Language Resources Association (ELRA)
Place of publication:Paris
Editor:Nicoletta Calzolari, Min-Yen Kan, Veronique Hoste, Alessandro Lenci, Sakriani Sakti, Nianwen Xue
Document Type:Conference Proceeding
Year of first Publication:2024
Date of Publication (online):2024/06/25
Publishing Institution:Leibniz-Institut für Deutsche Sprache (IDS)
Tag:annotation; impact detection; machine-learning; mixed-methods; project reports
GND Keyword:Annotation; Forschung; Infrastruktur; Sprachdaten; Wirkungsanalyse
First Page:4744
Last Page:4749
DDC classes:400 Language / 400 Language, Linguistics
Open Access?:yes
Leibniz-Classification:Language, Linguistics
Program areas:Digital Linguistics
Licence (German):Creative Commons - CC BY-NC - Attribution - NonCommercial 4.0 International