Automatic question answering for the linguistic domain – An evaluation of LLM knowledge base extension with RAG

  • We investigate the extent to which Retrieval Augmented Generation improves the quality of Large Language Models’ answers to technical questions in the field of linguistics—a domain known for its broad terminological inventory and theory-dependent use of technical terms. Furthermore, this application is not only about terminological information on language, but also about information on its well-formedness. We present the results of an empirical evaluation of automatically generated answers based on authentic data from a language consulting service, with special emphasis on different question types.

This document is embargoed until:

2025/10/01

Metadata
Author:Christian Lang, Roman Schneider, Ngoc Duyen Tanja Tu
URN:urn:nbn:de:bsz:mh39-128138
DOI:https://doi.org/10.1007/978-3-031-70242-6_16
ISBN:978-3-031-70242-6
ISSN:1611-3349
Parent Title (English):Natural Language Processing and Information Systems. 29th International Conference on Applications of Natural Language to Information Systems, NLDB 2024, Turin, Italy, June 25–27, 2024, Proceedings, Part II
Series (Serial Number):Lecture Notes in Computer Science (14763)
Publisher:Springer
Place of publication:Cham
Editor:Amon Rapp, Luigi Di Caro, Farid Meziane, Vijayan Sugumaran
Document Type:Conference Proceeding
Language:English
Year of first Publication:2024
Date of Publication (online):2024/09/20
Publishing Institution:Leibniz-Institut für Deutsche Sprache (IDS) [secondary publication]
Publicationstate:Secondary publication
Publicationstate:Postprint
Reviewstate:Peer-Review
Tag:domain specificity; large language model; quality evaluation; question answering; retrieval augmented generation
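GND Keyword:Antwort (answer); Automatische Sprachanalyse (automatic language analysis); Computerlinguistik (computational linguistics); Großes Sprachmodell (large language model); Terminologie (terminology)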
First Page:161
Last Page:171
Note:
This version of the contribution has been accepted for publication, after peer review but is not the Version of Record and does not reflect post-acceptance improvements, or any corrections. The Version of Record is available online at https://doi.org/10.1007/978-3-031-70242-6_16. Use of this Accepted Version is subject to the publisher’s Accepted Manuscript terms of use https://www.springernature.com/gp/open-research/policies/accepted-manuscript-terms.
DDC classes:400 Language / 400 Language, Linguistics
Open Access?:yes
Linguistics-Classification:Computational linguistics
Program areas:Grammar
Licence (German):Protected by copyright