
A word embedding approach to onomasiological search in multilingual loanword lexicography

In this paper we present an experimental semantic search function, based on word embeddings, for an integrated online information system on German lexical borrowings into other languages, the Lehnwortportal Deutsch (LWPD). The LWPD synthesizes an increasing number of lexicographical resources and provides basic cross-resource search options. Onomasiological access to the lexical units of the portal is a highly desirable feature for many research questions, such as the likelihood of borrowing lexical units with a given meaning (Haspelmath & Tadmor, 2009; Zeller, 2015). The search technology is based on multilingual pre-trained word embeddings, and individual word senses in the portal are associated with word vectors. Users may select one or more among a very large number of search terms, and the database returns lexical items with word sense vectors similar to these terms. We give a preliminary assessment of the feasibility, usability and efficacy of our approach, in particular in comparison to search options based on semantic domains or fields.
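The retrieval step described in the abstract (word senses carry vectors, the user picks search terms, and the most similar senses are returned) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the three-dimensional toy vectors, the names `TERM_VECTORS`, `SENSE_VECTORS` and `search`, the use of cosine similarity, and the choice to combine several search terms by averaging their vectors. The abstract does not specify how the LWPD implements any of these.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def mean_vector(vectors):
    """Component-wise mean; an assumed way to combine several search terms."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def search(term_vectors, sense_vectors, query_terms, top_k=3):
    """Return the top_k (sense_id, similarity) pairs for the selected terms."""
    query = mean_vector([term_vectors[t] for t in query_terms])
    scored = [(sense, cosine(query, vec)) for sense, vec in sense_vectors.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]

# Hypothetical toy data: real pre-trained embeddings have hundreds of dimensions.
TERM_VECTORS = {
    "food": [0.9, 0.1, 0.0],
    "drink": [0.8, 0.2, 0.1],
    "tool": [0.0, 0.1, 0.9],
}
# Hypothetical sense identifiers, not actual portal entries.
SENSE_VECTORS = {
    "sense_bread": [0.85, 0.15, 0.05],
    "sense_hammer": [0.05, 0.1, 0.95],
    "sense_beer": [0.75, 0.3, 0.1],
}

results = search(TERM_VECTORS, SENSE_VECTORS, ["food", "drink"], top_k=2)
for sense, score in results:
    print(f"{sense}: {score:.3f}")
```

With a real vocabulary, a linear scan like this would likely be replaced by an approximate nearest-neighbour index, but the ranking principle stays the same.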

Metadata
Author: Peter Meyer, Ngoc Duyen Tanja Tu
URN: urn:nbn:de:bsz:mh39-106840
URL: https://elex.link/elex2021/wp-content/uploads/eLex_2021-proceedings_compressed.pdf
ISSN: 2533-5626
Parent Title (English): Electronic lexicography in the 21st century: post-editing lexicography. Proceedings of the eLex 2021 conference. 5–7 July 2021, virtual.
Publisher: Lexical Computing CZ s.r.o.
Place of publication: Brno
Editor: Iztok Kosem, Michal Cukr, Miloš Jakubíček, Jelena Kallas, Simon Krek, Carole Tiberius
Document Type: Conference Proceeding
Language: English
Year of first Publication: 2021
Date of Publication (online): 2021/09/23
Publishing Institution: Leibniz-Institut für Deutsche Sprache (IDS)
Publication state: Published version
Review state: Peer-reviewed
Tag: Lehnwortportal Deutsch (LWPD); lexical borrowings; multilingual lexicography; onomasiological search; word embeddings
GND Keyword: Computer-assisted lexicography; Database; Loanword; Lexicography; Multilingualism; Onomasiology; Semantics
First Page: 78
Last Page: 91
DDC classes: 400 Language / 400 Language, Linguistics
Open Access?: yes
BDSL-Classification: Lexicography, Dictionaries
Leibniz-Classification: Language, Linguistics
Linguistics-Classification: Lexicography
Program areas: L1: Lexicography and Language Documentation
Program areas: L3: Lexical Studies, Empirical and Digital
Licence (English): Creative Commons - Attribution-ShareAlike 4.0 International