Towards a multilingual dictionary of discourse markers. Automatic extraction of units from parallel corpus

This paper presents a multilingual dictionary project of discourse markers. During its first stage, consisting of collecting the list of headwords, we used a parallel corpus to automatically extract units from texts written in Spanish, Catalan, English, French and German. We also applied a method to create a taxonomy structure for automatically organising the markers in clusters. As a result, we obtain an extensive, corpus-driven list of headwords. We present a prototype of the microstructure of the dictionary in the form of a standard XML database and describe the procedure to automatically fill in most of its fields (e.g., the type of DM, the equivalents in other languages, etc.), before human intervention.

Metadata
Author: Irene Renau, Rogelio Nazar
URN: urn:nbn:de:bsz:mh39-111830
URL: https://euralex2022.ids-mannheim.de/wp-content/uploads/2022/07/Proceedings_11.07.2022.pdf
DOI: https://doi.org/10.14618/ids-pub-11183
ISBN: 978-3-937241-87-6
Parent Title (English): Dictionaries and Society. Proceedings of the XX EURALEX International Congress, 12-16 July 2022, Mannheim, Germany
Publisher: IDS-Verlag
Place of publication: Mannheim
Editor: Annette Klosa-Kückelhaus, Stefan Engelberg, Christine Möhrs, Petra Storjohann
Document Type: Part of a Book
Language: English
Year of first Publication: 2022
Date of Publication (online): 2022/08/18
Publishing Institution: Leibniz-Institut für Deutsche Sprache (IDS)
Publicationstate: Published version
Reviewstate: Peer-Review
Tag: Computational lexicography; corpus-driven lexicography; discourse markers; multilingual lexicography
GND Keyword: Diskursmarker; Elektronisches Wörterbuch; Korpus <Linguistik>; Lexikographie; Mehrsprachiges Wörterbuch
First Page: 262
Last Page: 272
DDC classes: 400 Language / 420 English
Open Access?: yes
Linguistics-Classification: Lexicography
Conferences, Workshops: Dictionaries and Society. Proceedings of the XX EURALEX International Congress, 12-16 July 2022, Mannheim, Germany
Licence (German): Creative Commons - CC BY-SA - Attribution - ShareAlike 4.0 International