TY  - CHAP
U1  - Buchbeitrag
A1  - Gantar, Polona
A1  - Krek, Simon
ED  - Klosa-Kückelhaus, Annette
ED  - Engelberg, Stefan
ED  - Möhrs, Christine
ED  - Storjohann, Petra
T1  - Creating the lexicon of multi-word expressions for Slovene. Methodology and structure
T2  - Dictionaries and Society. Proceedings of the XX EURALEX International Congress, 12-16 July 2022, Mannheim, Germany
N2  - This paper describes a method for automatic identification of sentences in the Gigafida corpus containing multi-word expressions (MWEs) from the list of 5,242 phraseological units, which was developed on the basis of several existing open-access lexical resources for Slovene. The method is based on a definition of MWEs, which includes information on two levels of corpus annotation: syntax (dependency parsing) and morphology (POS tagging), together with some additional statistical parameters. The resulting lexicon contains 12,358 sentences containing MWEs extracted from the corpus. The extracted sentences were analysed from the lexicographic point of view with the aim of establishing canonical forms of MWEs and semantic relations between them in terms of variation, synonymy, and antonymy.
KW  - Lower Sorbian
KW  - historical lexicography
KW  - minority language
KW  - e-lexicography
KW  - lexical information system
KW  - text corpus
KW  - Sorbian institute
KW  - language portal
KW  - Mehrworteinheit
KW  - Sorbisch
KW  - Minderheitensprache
KW  - historische Lexikographie
Y1  - 2022
UN  - https://nbn-resolving.org/urn:nbn:de:bsz:mh39-112270
UR  - https://euralex2022.ids-mannheim.de/wp-content/uploads/2022/07/Proceedings_11.07.2022.pdf
SN  - 978-3-937241-87-6
SB  - 978-3-937241-87-6
U6  - https://doi.org/10.14618/ids-pub-11227
DO  - https://doi.org/10.14618/ids-pub-11227
SP  - 549
EP  - 562
PB  - Ids-Verlag
CY  - Mannheim
ER  -