In this paper we present an approach to faceted search in large language resource repositories. This kind of search, which lets users browse the repository by choosing their own sequence of facets, relies heavily on the availability of descriptive metadata for the objects in the repository. The approach therefore informs the collection of a minimal set of metadata for language resources. The work described in this paper has been funded by the EC within the ESFRI infrastructure project CLARIN.
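As a rough illustration of why faceted search depends on descriptive metadata, the following minimal sketch filters a handful of records by a user-chosen sequence of facets and then counts the remaining values of another facet. The records, the facet names (`language`, `type`, `modality`) and the helper functions are hypothetical and are not taken from the paper or from CLARIN.

```python
from collections import Counter

# Hypothetical metadata records; field names and values are illustrative only.
RESOURCES = [
    {"id": "r1", "language": "German", "type": "corpus", "modality": "written"},
    {"id": "r2", "language": "German", "type": "lexicon", "modality": "written"},
    {"id": "r3", "language": "Italian", "type": "corpus", "modality": "spoken"},
    {"id": "r4", "language": "English", "type": "corpus", "modality": "written"},
]


def refine(records, selections):
    """Keep only records matching every (facet, value) pair chosen so far."""
    return [r for r in records if all(r.get(f) == v for f, v in selections)]


def facet_counts(records, facet):
    """Count how many records carry each value of the given facet."""
    return Counter(r[facet] for r in records if facet in r)


# A user picks facets in whatever order they like; each choice narrows the set.
selections = [("type", "corpus"), ("modality", "written")]
hits = refine(RESOURCES, selections)
print([r["id"] for r in hits])
print(facet_counts(hits, "language"))
```

A record that lacks a facet value can never match a selection on that facet, which is one way to see why a minimal, consistently filled metadata set matters for this kind of browsing.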
The authors describe two data sets submitted to the database of MWE evaluation resources: (1) cranberry expressions in German and (2) cranberry expressions in English. The first package contains a collection of 444 cranberry words in German (CWde.txt) and a collection of the corresponding cranberry expressions (CCde.txt). The second package consists of a collection of 77 cranberry words in English (CWen.txt) and a collection of the corresponding cranberry expressions (CCen.txt). The data included in these packages was extracted from the Collection of Distributionally Idiosyncratic Items (CoDII), an electronic linguistic resource of lexical items with idiosyncratic occurrence patterns. Each package contains a readme file and can be downloaded from multiword.wiki.sourceforge.net/Resources.