Korpuslinguistik
Refine
Document Type
- Working Paper (7) (remove)
Has Fulltext
- yes (7)
Keywords
- Korpus <Linguistik> (7)
- Deutsch (2)
- Gesprochene Sprache (2)
- Annotation (1)
- Bedienungsanleitung (1)
- Codierung (1)
- Computerlinguistik (1)
- Digital Humanities (1)
- Forschungsdaten (1)
- Frühneuhochdeutsch (1)
Publicationstate
Reviewstate
The landscape of digital lexical resources is often characterized by dedicated local portals and proprietary interfaces as primary access points for scholars and the interested public. In addition, legal and technical restrictions are potential issues that can make it difficult to efficiently query and use these valuable resources. As part of the research data consortium Text+, solutions for the storage and provision of digital language resources are being developed and provided in the context of the unified cross-domain German research data infrastructure NFDI. The specific topic of accessing lexical resources in a diverse and heterogenous landscape with a variety of participating institutions and established technical solutions is met with the development of the federated search and query framework LexFCS. The LexFCS extends the established CLARIN Federated Content Search that already allows accessing spatially distributed text corpora using a common specification of technical interfaces, data formats, and query languages. This paper describes the current state of development of the LexFCS, gives an insight into its technical details, and provides an outlook on its future development.
This introductory tutorial describes a strictly corpus-driven approach for uncovering indications for aspects of use of lexical items. These aspects include ‘(lexical) meaning’ in a very broad sense and involve different dimensions, they are established in and emerge from respective discourses. Using data-driven mathematical-statistical methods with minimal (linguistic) premises, a word’s usage spectrum is summarized as a collocation profile. Self-organizing methods are applied to visualize the complex similarity structure spanned by these profiles. These visualizations point to the typical aspects of a word’s use, and to the common and distinctive aspects of any two words.