400 Language, Linguistics
Compounding as an element of nominal integration suits the language type of German. The technique is used differently across text types and is functionally differentiated. Two-part compounds shape everyday vocabulary. Familiarity with them, together with their formal openness, forms the basis for specific extensions of their use. This is shown first in the opening up of the patterns in literary texts, then in the interaction of compound types aimed at maximal explicitness in legal texts, and finally in the mixture of everyday classification in common compounds and text-functional condensation in a non-fiction text.
KoMuX, the Kompositamuster-Explorer (www.owid.de/plus/komux), is a web application for searching more than 50,000 German nominal compounds by abstract or lexically partially specified patterns. Various visualizations help users grasp structures and relationships within the result set.
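To illustrate what searching by a lexically partially specified pattern can mean, here is a minimal sketch. It is not KoMuX's implementation or data model: the `(modifier, head)` representation, the `match_pattern` function, and the sample words are all assumptions made for this example.

```python
from typing import List, Optional, Tuple

# Hypothetical simplified analysis: each compound is a (modifier, head) pair.
Compound = Tuple[str, str]

# Illustrative sample only, not data taken from KoMuX.
SAMPLE: List[Compound] = [
    ("Haus", "Tür"),
    ("Auto", "Tür"),
    ("Haus", "Dach"),
]


def match_pattern(
    compounds: List[Compound],
    modifier: Optional[str] = None,
    head: Optional[str] = None,
) -> List[Compound]:
    """Return compounds matching a pattern; None leaves a slot unspecified."""
    return [
        (m, h)
        for (m, h) in compounds
        if (modifier is None or m == modifier) and (head is None or h == head)
    ]


# Pattern "X + Tür": the head is fixed, the modifier slot is open.
door_compounds = match_pattern(SAMPLE, head="Tür")
```

Leaving both slots as `None` corresponds to a fully abstract pattern and returns every compound in the list.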
The automatic recognition of idioms poses a challenging problem for NLP applications. Whereas native speakers can intuitively handle multiword expressions whose compositional meanings are hard to trace back to individual word semantics, there is still ample scope for improvement regarding computational approaches. We assume that idiomatic constructions can be characterized by gradual intensities of semantic non-compositionality, formal fixedness, and unusual usage context, and introduce a number of measures for these characteristics, comprising count-based and predictive collocation measures together with measures of context (dis)similarity. We evaluate our approach on a manually labelled gold standard, derived from a corpus of German pop lyrics. To this end, we apply a Random Forest classifier to analyze the individual contribution of features for automatically detecting idioms, and study the trade-off between recall and precision. Finally, we evaluate the classifier on an independent dataset of idioms extracted from a list of Wikipedia idioms, achieving state-of-the-art accuracy.
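As a rough illustration of a count-based collocation measure of the kind mentioned in this abstract, the sketch below computes pointwise mutual information (PMI) for adjacent word pairs. PMI is a common choice, but the abstract does not say which measures the authors use, so the measure, the function name `pmi_scores`, and the toy token list are assumptions.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple


def pmi_scores(tokens: List[str]) -> Dict[Tuple[str, str], float]:
    """Compute PMI (log base 2) for every adjacent word pair in tokens."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_uni = len(tokens)
    n_bi = max(len(tokens) - 1, 1)  # avoid division by zero on tiny inputs
    scores = {}
    for (w1, w2), count in bigrams.items():
        p_pair = count / n_bi
        p_w1 = unigrams[w1] / n_uni
        p_w2 = unigrams[w2] / n_uni
        # PMI = log2( P(w1, w2) / (P(w1) * P(w2)) )
        scores[(w1, w2)] = math.log2(p_pair / (p_w1 * p_w2))
    return scores


# Toy input, not taken from the pop-lyrics corpus described above.
toy_scores = pmi_scores("a b a b c d".split())
```

In a pipeline like the one the abstract describes, a score of this kind would be one feature among several passed to a classifier. Pairs of words that rarely occur apart receive higher PMI, which is why it can serve as a proxy for formal fixedness.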
Both compounds and multi-word expressions are complex lexical units, made up of at least two constituents. The most basic difference is that the former are morphological objects and the latter result from syntactic processes. However, the exact demarcation between compounds and multi-word expressions differs greatly from language to language and is often a matter of debate in and across languages. Similarly debated is whether and how these two different kinds of units complement or compete with each other.
The volume presents an overview of compounds and multi-word expressions in a variety of European languages. Central questions that are discussed for each language concern the formal distinction between compounds and multi-word expressions, their formation and their status in lexicon and grammar.
The volume contains chapters on German, English, Dutch, French, Italian, Spanish, Greek, Russian, Polish, Finnish, and Hungarian as well as a contrastive overview with a focus on German. It brings together insights from word-formation theory, phraseology and theory of grammar and aims to contribute to the understanding of the lexicon, both from a language-specific and cross-linguistic perspective.