Refine
Document Type
- Conference Proceeding (4)
- Part of a Book (1)
Language
- English (5)
Has Fulltext
- yes (5)
Keywords
- Computerlinguistik (3)
- Adverb (2)
- Deutsch (2)
- Adjektiv (1)
- Automatische Sprachanalyse (1)
- Automatische Sprachverarbeitung (1)
- Automatische Worterkennung (1)
- CELEX (1)
- Diminutiv (1)
- Englisch (1)
Publicationstate
Reviewstate
- Peer-Review (3)
Unknown words are a challenge for any NLP task, including sentiment analysis. Here, we evaluate the extent to which sentiment polarity of complex words can be predicted based on their morphological make-up. We do this on German as it has very productive processes of derivation and compounding and many German hapax words, which are likely to bear sentiment, are morphologically complex. We present results of supervised classification experiments on new datasets with morphological parses and polarity annotations.
We present a quantitative approach to disambiguating flat morphological analyses and producing more deeply structured analyses. Based on existing morphological segmentations, possible combinations of resulting word trees for the next level are filtered first by criteria of linguistic plausibility and then by weighting procedures based on the geometric mean. The frequencies for weighting are derived from three different sources (counts of morphs in a lexicon, counts of largest constituents in a lexicon, counts of token frequencies in a corpus) and can be used either to find the best analysis on the level of morphs or on the next higher constituent level. The evaluation shows that for this task corpus-based frequency counts are slightly superior to counts of lexical data.
Scales and Scores. An evaluation of methods to determine the intensity of subjective expressions
(2015)
In this contribution, we present a survey of several methods that have been applied to the ordering of various types of subjective expressions (e.g. good < great), in particular adjectives and adverbs. Some of these methods use linguistic regularities that can be observed in large text corpora while others rely on external grounding in metadata, in particular the star ratings associated with product reviews. We discuss why these methods do not work uniformly across all types of expressions. We also present the first application of some of these methods to the intensity ordering of nouns (e.g. moron < dummy).
In recent years, theoretical and computational linguistics has paid much attention to linguistic items that form scales. In NLP, much research has focused on ordering adjectives by intensity (tiny < small). Here, we address the task of automatically ordering English adverbs by their intensifying or diminishing effect on adjectives (e.g. extremely small < very small). We experiment with 4 different methods: 1) using the association strength between adverbs and adjectives; 2) exploiting scalar patterns (such as not only X but Y); 3) using the metadata of product reviews; 4) clustering. The method that performs best is based on the use of metadata and ranks adverbs by their scaling factor relative to unmodified adjectives.
German is a language with complex morphological processes. Its long and often ambiguous word forms present a bottleneck problem in natural language processing. As a step towards morphological analyses of high quality, this paper introduces a morphological treebank for German. It is derived from the linguistic database CELEX which is a standard resource for German morphology. We build on its refurbished, modernized and partially revised version. The derivation of the morphological trees is not trivial, especially for such cases of conversions which are morpho-semantically opaque and merely of diachronic interest. We develop solutions and present exemplary analyses. The resulting database comprises about 40,000 morphological trees of a German base vocabulary whose format and grade of detail can be chosen according to the requirements of the applications. The Perl scripts for the generation of the treebank are publicly available on github. In our discussion, we show some future directions for morphological treebanks. In particular, we aim at the combination with other reliable lexical resources such as GermaNet.