Refine
Document Type
- Conference Proceeding (2) (remove)
Language
- English (2)
Has Fulltext
- yes (2)
Keywords
- Computerlinguistik (2)
- Automatische Sprachverarbeitung (1)
- Deutsch (1)
- Morphemanalyse (1)
- Natürliche Sprache (1)
- Polarität (1)
- Segmentierung (1)
- Sentimentanalyse (1)
- Text Mining (1)
- Worthäufigkeit (1)
Publicationstate
- Veröffentlichungsversion (2) (remove)
Reviewstate
- Peer-Review (2) (remove)
Publisher
We present a quantitative approach to disambiguating flat morphological analyses and producing more deeply structured analyses. Based on existing morphological segmentations, possible combinations of resulting word trees for the next level are filtered first by criteria of linguistic plausibility and then by weighting procedures based on the geometric mean. The frequencies for weighting are derived from three different sources (counts of morphs in a lexicon, counts of largest constituents in a lexicon, counts of token frequencies in a corpus) and can be used either to find the best analysis on the level of morphs or on the next higher constituent level. The evaluation shows that for this task corpus-based frequency counts are slightly superior to counts of lexical data.
Unknown words are a challenge for any NLP task, including sentiment analysis. Here, we evaluate the extent to which sentiment polarity of complex words can be predicted based on their morphological make-up. We do this on German as it has very productive processes of derivation and compounding and many German hapax words, which are likely to bear sentiment, are morphologically complex. We present results of supervised classification experiments on new datasets with morphological parses and polarity annotations.