Refine
Year of publication
Document Type
- Conference Proceeding (44)
- Part of a Book (14)
- Article (6)
Language
- English (64)
Has Fulltext
- yes (64)
Keywords
- Automatische Sprachanalyse (17)
- Deutsch (16)
- Annotation (11)
- Korpus <Linguistik> (11)
- Semantische Analyse (11)
- Computerlinguistik (10)
- Beleidigung (9)
- Frame-Semantik (8)
- Natürliche Sprache (8)
- sentiment analysis (8)
Publicationstate
- Veröffentlichungsversion (54)
- Zweitveröffentlichung (8)
- Postprint (2)
Reviewstate
- Peer-Review (64) (remove)
Publisher
- Association for Computational Linguistics (10)
- European Language Resources Association (9)
- The Association for Computational Linguistics (7)
- German Society for Computational Linguistics & Language Technology und Friedrich-Alexander-Universität Erlangen-Nürnberg (4)
- European Language Resources Association (ELRA) (3)
- European language resources association (ELRA) (3)
- Springer (2)
- Asian Federation of Natural Language Processing (1)
- Austrian Academy of Sciences (1)
- Austrian academy of sciences (1)
Active learning has been applied to different NLP tasks, with the aim of limiting the amount of time and cost for human annotation. Most studies on active learning have only simulated the annotation scenario, using prelabelled gold standard data. We present the first active learning experiment for Word Sense Disambiguation with human annotators in a realistic environment, using fine-grained sense distinctions, and investigate whether AL can reduce annotation cost and boost classifier performance when applied to a real-world task.
A polarity-sensitive item (PSI), as traditionally defined, is an expression that is restricted to either an affirmative or negative context. PSIs like ‘lift a finger’ and ‘all the time in the world’ sub-serve discourse routines like understatement and emphasis. Lexical–semantic classes are increasingly invoked in descriptions of the properties of PSIs. Here, we use English corpus data and the tools of Frame Semantics (Fillmore, 1982, 1985) to explore Israel’s (2011) observation that the semantic role of a PSI determines how the expression fits into a contextually constructed scalar model. We focus on a class of exceptions implied by Israel’s model: cases in which a given PSI displays two countervailing patterns of polarity sensitivity, with attendant differences in scalar entailments. We offer a set of case studies of polaritysensitive expressions – including verbs of attraction and aversion like ‘can live without’, monetary units like ‘a red cent’, comparative adjectives and time-span adverbials – that demonstrate that the interpretation of a given PSI in a given polar context is based on multiple factors. These factors include the speaker’s perspective on and affective stance towards the described event, available inferences about causality and, perhaps most critically, particulars of the predication, including the verb or adjective’s frame membership, the presence or absence of an ability modal like can, the grammatical construction used and the range of contingencies evoked by the utterance.
We present a quantitative approach to disambiguating flat morphological analyses and producing more deeply structured analyses. Based on existing morphological segmentations, possible combinations of resulting word trees for the next level are filtered first by criteria of linguistic plausibility and then by weighting procedures based on the geometric mean. The frequencies for weighting are derived from three different sources (counts of morphs in a lexicon, counts of largest constituents in a lexicon, counts of token frequencies in a corpus) and can be used either to find the best analysis on the level of morphs or on the next higher constituent level. The evaluation shows that for this task corpus-based frequency counts are slightly superior to counts of lexical data.
Accurate opinion mining requires the exact identification of the source and target of an opinion. To evaluate diverse tools, the research community relies on the existence of a gold standard corpus covering this need. Since such a corpus is currently not available for German, the Interest Group on German Sentiment Analysis decided to create such a resource and make it available to the research community in the context of a shared task. In this paper, we describe the selection of textual sources, development of annotation guidelines, and first evaluation results in the creation of a gold standard corpus for the German language.