We present a testsuite for POS tagging German web data. Our testsuite provides the original raw text as well as the gold tokenisations and is annotated for parts-of-speech. The testsuite includes a new dataset for German tweets, with a current size of 3,940 tokens. To increase the size of the data, we harmonised the annotations in already existing web corpora, based on the Stuttgart-Tübingen Tag Set. The current version of the corpus has an overall size of 48,344 tokens of web data, around half of it from Twitter. We also present experiments, showing how different experimental setups (training set size, additional out-of-domain training data, self-training) influence the accuracy of the taggers. All resources and models will be made publicly available to the research community.
Alleviating pain is good and abandoning hope is bad. We instinctively understand how words like alleviate and abandon affect the polarity of a phrase, inverting or weakening it. When these words are content words, such as verbs, nouns, and adjectives, we refer to them as polarity shifters. Shifters are a frequent occurrence in human language and an important part of successfully modeling negation in sentiment analysis; yet research on negation modeling has focused almost exclusively on a small handful of closed-class negation words, such as not, no, and without. A major reason for this is that shifters are far more lexically diverse than negation words, but no resources exist to help identify them. We seek to remedy this lack of shifter resources by introducing a large lexicon of polarity shifters that covers English verbs, nouns, and adjectives. Creating the lexicon entirely by hand would be prohibitively expensive. Instead, we develop a bootstrapping approach that combines automatic classification with human verification to ensure the high quality of our lexicon while reducing annotation costs by over 70%. Our approach leverages a number of linguistic insights; while some features are based on textual patterns, others use semantic resources or syntactic relatedness. The created lexicon is evaluated both on a polarity shifter gold standard and on a polarity classification task.
For both psychology and linguistics, emotion concepts are a continuing challenge for analysis in several respects. In this contribution, we take up the language of emotion as an object of study from several angles. First, we consider how frame semantic analyses of this domain by the FrameNet project have developed over time, due to theory-internal as well as application-oriented goals, towards ever more fine-grained distinctions and greater within-frame consistency. Second, we examine how FrameNet’s linguistically oriented analysis of lexical items in the emotion domain compares to the analysis by domain experts of the experiences that give rise (directly or indirectly) to the lexical items. Finally, we consider to what extent frame semantic analysis can capture phenomena such as connotation and inference about attitudes, which are important in the field of sentiment analysis and opinion mining, even if they do not involve the direct evocation of emotion.
Journal for language technology and computational linguistics. Special Issue on offensive language
(2020)
Recent years have seen a sharp increase in studies of offensive language (and related notions such as abusive language, hate speech, and verbal aggression) as well as of patterns of online behavior such as cyberbullying and trolling. Multiple efforts have been launched to explore computational approaches and to establish benchmark datasets for various languages (Basile et al., 2019; Wiegand et al., 2018; Zampieri et al., 2019).