Opinion holder extraction is one of the most important tasks in sentiment analysis. We briefly outline the importance of predicates for this task and categorize them by part of speech and by the semantic role they select for the opinion holder. For many languages, no semantic resources exist from which such predicates can be easily extracted. We therefore present alternative corpus-based methods for acquiring such predicates automatically, including the use of prototypical opinion holders, i.e. common nouns such as experts or analysts that denote groups of people whose profession or occupation is to form and express opinions about specific items.
We explore the feasibility of contextual healthiness classification of food items. We present a detailed analysis of the linguistic phenomena that need to be taken into consideration for this task based on a specially annotated corpus extracted from web forum entries. For automatic classification, we compare a supervised classifier and rule-based classification. Beyond linguistically motivated features that include sentiment information we also consider the prior healthiness of food items.
We investigate the task of detecting reliable statements about food-health relationships in natural language texts. For that purpose, we created a specially annotated web corpus from forum entries discussing the healthiness of certain food items. We examine a set of task-specific features, mostly based on linguistic insights, that are instrumental in finding utterances that are commonly perceived as reliable. These features are incorporated in a supervised classifier and compared against standard features that are widely used for various tasks in natural language processing, such as bag of words, part-of-speech and syntactic parse information.
Interested in formally modelling similarity between narratives, we investigate judgements of similarity between narratives in a small corpus of film reviews and book–film comparisons. A main finding is that judgements tend to concern multiple levels of story representation at once. As these texts are pragmatically related to reception contexts, we find many references to reception quality and optimality. We conclude that current formal models of narrative cannot capture the task of naturalistic narrative comparison found in the analysed reviews, and that the development of models containing a more reception-oriented point of view will be necessary.
The understanding of story variation, whether motivated by cultural currents or other factors, is important for applications of formal models of narrative such as story generation or story retrieval. We present the first stage of an experiment to elicit natural narrative variation data suitable for evaluation with respect to story similarity, for qualitative and quantitative analysis of story variation, and for data processing. We also present a few preliminary results from the first stage of the experiment, using Red Riding Hood and Romeo and Juliet as base texts.
Extending the possibilities for collaborative work with TEI/XML through the usage of a wiki system
(2013)
This paper presents and discusses an integrated project-specific working environment for editing TEI/XML-files and linking entities of interest to a dedicated wiki system. This working environment has been specifically tailored to the workflow in our interdisciplinary digital humanities project GeoBib. It addresses some challenges that arose while working with person-related data and geographical references in a growing collection of TEI/XML-files. While our current solution provides some essential benefits, we also discuss several critical issues and challenges that remain.
With the advent of mobile devices, mediatized political discourse became more dynamic. I assume that the microblog Twitter can be considered a medium for spatial coordination during protests. To test this, the case of neo-Nazi demonstrations and counter-protests in the city of Dresden in February 2012 is analysed. The data consist of microposts published during the event. Quantitative analysis of hashtag and retweet frequencies was performed, as well as qualitative speech act pattern analysis and a tempo-spatial discourse analysis on selected subsets of microposts. Results show that verbal georeferencing is a common linguistic practice, and that it constructs space. Empirical analysis indicates a strong relation between communicational online space and physical offline place: protest participants permanently reconfigure spatial context discursively, and thus the contested protest area becomes a temporarily meaningful place.
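The quantitative step mentioned in this abstract, counting hashtag and retweet frequencies, can be illustrated with a minimal sketch. It is not the author's procedure: the regular expressions, the function name `micropost_frequencies`, the "RT @user" retweet convention and the sample posts are all assumptions made for illustration.

```python
import re
from collections import Counter

# Assumed patterns: a hashtag is "#" followed by word characters; a retweet
# is a post that starts with the manual "RT @user" convention.
HASHTAG = re.compile(r"#\w+")
RETWEET = re.compile(r"^RT @\w+")


def micropost_frequencies(posts):
    """Return (hashtag counts, number of retweets) for a list of post texts."""
    hashtags = Counter()
    retweets = 0
    for text in posts:
        # Lower-case tags so that "#Dresden" and "#dresden" are counted together.
        hashtags.update(tag.lower() for tag in HASHTAG.findall(text))
        if RETWEET.match(text):
            retweets += 1
    return hashtags, retweets


# Hypothetical sample posts, not taken from the analysed corpus.
sample_posts = [
    "RT @user1: Blockade at the station #dresden",
    "Heading to the old town #Dresden #protest",
    "no tags here",
]
hashtag_counts, retweet_count = micropost_frequencies(sample_posts)
```

Lower-casing the tags is a design choice that merges spelling variants; a study interested in capitalisation as a stylistic feature would skip that step.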
This paper explores, on the basis of empirical research, how patterns of interaction and argumentation in political discourse on Twitter evolve as translocal communities in the creative shape of “joint digital storytelling”. Joint storytelling embraces coordinated activities by multiple actors focusing on a shared topic. By adding personal information and evaluation, participants construct an open narrative format, which can be inviting and inspiring for others, who then join in with their own narratives. This model is exemplified by analyzing a large number of tweets (107,000) collected during a political conflict between proponents and adversaries of a local traffic project in Germany. The analysis covers (1) the textual level, (2) the operative level (hashtags, @- and RT-symbols, hyperlinks etc.) and (3) the visual level of storytelling (embedded photos, videos). Results show a new way of creating translocal online communities and political deliberation.
This paper contributes to the discussion on best practices for the syntactic analysis of non-canonical language, focusing on Twitter microtext. We present an annotation experiment in which we test an existing POS tagset, the Stuttgart-Tübingen Tagset (STTS), with respect to its applicability for annotating new text from social media, in particular from Twitter microblogs. We discuss different tagset extensions proposed in the literature and test our extended tagset on a set of 506 tweets (7,418 tokens), where we achieve an inter-annotator agreement for two human annotators in the range of 92.7 to 94.4 (k). Our error analysis shows that especially the annotation of Twitter-specific phenomena such as hashtags and at-mentions causes disagreements between the human annotators. Following up on this, we provide a discussion of the different uses of the @- and #-marker in Twitter and argue against analysing both on the POS level by means of an at-mention or hashtag label. Instead, we sketch a syntactic analysis which describes these phenomena by means of syntactic categories and grammatical functions.
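For readers unfamiliar with agreement measures, the sketch below shows how agreement between two annotators could be computed, assuming that "(k)" in the abstract refers to Cohen's kappa. The abstract does not confirm this, and the function name `cohen_kappa`, the tag labels and the toy annotations are hypothetical.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labelled the same tokens."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both annotators must label the same non-empty token sequence")
    n = len(labels_a)
    # Observed agreement: share of tokens on which both annotators chose the same tag.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of each annotator's marginal tag proportions, summed over tags.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[tag] * freq_b[tag] for tag in freq_a) / (n * n)
    if expected == 1.0:
        # Both annotators used one identical tag throughout; kappa is undefined,
        # so this sketch reports perfect agreement by convention.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical toy annotations; "HASH" stands in for a hashtag label.
annotator_1 = ["NN", "VVFIN", "NN", "HASH"]
annotator_2 = ["NN", "VVFIN", "ADJA", "HASH"]
kappa = cohen_kappa(annotator_1, annotator_2)
```

Kappa ranges up to 1, so figures such as 92.7 would correspond to a value scaled by 100 under this assumption.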