Year of publication
- 2011 (2)
Document Type
- Doctoral Thesis (2)
Language
- English (2)
Keywords
- Computerlinguistik (1)
- Empirische Linguistik (1)
- Fokus <Linguistik> (1)
- Information Extraction (1)
- Maschinelles Lernen (1)
- Natürliche Sprache (1)
- Pragmatik (1)
- Text Mining (1)
- computational linguistics (1)
- information extraction (1)
Sentiment analysis is the task of extracting and classifying opinionated content in natural language texts. Common subtasks are the distinction between opinionated and factual texts, the classification of polarity in opinionated texts, and the extraction of the participating entities of an opinion(-event), i.e. the source from which an opinion emanates and the target towards which it is directed. With the emergence of Web 2.0, the shift towards a highly user-interactive communication medium, the amount of subjective content on the World Wide Web is steadily increasing. There is thus a growing need to process this type of content automatically, a need that sentiment analysis addresses. Both natural language processing, which provides computational methods for the analysis and representation of natural language, and machine learning, which builds task-specific classification models on the basis of empirical data, may be instrumental in mastering the challenges of automatic sentiment analysis of written text. Many approaches to sentiment analysis rely exclusively on machine learning methods with a fairly low-level feature design, such as bag of words, that contains little linguistic information. In this thesis, we examine the effectiveness of linguistic features in various subtasks of sentiment analysis and therefore draw heavily on insights from natural language processing. Linguistic features can be used with various classification methods: in rule-based classification, where the linguistic features are directly encoded as a classifier; in supervised machine learning, where these features complement basic low-level features; and in bootstrapping methods, where these features form a rule-based classifier that generates a labeled training set on which a supervised classifier can be trained.
In this thesis, we will in particular focus on scenarios where the combination of linguistic features and machine learning methods is effective. We will look at common text classification tasks, both coarse-grained and fine-grained, and extraction tasks.
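The bootstrapping setup described above can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the polarity lexicon, the negation rule, the example sentences, and all function names are hypothetical, and a simple multinomial Naive Bayes model over bag-of-words counts stands in for the supervised classifier. The rule-based classifier labels only the texts for which its rules fire, and the supervised model is then trained on that automatically labeled set.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy polarity lexicon and negation list (assumptions, not from the thesis).
POSITIVE = {"good", "great", "excellent"}
NEGATIVE = {"bad", "poor", "terrible"}
NEGATORS = {"not", "never", "no"}


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately crude tokenizer)."""
    return text.lower().split()


def rule_label(text):
    """Rule-based classifier: lexicon lookup with negation flipping.

    Returns 'pos', 'neg', or None (abstain) when the rules give no signal.
    """
    tokens = tokenize(text)
    score = 0
    for i, tok in enumerate(tokens):
        polarity = 1 if tok in POSITIVE else -1 if tok in NEGATIVE else 0
        # A negator directly before a polar word flips its polarity.
        if polarity and i > 0 and tokens[i - 1] in NEGATORS:
            polarity = -polarity
        score += polarity
    if score == 0:
        return None
    return "pos" if score > 0 else "neg"


def train_naive_bayes(labeled):
    """Collect per-class bag-of-words counts from (text, label) pairs."""
    word_counts = defaultdict(Counter)
    class_counts = Counter()
    for text, label in labeled:
        class_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = {w for counts in word_counts.values() for w in counts}
    return word_counts, class_counts, vocab


def predict(model, text):
    """Return the class with the highest add-one-smoothed log probability."""
    word_counts, class_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in class_counts:
        score = math.log(class_counts[label] / total_docs)
        denom = sum(word_counts[label].values()) + len(vocab)
        for tok in tokenize(text):
            if tok in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[label][tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Unlabeled texts; the rule-based classifier abstains on the last one.
unlabeled = [
    "great camera and excellent battery",
    "terrible screen and poor battery",
    "not good at all , the screen is disappointing",
    "the battery is great and reliable",
    "the camera arrived on monday",
]

# Bootstrapping step: keep only the texts the rules could label.
auto_labeled = [(t, rule_label(t)) for t in unlabeled if rule_label(t) is not None]
model = train_naive_bayes(auto_labeled)
```

In this toy setting, `predict(model, "disappointing screen")` can return a label even though neither word is in the lexicon, because the supervised model picks up words that co-occur with lexicon entries in the automatically labeled texts. This is the intended benefit of bootstrapping, shown here only at toy scale.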
The study empirically examines the interpretation of focus accents in German. To this end, a methodology is developed, and it is discussed how experimental investigation can proceed given the current state of focus theory. Methodologically, experiments that directly measure interpretation provide an alternative to the widespread practice of using only empirical preference and production data to investigate the interpretation of stimuli, and it is shown why such an alternative is necessary.
The empirical results show that theories assuming an association of free focus with scalar implicature (exhaustivity) or question–answer congruence must be both extended and restricted as follows: On the one hand, situational factors in interpretation must be taken into account to a greater extent than has been done so far, especially their interaction with ‘physical’ properties of the speech signal (focus marking). On the other hand, a prototypical definition of focus is called for which connects the major concepts of focus on the phonetic-phonological, semantic and information-structural levels and takes their prototypical coincidence to be the basis of focus interpretation and the corresponding intuitions.