In recent years, text classification in sentiment analysis has mostly focused on two tasks: subjectivity detection, i.e. the distinction between objective and subjective text, and polarity classification, i.e. the distinction between positive and negative subjective text. So far, there has been little work examining the distinction between definite polar subjectivity and indefinite polar subjectivity. While the former comprises utterances that can be categorized as either positive or negative, the latter comprises utterances that cannot be assigned to either category. This paper presents a small set of domain-independent features to detect indefinite polar sentences. The features reflect the linguistic structure underlying these types of utterances. We give evidence for the effectiveness of these features by incorporating them into an unsupervised rule-based classifier for sentence-level analysis and comparing its performance with supervised machine learning classifiers, i.e. Support Vector Machines (SVMs) and a Nearest Neighbor Classifier (kNN). The data used for the experiments are web reviews collected from three different domains.
Traditionally, research on language change has been a post-mortem activity, focused on isolated changes that are complete and often only documented in written texts. In the 1960s the field was advanced considerably by Labovian sociolinguistics and the investigation of “change in progress” adduced through patterns of community-internal linguistic variation correlated with external facts about speakers such as age and class (see Labov 1994 for an overview). However, despite the many benefits of such work on “dynamic synchrony,” we still know relatively little about how language change unfolds over the lifetimes of individual speakers, that is, in real time (cf. Bailey et al. 1991). The logistical challenges of such research are, of course, considerable. Whereas it is straightforward for psycholinguists to observe language development in children over the course of a few years, documenting changes in the verbal behavior of individuals over several decades is by contrast much less feasible. Nevertheless, present theoretical models of language change could be considerably improved by the results of real-time studies.
This paper reports on a research project of the Institut für deutsche Sprache, Mannheim, which aims to observe language change in statu nascendi by focusing on the speaker and on individual changes in his or her speech and attitudes toward language after an interval of about four decades. The project re-interviews speakers of German dialects or colloquial varieties who were recorded in various research projects in the 1950s and 1960s and for whom a tape recording is archived in the Deutsches Spracharchiv. In a now-completed pilot study preceding the research project, an extensive set of methodological instruments was tested in order to elicit meaningful comparative material and language biographies from a few selected speakers. The project design and the analytical categories for the main study are being defined on the basis of this pilot study.
In this paper, we describe MLSA, a publicly available multi-layered reference corpus for German-language sentiment analysis. The construction of the corpus is based on the manual annotation of 270 German-language sentences considering three different layers of granularity. The sentence-layer annotation, as the most coarse-grained annotation, focuses on aspects of objectivity, subjectivity and the overall polarity of the respective sentences. Layer 2 is concerned with polarity on the word- and phrase-level, annotating both subjective and factual language. The annotations on Layer 3 focus on the expression-level, denoting frames of private states such as objective and direct speech events. These three layers and their respective annotations are intended to be fully independent of each other. At the same time, it should also be possible to explore and discover interactions that may exist between different layers. The reliability of the respective annotations was assessed using the average pairwise agreement and Fleiss’ multi-rater measures. We believe that MLSA is a beneficial resource for sentiment analysis research, algorithms and applications that focus on the German language.
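The two reliability measures named in the abstract can be illustrated with a minimal sketch. It assumes every item was labelled by the same number of annotators and uses the standard textbook definitions of average pairwise agreement and Fleiss' kappa. The function names, the label set and the toy ratings are invented for illustration and are not taken from the MLSA corpus.

```python
from collections import Counter
from itertools import combinations


def average_pairwise_agreement(ratings):
    """Mean share of items on which two annotators agree, averaged over all annotator pairs.

    ratings: list of items; each item is a list with one label per annotator.
    """
    n_raters = len(ratings[0])
    scores = []
    for i, j in combinations(range(n_raters), 2):
        agree = sum(1 for item in ratings if item[i] == item[j])
        scores.append(agree / len(ratings))
    return sum(scores) / len(scores)


def fleiss_kappa(ratings):
    """Fleiss' kappa for a fixed number of annotators (at least two) per item."""
    n_items = len(ratings)
    n_raters = len(ratings[0])
    categories = sorted({label for item in ratings for label in item})
    counts = [Counter(item) for item in ratings]

    # Observed agreement: mean of the per-item agreement values P_i.
    p_items = [
        (sum(c[cat] ** 2 for cat in categories) - n_raters)
        / (n_raters * (n_raters - 1))
        for c in counts
    ]
    p_observed = sum(p_items) / n_items

    # Expected agreement from the overall category proportions.
    proportions = [
        sum(c[cat] for c in counts) / (n_items * n_raters) for cat in categories
    ]
    p_expected = sum(p ** 2 for p in proportions)

    if p_expected == 1.0:
        # Only one category was ever used; kappa is undefined.
        # Returning 1.0 here is a convention chosen for this sketch.
        return 1.0
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical toy data: four sentences, three annotators each.
toy_ratings = [
    ["pos", "pos", "pos"],
    ["neg", "neg", "pos"],
    ["obj", "obj", "obj"],
    ["neg", "neg", "neg"],
]
print(average_pairwise_agreement(toy_ratings))
print(fleiss_kappa(toy_ratings))
```

Raw pairwise agreement does not correct for agreement expected by chance, whereas kappa does, which is one reason to report both.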
We use a convolutional neural network to perform authorship identification on a very homogeneous dataset of scientific publications. In order to investigate the effect of domain biases, we obscure words below a certain frequency threshold, retaining only their POS-tags. This procedure improves test performance due to better generalization on unseen data. Using our method, we are able to predict the authors of scientific publications in the same discipline at levels well above chance.
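The masking step described above might look roughly like the following sketch. It assumes the documents are already tokenized and POS-tagged as (token, tag) pairs, and that word frequencies are counted case-insensitively over the supplied documents. The abstract does not state the threshold, the tagset or how frequencies were counted, so the function name `mask_rare_words`, the parameter `min_count` and the toy sentences are all hypothetical.

```python
from collections import Counter


def mask_rare_words(tagged_docs, min_count):
    """Replace every token seen fewer than min_count times with its POS tag.

    tagged_docs: list of documents; each document is a list of (token, pos_tag) pairs.
    Returns a list of documents, each a list of strings (kept tokens or POS tags).
    """
    # Count lower-cased word frequencies over all supplied documents (an assumption).
    freq = Counter(token.lower() for doc in tagged_docs for token, _ in doc)
    return [
        [token if freq[token.lower()] >= min_count else tag for token, tag in doc]
        for doc in tagged_docs
    ]


# Hypothetical toy input with Universal-POS-style tags.
docs = [
    [("the", "DET"), ("model", "NOUN"), ("converges", "VERB")],
    [("the", "DET"), ("corpus", "NOUN"), ("grows", "VERB")],
]
masked = mask_rare_words(docs, min_count=2)
print(masked)
```

In a real experiment the frequency table would usually be built from the training split only and then reused for the test split, so that no information about unseen data leaks into the vocabulary.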