The creoleness of Unserdeutsch (Rabaul Creole German): A typological perspective (2017)

In this paper, we discuss to what extent the German-based contact language Unserdeutsch (Rabaul Creole German, cf. Volker 1982) matches the category ‘creole language’ from both a socio-historical and structural perspective. As a point of reference, we will use typological criteria that are widely supposed to be typical for creole languages. It is shown that Unserdeutsch fits fairly well into the pattern of an ‘average creole’, as has been suggested by data in the Atlas of Pidgin and Creole Language Structures (Michaelis et al. 2013). This is despite a series of atypical conditions in its development that might lead us to expect a close structural proximity to the lexifier language, i.e. a relatively acrolectal creole. A possible explanation for this striking discrepancy can be found in the primary function of Unserdeutsch as a marker of identity as well as in the linguistic structure of its substrate language Tok Pisin.

Documenting Unserdeutsch (Rabaul Creole German): A workshop report (2017)

Götze, Angelika ; Lindenfelser, Siegwalt ; Lipfert, Salome ; Neumeier, Katharina ; König, Werner ; Maitz, Péter

This paper provides insights into the ongoing international research project Unserdeutsch (Rabaul Creole German): Documentation of a highly endangered creole language in Papua New Guinea, based at the University of Augsburg, Germany. It elaborates on the different stages of the project, ranging from fieldwork to corpus development, thereby outlining the methods and software background used for the intended purposes. In doing so, we also present some approaches to solving specific problems that have arisen in the course of practical work so far.

Verbs as linguistic markers of agency: The social side of grammar (2017)

Formanowicz, Magdalena ; Roessel, Janin ; Suitner, Caterina ; Maass, Anne

Basic grammatical categories may carry social meanings irrespective of their semantic content. In a set of four studies, we demonstrate that verbs—a basic linguistic category present and distinguishable in most languages—are related to the perception of agency, a fundamental dimension of social perception. In an archival analysis of actual language use in Polish and German, we found that targets stereotypically associated with high agency (men and young people) are presented in the immediate neighborhood of a verb more often than non-agentic social targets (women and older people). Moreover, in three experiments using a pseudo-word paradigm, verbs (but not adjectives and nouns) were consistently associated with agency (but not with communion). These results provide consistent evidence that verbs, as grammatical vehicles of action, are linguistic markers of agency. In demonstrating meta-semantic effects of language, these studies corroborate the view of language as a social tool and an integral part of social perception.

Practice Report. A blended learning approach to teaching NLP for a DH public (2017)

Faaß, Gertrud ; Heid, Ulrich

This paper reports on current practice in a staged approach to introducing NLP principles and techniques to students of information science (IIM) and of international communication and translation (ICT) as part of their curricula. As most of these students are not familiar with computer science or, in the case of IIM students, with linguistics, we see them as comparable to students of the humanities. We follow a blended learning strategy with lectures, online materials, tutorials, and screencasts. In the first two terms, we focus on linguistics and its formalisation; NLP tools and applications are then introduced from the third term on. The lectures are combined with tutorials and, since the summer term 2017, with a set of screencasts.

Implementation of a Latin grammar in grammatical framework (2017)

Lange, Herbert

In this paper we present work in developing a computerized grammar for the Latin language. It demonstrates the principles and challenges in developing a grammar for a natural language in a modern grammar formalism. The grammar presented here provides a useful resource for natural language processing applications in different fields. It can be easily adopted for language learning and use in language technology for Cultural Heritage like translation applications or to support post-correction of document digitization.

Semantic author name disambiguation with word embeddings (2017)

Müller, Mark-Christoph

We present a supervised machine learning system for author name disambiguation (AND) which tackles semantic similarity between publication titles by means of word embeddings. Word embeddings are integrated as external components, which keeps the model small and efficient, while allowing for easy extensibility and domain adaptation. Initial experiments show that word embeddings can improve the Recall and F score of the binary classification sub-task of AND. Results for the clustering sub-task are less clear, but also promising and overall show the feasibility of the approach.

Data sets for author name disambiguation: an empirical analysis and a new resource (2017)

Müller, Mark-Christoph ; Reitz, Florian ; Roy, Nicolas

Data sets of publication meta data with manually disambiguated author names play an important role in current author name disambiguation (AND) research. We review the most important data sets used so far, and compare their respective advantages and shortcomings. From the results of this review, we derive a set of general requirements to future AND data sets. These include both trivial requirements, like absence of errors and preservation of author order, and more substantial ones, like full disambiguation and adequate representation of publications with a small number of authors and highly variable author names. On the basis of these requirements, we create and make publicly available a new AND data set, SCAD-zbMATH. Both the quantitative analysis of this data set and the results of our initial AND experiments with a naive baseline algorithm show the SCAD-zbMATH data set to be considerably different from existing ones. We consider it a useful new resource that will challenge the state of the art in AND and benefit the AND research community.

Next speakers plan their turn early and speak after turn-final “go-signals” (2017)

Barthel, Mathias ; Meyer, Antje S. ; Levinson, Stephen C.

In conversation, turn-taking is usually fluid, with next speakers taking their turn right after the end of the previous turn. Most, but not all, previous studies show that next speakers start to plan their turn early, if possible already during the incoming turn. The present study makes use of the list-completion paradigm (Barthel et al., 2016), analyzing speech onset latencies and eye-movements of participants in a task-oriented dialogue with a confederate. The measures are used to disentangle the contributions to the timing of turn-taking of early planning of content on the one hand and initiation of articulation as a reaction to the upcoming turn-end on the other hand. Participants named objects visible on their computer screen in response to utterances that did, or did not, contain lexical and prosodic cues to the end of the incoming turn. In the presence of an early lexical cue, participants showed earlier gaze shifts toward the target objects and responded faster than in its absence, whereas the presence of a late intonational cue only led to faster response times and did not affect the timing of participants' eye movements. The results show that with a combination of eye-movement and turn-transition time measures it is possible to tease apart the effects of early planning and response initiation on turn timing. They are consistent with models of turn-taking that assume that next speakers (a) start planning their response as soon as the incoming turn's message can be understood and (b) monitor the incoming turn for cues to turn-completion so as to initiate their response when turn-transition becomes relevant.

Genau als redebeitragsinterne, responsive, sequenzschließende oder sequenzstrukturierende Bestätigungspartikel im Gespräch (2017)

Oloff, Florence

In current usage, genau occurs not only in its classic meaning as an adjective or adverb but is also used as a focus or degree particle and as a discourse particle. Previous descriptions have dealt with its interactional use only to a limited extent and with heterogeneous terminology. Using a sequential and multimodal approach, this contribution examines various interactional uses of genau in video recordings of everyday German conversations. Building on its function as a degree particle, genau is used both as a turn-internal confirmation particle in word searches and as a responsive confirmation particle. Since genau frequently marks the end of a process of understanding or of a negotiation of knowledge, the more general label of intersubjectivity marker could be considered. The responsive, confirming use gives rise to a more strongly sequence-closing and sequence-structuring function of genau, which could also explain the increasing use of this lexeme as a purely discourse-structuring particle within a turn.

CLARIN-D: eine Forschungsinfrastruktur für die sprachbasierte Forschung in den Geistes- und Sozialwissenschaften (2017)

Hinrichs, Erhard ; Trippel, Thorsten

CLARIN provides a research infrastructure for language-based research in the humanities and social sciences that is tailored to the highly heterogeneous research data in these fields. With tools for finding data, preparing them in conformance with standards, and preserving them sustainably, and by providing virtual research environments for the collaborative creation and analysis of research data, CLARIN supports all essential aspects of data management and data archiving. These CLARIN services are accompanied by consulting and training measures.

The interaction between telicity and agentivity: Experimental evidence from intransitive verbs in German and Chinese (2017)

Graf, Tim ; Philipp, Markus ; Xu, Xiaonan ; Kretzschmar, Franziska ; Primus, Beatrice

Telicity and agentivity are semantic factors that split intransitive verbs into (at least two) different classes. Clear-cut unergative verbs, which select the auxiliary HAVE, are assumed to be atelic and agent-selecting; unequivocally unaccusative verbs, which select the auxiliary BE, are analyzed as telic and patient-selecting. Thus, agentivity and telicity are assumed to be inversely correlated in split intransitivity. We will present semantic and experimental evidence from German and Mandarin Chinese that casts doubts on this widely held assumption. The focus of our experimental investigation lies on variation with respect to agentivity (specifically motion control, manipulated via animacy), telicity (tested via a locative vs. goal adverbial), and BE/HAVE-selection with semantically flexible intransitive verbs of motion. Our experimental methods are acceptability ratings for German and Chinese (Experiments 1 and 2) and event-related potential (ERP) measures for German (Experiment 3). Our findings contradict the above-mentioned assumption that agentivity and telicity are generally inversely correlated and suggest that for the verbs under study, agentivity and telicity harmonize with each other. Furthermore, the ERP measures reveal that the impact of the interaction under discussion is more pronounced on the verb lexeme than on the auxiliary. We also found differences between Chinese and German that relate to the influence of telicity on BE/HAVE-selection. They seem to confirm the claim in previous research that the weight of the telicity factor locomotion (or internal motion) is cross-linguistically variable.

The two sides of prediction error in reading: on the relationship between eye movements and the N400 in sentence processing (2017)

Kretzschmar, Franziska ; Alday, Phillip M.

Taking typography to experimental testing: On the influence of serifs, fonts and justification on eye movements in text reading (2017)

Jarosch, Julian ; Schlesewsky, Matthias ; Füssel, Stephan ; Kretzschmar, Franziska

Typography and individual experience in digital reading: Do readers’ eye movements adapt to poor justification? (2017)

Jarosch, Julian ; Schlesewsky, Matthias ; Füssel, Stephan ; Kretzschmar, Franziska

When readers pay attention to the left: A concurrent eyetracking-fMRI investigation on the neuronal correlates of regressive eye movements during reading (2017)

Weiß, Anna Fiona ; Kretzschmar, Franziska ; Nagels, Arne ; Schlesewsky, Matthias ; Bornkessel-Schlesewsky, Ina ; Tune, Sarah

Beyond verb meaning: experimental evidence for incremental processing of semantic roles and event structure (2017)

Philipp, Markus ; Graf, Tim ; Kretzschmar, Franziska ; Primus, Beatrice

We present an event-related potentials (ERP) study that addresses the question of how pieces of information pertaining to semantic roles and event structure interact with each other and with the verb’s meaning. Specifically, our study investigates German verb-final clauses with verbs of motion such as fliegen ‘fly’ and schweben ‘float, hover,’ which are indeterminate with respect to agentivity and event structure. Agentivity was tested by manipulating the animacy of the subject noun phrase and event structure by selecting a goal adverbial, which makes the event telic, or a locative adverbial, which leads to an atelic reading. On the clause-initial subject, inanimates evoked an N400 effect vis-à-vis animates. On the adverbial phrase in the atelic (locative) condition, inanimates showed an N400 in comparison to animates. The telic (goal) condition exhibited an amplitude similar to that of the inanimate-atelic condition. Finally, at the verbal lexeme, the inanimate condition elicited an N400 effect against the animate condition in the telic (goal) contexts. In the atelic (locative) condition, items with animates evoked an N400 effect compared to inanimates. The combined set of findings suggests that clause-initial animacy is not sufficient for agent identification in German, which seems to be completed only at the verbal lexeme in our experiment. Here non-agents (inanimates) changing their location in a goal-directed way and agents (animates) lacking this property are dispreferred and this challenges the assumption that change of (locational) state is generally a defining characteristic of the patient role. Besides this main finding that sheds new light on role prototypicality, our data seem to indicate effects that, in our view, are related to complexity, i.e., minimality. Inanimate subjects or goal arguments increase processing costs since they have role or event structure restrictions that animate subjects or locative modifiers lack.

Korpushermeneutische Analyse politischer Reden mittels CorpusExplorer (2017)

Rüdiger, Jan Oliver

The idea behind the project is to provide a quick and easy introduction to analysing large corpus data with CorpusExplorer. This freely available software currently offers more than 45 analyses and visualisations for a wide range of corpus-linguistic purposes and, thanks to its user-friendliness, is also suitable for use in university teaching. The EuroParl corpus serves as the example, but you can also annotate, analyse, and visualise your own text material (e.g. text files, eBooks, XML, Twitter, blogs) with CorpusExplorer. The videos show the individual functions step by step. They are framed by a small two-stage task: first, come up with a few questions, theses, or assumptions that can be examined using the plenary protocols of EuroParl; some videos also give explicit suggestions, or you can draw inspiration from the other contributions in Issue #3. The simplest questions and theses can already be answered with the videos presented here. As soon as things become more complex, you enter the second, reflective part of the overarching task: consider how the goal can be reached by (repeatedly) combining the individual video and knowledge modules (for an example, see the script). In case of doubt, a manual and e-mail support are also available.

François de Salignac de la Mothe Fénelon – Les Aventures de Télémaque (1699): Macrostruttura e paratesti nelle versioni tedesche del ‘700 (2017)

Flinz, Carolina

This paper analyses the 18th-century German translations of 'Les aventures de Télémaque' (1699) by François de Salignac de la Mothe Fénelon. In that century, Fénelon's masterpiece was translated into German mainly by four authors (August Bohse, Benjamin Neukirch, Josef Anton Ehrenreich, Ludwig Ernst Faramond), who adapted the text not only to the historical period but also to their own purposes, creating completely different works. They transformed the original text into different text genres, from a utopian novel with political and pedagogical aims to a text in verse form for didactic purposes, or to an epic poem with pedagogical functions. To investigate the differences between the translations, the paper focuses especially on the macrostructural and paratextual elements in order to form preliminary hypotheses on 1) the text genre, 2) the functions of the text and 3) the expected audience. Examples and final conclusions end the article.

When appearance does not match accent: neural correlates of ethnicity-related expectancy violations (2017)

Hansen, Karolina ; Steffens, Melanie C. ; Rakić, Tamara ; Wiese, Holger

Most research on ethnicity in neuroscience and social psychology has focused on visual cues. However, accents are central social markers of ethnicity and strongly influence evaluations of others. Here, we examine how varying auditory (vocal accent) and visual (facial appearance) information about others affects neural correlates of ethnicity-related expectancy violations. Participants listened to standard German and Turkish-accented speakers and were subsequently presented with faces whose ethnic appearance was either congruent or incongruent to these voices. We expected that incongruent targets (e.g. German accent/Turkish face) would be paralleled by a more negative N2 event-related brain potential (ERP) component. Results confirmed this, suggesting that incongruence was related to more effortful processing of both Turkish and German target faces. These targets were also subjectively judged as surprising. Additionally, varying lateralization of ERP responses for Turkish and German faces suggests that the underlying neural generators differ, potentially reflecting different emotional reactions to these targets. Behavioral responses showed an effect of violated expectations: German-accented Turkish-looking targets were evaluated as most competent of all targets. We suggest that bringing together neural and behavioral measures of expectancy violations, and using both visual and auditory information, yields a more complete picture of the processes underlying impression formation.

Competent and Warm? How Mismatching Appearance and Accent Influence First Impressions (2017)

Hansen, Karolina ; Rakić, Tamara ; Steffens, Melanie C.

Most research on ethnicity has focused on visual cues. However, accents are strong social cues that can match or contradict visual cues. We examined understudied reactions to people whose one cue suggests one ethnicity, whereas the other cue contradicts it. In an experiment conducted in Germany, job candidates spoke with an accent either congruent or incongruent with their (German or Turkish) appearance. Based on ethnolinguistic identity theory, we predicted that accents would be strong cues for categorization and evaluation. Based on expectancy violations theory we expected that incongruent targets would be evaluated more extremely than congruent targets. Both predictions were confirmed: accents strongly influenced perceptions and Turkish-looking German-accented targets were perceived as most competent of all targets (and additionally most warm). The findings show that bringing together visual and auditory information yields a more complete picture of the processes underlying impression formation.

Marital Satisfaction, Sex, Age, Marriage Duration, Religion, Number of Children, Economic Status, Education, and Collectivistic Values: Data from 33 Countries (2017)

Sorokowski, Piotr ; Randall, Ashley K. ; Groyecka, Agata ; Frackowiak, Tomasz ; Cantarero, Katarzyna ; Hilpert, Peter ; Ahmadi, Khodabakhsh ; Alghraibeh, Ahmad M. ; Aryeetey, Richmond ; Bertoni, Anna ; Bettache, Karim ; Błażejewska, Marta ; Bodenmann, Guy ; Bortolini, Tiago S. ; Bosc, Carla ; Butovskaya, Marina ; Castro, Felipe N. ; Cetinkaya, Hakan ; Cunha, Diana ; David, Daniel ; David, Oana A. ; Domínguez Espinosa, Alejandra C. ; Donato, Silvia ; Dronova, Daria ; Dural, Seda ; Fisher, Maryanne ; Akkaya, Aslıhan Hamamcıoğlu ; Hamamura, Takeshi ; Hansen, Karolina ; Hattori, Wallisen T. ; Hromatko, Ivana ; Gulbetekin, Evrim ; Iafrate, Raffaella ; James, Bawo ; Jiang, Feng ; Kimamo, Charles O. ; Koç, Fırat ; Krasnodębska, Anna ; Laar, Amos ; Lopes, Fívia A. ; Martinez, Rocio ; Mesko, Norbert ; Molodovskaya, Natalya ; Qezeli, Khadijeh Moradi ; Motahari, Zahrasadat ; Natividade, Jean C. ; Ntayi, Joseph ; Ojedokun, Oluyinka ; Omar-Fauzee, Mohd S. B. ; Onyishi, Ike E. ; Özener, Barış ; Paluszak, Anna ; Portugal, Alda ; Realo, Anu ; Relvas, Ana P. ; Rizwan, Muhammad ; Sabiniewicz, Agnieszka L. ; Salkičević, Svjetlana ; Sarmány-Schuller, Ivan ; Stamkou, Eftychia ; Stoyanova, Stanislava ; Šukolová, Denisa ; Sutresna, Nina ; Tadinac, Meri ; Teras, Andero ; Ponciano, Edna L. T. ; Tripathi, Ritu ; Tripathi, Nachiketa ; Tripathi, Mamta ; Yamamoto, Maria E. ; Yoo, Gyesook ; Sorokowska, Agnieszka

Forms of committed relationships, including formal marriage arrangements between men and women, exist in almost every culture (Bell, 1997). Yet, similarly to many other psychological constructs (Henrich et al., 2010), marital satisfaction and its correlates have been investigated almost exclusively in Western countries (e.g., Bradbury et al., 2000). Meanwhile, marital relationships are heavily guided by culturally determined norms, customs, and expectations (for review see Berscheid, 1995; Fiske et al., 1998). While we acknowledge the differences existing both between- and within-cultures, we measured marital satisfaction and several factors that might potentially correlate with it based on self-report data from individuals across 33 countries. The purpose of this paper is to introduce the raw data available for anybody interested in further examining any relations between them and other country-level scores obtained elsewhere. Below, we review the central variables that are likely to be related to marital satisfaction.

Language of Responsibility. The Influence of Linguistic Abstraction on Collective Moral Emotions (2017)

Bilewicz, Michał ; Stefaniak, Anna ; Witkowska, Marta ; Hansen, Karolina

Two experiments investigated the effects of linguistic abstractness on the experience of collective moral emotions. In Experiment 1 participants were presented with two scenarios about ingroup misbehavior, phrased using descriptive action verbs, interpretative action verbs, adjectives or nouns. The results show that participants experienced slightly more negative moral emotions with higher levels of linguistic abstractness. In Experiment 2 we also tested for the influence of national identification on the relationship between linguistic abstractness and emotional reactions. Additionally, we expanded the number of scenarios. Experiment 2 replicated the earlier pattern, but found larger differences between conditions. The strength of national identification did not moderate the observed effects. The results of this research are discussed within the context of the linguistic category model and psychology of collective moral emotions.

Zum Verbalkomplex im Ostpommerschen (2017)

Weber, Thilo

The Continental West Germanic languages and dialects are characterised by multi-part verb forms in a clause-final verbal complex (hereafter VC). Characteristic of this VC is its high degree of word-order variation, which already shows up within Standard German with three or more verbs (cf. Duden 2005, 481-482, § 684). This contribution examines aspects of the VC in East Pomeranian, the East Low German dialect spoken until 1945 east of the Oder in what is now Poland. It does so on the basis of recordings of spontaneous speech from the middle of the 20th century; the contribution is thus to be understood as a study in historical linguistics.

Language Independent Named Entity Recognition using Distant Supervision (2017)

Dembowski, Julia ; Wiegand, Michael ; Klakow, Dietrich

While good results have been achieved for named entity recognition (NER) in supervised settings, it remains a problem that for low resource languages and less studied domains little or no labelled data is available. As NER is a crucial preprocessing step for many natural language processing tasks, finding a way to overcome this deficit in data remains of great interest. We propose a distant supervision approach to NER that is both language and domain independent where we automatically generate labelled training data using gazetteers that we previously extracted from Wikipedia. We test our approach on English, German and Estonian data sets and contribute further by introducing several successful methods to reduce the noise in the generated training data. The tested models beat baseline systems and our results show that distant supervision can be a promising approach for NER when no labelled data is available. For the English model we also show that the distant supervision model is better at generalizing within the same domain of news texts by comparing it against a supervised model on a different test set.
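
As a rough illustration of the gazetteer-based labelling idea described above, the following sketch tags tokens by greedy longest match against entity lists. It is not the authors' pipeline: the function name `label_with_gazetteers`, the BIO label scheme, the `max_len` parameter, and the tiny hand-written gazetteers are all assumptions made for this example, and the noise-reduction methods mentioned in the abstract are not modelled.

```python
def label_with_gazetteers(tokens, gazetteers, max_len=3):
    """Assign BIO labels by greedy longest match against gazetteers.

    tokens:     list of word strings
    gazetteers: dict mapping an entity type to a set of known names
                (hypothetical stand-in for lists extracted from Wikipedia)
    max_len:    longest name, in tokens, that is tried at each position
    """
    labels = ["O"] * len(tokens)
    i = 0
    while i < len(tokens):
        matched = False
        # Try the longest candidate span first, then shorter ones.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            span = " ".join(tokens[i:i + n])
            for entity_type, names in gazetteers.items():
                if span in names:
                    labels[i] = "B-" + entity_type
                    for j in range(i + 1, i + n):
                        labels[j] = "I-" + entity_type
                    i += n
                    matched = True
                    break
            if matched:
                break
        if not matched:
            i += 1
    return labels


# Toy gazetteers, invented for illustration only.
GAZETTEERS = {"LOC": {"New York", "Tallinn"}, "PER": {"Angela Merkel"}}
tokens = "Angela Merkel visited New York yesterday".split()
labels = label_with_gazetteers(tokens, GAZETTEERS)
print(list(zip(tokens, labels)))
```

Labels produced this way are noisy (ambiguous names, incomplete lists), which is why an approach of this kind needs additional filtering before the data are used for training.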

Evaluating the Morphological Compositionality of Polarity (2017)

Ruppenhofer, Josef ; Steiner, Petra ; Wiegand, Michael

Unknown words are a challenge for any NLP task, including sentiment analysis. Here, we evaluate the extent to which sentiment polarity of complex words can be predicted based on their morphological make-up. We do this on German as it has very productive processes of derivation and compounding and many German hapax words, which are likely to bear sentiment, are morphologically complex. We present results of supervised classification experiments on new datasets with morphological parses and polarity annotations.

Towards Bootstrapping a Polarity Shifter Lexicon using Linguistic Features (2017)

Schulder, Marc ; Wiegand, Michael ; Ruppenhofer, Josef ; Roth, Benjamin

We present a major step towards the creation of the first high-coverage lexicon of polarity shifters. In this work, we bootstrap a lexicon of verbs by exploiting various linguistic features. Polarity shifters, such as ‘abandon’, are similar to negations (e.g. ‘not’) in that they move the polarity of a phrase towards its inverse, as in ‘abandon all hope’. While there exist lists of negation words, creating comprehensive lists of polarity shifters is far more challenging due to their sheer number. On a sample of manually annotated verbs we examine a variety of linguistic features for this task. Then we build a supervised classifier to increase coverage. We show that this approach drastically reduces the annotation effort while ensuring a high-precision lexicon. We also show that our acquired knowledge of verbal polarity shifters improves phrase-level sentiment analysis.

Authorship attribution with convolutional neural networks and POS-eliding (2017)

Hitschler, Julian ; van den Berg, Esther ; Rehbein, Ines

We use a convolutional neural network to perform authorship identification on a very homogeneous dataset of scientific publications. In order to investigate the effect of domain biases, we obscure words below a certain frequency threshold, retaining only their POS-tags. This procedure improves test performance due to better generalization on unseen data. Using our method, we are able to predict the authors of scientific publications in the same discipline at levels well above chance.
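
The preprocessing step described above, replacing infrequent words with their POS tags, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function name `elide_rare_words`, the `min_count` threshold, the lower-casing for counting, and the toy tagged corpus are all invented for this example, and the input is assumed to be POS-tagged already.

```python
from collections import Counter


def elide_rare_words(tagged_sentences, min_count=2):
    """Replace words seen fewer than min_count times by their POS tag.

    tagged_sentences: list of sentences, each a list of (word, pos) pairs
    min_count:        hypothetical frequency threshold for keeping a word
    """
    counts = Counter(
        word.lower() for sentence in tagged_sentences for word, _ in sentence
    )
    return [
        [word if counts[word.lower()] >= min_count else pos
         for word, pos in sentence]
        for sentence in tagged_sentences
    ]


# Toy corpus, invented for illustration only.
corpus = [
    [("the", "DET"), ("parser", "NOUN"), ("fails", "VERB")],
    [("the", "DET"), ("tagger", "NOUN"), ("fails", "VERB")],
]
elided = elide_rare_words(corpus, min_count=2)
print(elided)
```

In this toy corpus the topic-specific nouns occur only once each and are reduced to `NOUN`, while the frequent words survive, so both sentences end up with the same surface form.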

A Survey on Hate Speech Detection using Natural Language Processing (2017)

Schmidt, Anna ; Wiegand, Michael

This paper presents a survey on hate speech detection. Given the steadily growing body of social media content, the amount of online hate speech is also increasing. Due to the massive scale of the web, methods that automatically detect hate speech are required. Our survey describes key areas that have been explored to automatically recognize these types of utterances using natural language processing. We also discuss limits of those approaches.

What do we need to know about an unknown word when parsing German (2017)

Do, Bich-Ngoc ; Rehbein, Ines ; Frank, Anette

We propose a new type of subword embedding designed to provide more information about unknown compounds, a major source for OOV words in German. We present an extrinsic evaluation where we use the compound embeddings as input to a neural dependency parser and compare the results to the ones obtained with other types of embeddings. Our evaluation shows that adding compound embeddings yields a significant improvement of 2% LAS over using word embeddings when no POS information is available. When adding POS embeddings to the input, however, the effect levels out. This suggests that it is not the missing information about the semantics of the unknown words that causes problems for parsing German, but the lack of morphological information for unknown words. To augment our evaluation, we also test the new embeddings in a language modelling task that requires both syntactic and semantic information.

Universal Dependencies are hard to parse – or are they? (2017)

Rehbein, Ines ; Steen, Julius ; Do, Bich-Ngoc ; Frank, Anette

Universal Dependency (UD) annotations, despite their usefulness for cross-lingual tasks and semantic applications, are not optimised for statistical parsing. In the paper, we ask what exactly causes the decrease in parsing accuracy when training a parser on UD-style annotations and whether the effect is similarly strong for all languages. We conduct a series of experiments where we systematically modify individual annotation decisions taken in the UD scheme and show that this results in an increased accuracy for most, but not for all languages. We show that the encoding in the UD scheme, in particular the decision to encode content words as heads, causes an increase in dependency length for nearly all treebanks and an increase in arc direction entropy for many languages, and evaluate the effect this has on parsing accuracy.
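
To make the two measures mentioned above concrete, the sketch below computes a mean dependency length and a simple arc direction entropy for a single sentence. The definitions are simplifying assumptions for illustration: dependency length is taken as the distance in tokens between a word and its head, and the entropy is computed over the two arc directions only, without conditioning on relation labels as a fuller treatment might. The function names and the example head vector are hypothetical.

```python
import math


def mean_dependency_length(heads):
    """Mean distance between each token and its head.

    heads: 1-based head index per token, with 0 marking the root.
    """
    lengths = [abs(i - h) for i, h in enumerate(heads, start=1) if h != 0]
    return sum(lengths) / len(lengths) if lengths else 0.0


def arc_direction_entropy(heads):
    """Entropy (in bits) of the head-left vs. head-right distribution."""
    directions = [
        "head_right" if h > i else "head_left"
        for i, h in enumerate(heads, start=1)
        if h != 0
    ]
    total = len(directions)
    entropy = 0.0
    for direction in set(directions):
        p = directions.count(direction) / total
        entropy -= p * math.log2(p)
    return entropy


# "She reads long books": She -> reads, reads = root,
# long -> books, books -> reads (hand-made example).
heads = [2, 0, 4, 2]
mean_length = mean_dependency_length(heads)
entropy = arc_direction_entropy(heads)
print(mean_length, entropy)
```

Comparing these values for the same sentences under two annotation schemes is one way to see how a change in head choice lengthens arcs or makes their direction less predictable.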

Evaluating LSTM models for grammatical function labelling (2017)

Do, Bich-Ngoc ; Rehbein, Ines

To improve grammatical function labelling for German, we augment the labelling component of a neural dependency parser with a decision history. We present different ways to encode the history, using different LSTM architectures, and show that our models yield significant improvements, resulting in a LAS for German that is close to the best result from the SPMRL 2014 shared task (without the reranker).

Detecting annotation noise in automatically labelled data (2017)

Rehbein, Ines ; Ruppenhofer, Josef

We introduce a method for error detection in automatically annotated text, aimed at supporting the creation of high-quality language resources at affordable cost. Our method combines an unsupervised generative model with human supervision from active learning. We test our approach on in-domain and out-of-domain data in two languages, in AL simulations and in a real world setting. For all settings, the results show that our method is able to detect annotation errors with high precision and high recall.

Data point selection for genre-aware parsing (2017)

Rehbein, Ines ; Bildhauer, Felix

In the NLP literature, adapting a parser to new text with properties different from the training data is commonly referred to as domain adaptation. In practice, however, the differences between texts from different sources often reflect a mixture of domain and genre properties, and it is by no means clear what impact each of those has on statistical parsing. In this paper, we investigate how differences between articles in a newspaper corpus relate to the concepts of genre and domain and how they influence parsing performance of a transition-based dependency parser. We do this by applying various similarity measures for data point selection and testing their adequacy for creating genre-aware parsing models.

Variantenwörterbuch des Deutschen. Die Standardsprache in Österreich, der Schweiz, Deutschland, Liechtenstein, Luxemburg, Ostbelgien und Südtirol sowie Rumänien, Namibia und Mennonitensiedlungen. 2., völlig neu bearbeitete und erweiterte Auflage. Von den HerausgeberInnen und AutorInnen Ulrich Ammon, Hans Bickel und Alexandra N. Lenz sowie den AutorInnen Juliane Fink, Andreas Gellan, Lorenz Hofer, Karina Schneider-Wiejowski und Sandra Suter. Unter Mitarbeit von Jakob Ebner, Manfred M. Glauninger, Andrea Kleene sowie Matej Ďurčo, Sara Hägi, Jörg Klinner, Ioan Lăzărescu, Marie-Anne Morand, Gudrun Salamon, Joachim Steffen, Heidy Suter und Bertold Wöss. Berlin/Boston: De Gruyter (2016). LXXVIII, 916 S. [Rezension] (2017)

Kleiner, Stefan

Claudia Scharioth (2015): Regionales Sprechen und Identität. Eine Studie zum Sprachgebrauch, zu Spracheinstellungen und Identitätskonstruktionen von Frauen in Schleswig-Holstein und Mecklenburg-Vorpommern. Hildesheim: Olms. 374 S. (Deutsche Dialektgeographie. 120) [Rezension] (2017)

Adler, Astrid

Möller, Max: Das Partizip II von Experiencer-Objekt-Verben. Eine korpuslinguistische Untersuchung. – Tübingen: Narr Francke Attempto, 2015. 394 S.; Ill. (Korpuslinguistik und interdisziplinäre Perspektiven auf Sprache; 6) ISBN 978-3-8233-6964-6 [Rezension] (2017)

Schneider, Roman

Kupetz, Maxi: Empathie im Gespräch. Eine interaktionslinguistische Perspektive. – Tübingen: Stauffenburg, 2015. 231 S.; Ill. (Stauffenburg Linguistik; 88) ISBN 978-3-95809-509-0 [Rezension] (2017)

Deppermann, Arnulf

Sprachnormierung und Sprachkritik (Sprachnormenkritik) im Deutschen (2017)

Felder, Ekkehard ; Schwinn, Horst ; Jacob, Katharina

Language norms and processes of language standardisation are directly connected with reflection on language and with language criticism. Language norms and standardisation processes are either described linguistically or evaluated from a linguistic or lay-linguistic standpoint. In the linguistically grounded language criticism of the 1980s, the process of language standardisation is observed and described under the paradigm of Sprachnormenkritik (criticism of language norms). From a historical perspective, however, language norms and standardisation processes were reflected on and criticised in intellectual circles much earlier. In the present day, too, there are efforts in the lay-linguistic sphere to influence language norms and standardisation processes by means of language criticism. Since the 2000s, linguists have in turn set themselves the goal of first describing language norms and standardisation and then evaluating them according to linguistic criteria. The article advocates a concept of Sprachnormenkritik that is located on a continuum ranging from reflections that weigh up expressive options to reflections on language that clearly take a position, and that includes both the linguistic and the lay-linguistic perspective. Sprachnormenkritik is thus understood here as a reflection on language norms and standardisation processes in which the criteria are either formulated explicitly (more descriptively or more evaluatively) or practised implicitly.

Sprachnormierung und Sprachkritik in europäischer Perspektive (2017)

Felder, Ekkehard ; Schwinn, Horst ; Jacob, Katharina

The article takes a very specific view of language norms: starting from Sprachnormenkritik (criticism of language norms) in German studies, it focuses on the socio-political implications of questions of linguistic norms. The term Sprachnormenkritik has no formal equivalent in English, French, Italian or Croatian. Nevertheless, the concept of ›Sprachnormenkritik‹, or certain of its components, has been under discussion in English, French, Italian and Croatian for centuries. From a comparative European perspective it is particularly interesting that not every national-language discourse on language norms discusses the direct connection between linguistic norms on the one hand and socio-economic power or political capacity to act on the other as correlating phenomena, and precisely this is the core of the original Sprachnormenkritik in German. The political character of Sprachnormenkritik can be demonstrated particularly impressively in Croatian. In the 1960s, Sprachnormenkritik in Croatian is not only a criticism that attempts to uncover conditions that appear degressive, but above all a progressive criticism that can be regarded as a forerunner of the political movement for Croatian independence.

Einleitung (2017)

Felder, Ekkehard ; Schwinn, Horst ; Busse, Beatrix ; Eichinger, Ludwig M. ; Große, Sybille ; Gvozdanović, Jadranka ; Jacob, Katharina ; Radtke, Edgar

The Handbuch Europäische Sprachkritik Online (HESO) offers a comparative perspective on language criticism in European language cultures. The handbook is a periodical, multilingual online publication. Encyclopaedic articles on selected concepts of language criticism are published successively; each addresses a key concept of language criticism that is of cultural significance from a European perspective. The aim is thus to present a conceptual history of European language criticism. On the one hand, the handbook offers a specific view of the individual language cultures; on the other, it considers them comparatively.

Bericht über die Tagung "Harold Garfinkel's Studies in Ethnomethodology – Fifty Years After", 26.-28. Oktober 2017, Universität Konstanz (2017)

Schmidt, Axel

Harold Garfinkel, founder of ethnomethodology, would have turned 100 this year, and his Studies in Ethnomethodology turn 50. Reason enough to mark this double anniversary with a conference "to honour the German-language prehistory, impact and reception of the work and the person" (as the conference announcement put it), which took place, not entirely by chance, in Konstanz, for a long time and still today a stronghold of reconstructive social research of (among others) an ethnomethodological bent. The conference Harold Garfinkel's 'Studies in Ethnomethodology' – Fifty Years After, held on 26-28 October 2017 at the University of Konstanz, hosted by the Chair of General Sociology and Cultural Sociology and organised by Jörg Bergmann, Christian Meyer and Erhard Schüttpelz, did so in a fitting and distinctive way: the eight chapters of the Studies in Ethnomethodology (hereafter Studies), a collection of essays and articles published in 1967, served as the basis for structuring the conference and as the starting point for the individual talks.

"das is SO lächerlich; ohne SCHEISS jetz ma" – zur affektiven Äußerungsmodalisierung durch ohne Scheiß-Konstruktionen im gesprochenen Deutsch (2017)

Torres Cajo, Sarah

On the basis of authentic everyday interactions, this contribution describes the range of forms and functions of the utterance-modalising comment phrase ohne Scheiß (roughly 'no shit') in spoken German. Interactants use the construction in particular as a resource for strengthening the validity claim of the utterance it refers to, thereby modalising that utterance as true and/or seriously meant. In this way, ohne Scheiß makes an important contribution to the speaker's management of expectations and to establishing intersubjectivity. The construction is syntactically variable and can therefore modalise utterances both prospectively and retractively. In addition, the choice of the lexeme Scheiß activates a register of communicative closeness, which, in combination with further (prosodic and/or lexical) elements, can lead to affective charging. A concluding account of frequent lexical co-occurrence partners and their functional meaning, together with a comparison with intra-constructional variants such as ohne Witz/ohne Spaß, demonstrates the productivity of the construction in everyday language use.

„Demonstrative“ und „partizipative“ Ritualität: Totensonntagserinnern in einem deutschen und einem russischen Gottesdienst (2017)

Schmitt, Reinhold ; Petrova, Anna

This article explores how close one can come to a cultural-scientific perspective on the basis of a constitution-analytical methodology. We do this by comparing the celebration of Totensonntag in Zotzenbach (Southern Hesse) and Sarepta (Volgograd). In both places, there are Protestant churches that perform this ritual to commemorate the dead on this “Sunday of the Dead” as part of their church service. Our scientific interest lies in reconstructing the rituality produced during the in situ execution. In both services, the names of the deceased are read out and a candle is lit for each deceased person. In Zotzenbach the priest reads out the names and an assistant lights the candles for the deceased, whereas in Sarepta the bereaved are responsible for this. Since the ritual is organised in very different ways in terms of architecture-for-interaction (statically in Zotzenbach, spatially dynamic in Sarepta), we can reconstruct two completely different models of rituality: a demonstrative one (Zotzenbach) and a participative one (Sarepta). The demonstrative model works on the basis of a finely tuned coordination between the two church representatives and is aimed at a dignified execution. The model in Sarepta is not suitable for the production of formality due to its participatory structure. Here, however, the focus is also on the aspect of socialization, which goes beyond the church service and offers the Russian-German worshipers the opportunity to constitute themselves situationally as a culturally homogeneous group.

Ulrike Haß & Petra Storjohann (Hg.). 2015. Handbuch Wort und Wortschatz (Handbücher Sprachwissen 3). Berlin, Boston: De Gruyter. xii, 531 S. [Rezension] (2017)

Hentschel, Elke

Albert Busch & Thomas Spranz-Fogasy (Hg.). 2015. Handbuch Sprache in der Medizin (Handbücher Sprachwissen 11). Berlin, Boston: De Gruyter Mouton. x, 476 S. [Rezension] (2017)

Schurawitzki, Michael

Merging the trees. Building a morphological treebank for German from two resources (2017)

Steiner, Petra

This paper deals with the creation of the first morphological treebank for German by merging two pre-existing linguistic databases. The first of these is the linguistic database CELEX which is a standard resource for German morphology. We build on its refurbished and modernized version. The second resource is GermaNet, a lexical-semantic network which also provides partial markup for compounds. We describe the state of the art and the essential characteristics of both databases and our latest revisions. As the merging involves two data sources with distinct annotation schemes, the derivation of the morphological trees for the unified resource is not trivial. We discuss how we overcome problems with the data and format, in particular how we deal with overlaps and complementary scopes. The resulting database comprises about 100,000 trees whose format can be chosen according to the requirements of the application at hand. In our discussion, we show some future directions for morphological treebanks. The Perl script for the generation of the data from the sources will be made publicly available on our website.

Heiko Hausendorf, Reinhold Schmitt & Wolfgang Kesselheim (Hg.). 2016. Interaktionsarchitektur, Sozialtopographie und Interaktionsraum (Studien zur deutschen Sprache 72). Tübingen: Narr/Francke/Attempto. 448 S. [Rezension] (2017)

Adamzik, Kirsten

Annette Klosa & Carolin Müller-Spitzer (Hg.). 2016. Internetlexikografie. Ein Kompendium. Berlin, Boston: De Gruyter. xviii, 347 S. [Rezension] (2017)

Frick, Karina

Das Verhältnis von Pronomen und Determinativen - eine deutsche Spezialität? (2017)

Ballweg, Joachim
