
Proceedings of the 12th edition of the KONVENS conference (2014)

The 2014 issue of KONVENS is even more a forum for exchange: its main topic is the interaction between Computational Linguistics and Information Science, and the synergies such interaction, cooperation and integrated views can produce. This topic, at the crossroads of different research traditions that deal with natural language as a container of knowledge and with methods to extract and manage linguistically represented knowledge, is close to the heart of many researchers at the Institut für Informationswissenschaft und Sprachtechnologie of Universität Hildesheim: it has long been one of the institute’s research topics, and it has received even more attention over the last few years. The main conference papers deal with this topic from different points of view, involving flat as well as deep representations, automatic methods targeting annotation and hybrid symbolic and statistical processing, as well as new Machine Learning-based approaches, but also the creation of language resources for both machines and humans, and methods for testing the latter to optimize their human-machine interaction properties. In line with the general topic, KONVENS-2014 focuses on areas of research which involve this cooperation of information science and computational linguistics: for example learning-based approaches, (cross-lingual) Information Retrieval, Sentiment Analysis, paraphrasing or dictionary and corpus creation, management and usability.

Workshop Proceedings of the 12th edition of the KONVENS conference (2014)

The 2014 issue of KONVENS is even more a forum for exchange: its main topic is the interaction between Computational Linguistics and Information Science, and the synergies such interaction, cooperation and integrated views can produce. This topic, at the crossroads of different research traditions that deal with natural language as a container of knowledge and with methods to extract and manage linguistically represented knowledge, is close to the heart of many researchers at the Institut für Informationswissenschaft und Sprachtechnologie of Universität Hildesheim: it has long been one of the institute’s research topics, and it has received even more attention over the last few years.

From <tiger2/> to ISOTiger – community driven developments for syntax annotation in SynAF (2014)

Bosch, Sonja ; Eckart, Kerstin ; Faaß, Gertrud ; Heid, Ulrich ; Lee, Kiyong ; Pareja-Lora, Antonio ; Pretorius, Laurette ; Romary, Laurent ; Witt, Andreas ; Zeldes, Amir ; Zipser, Florian

In 2010, ISO published a standard for syntactic annotation, ISO 24615:2010 (SynAF). Back then, the document specified a comprehensive reference model for the representation of syntactic annotations, but no accompanying XML serialisation. ISO’s subcommittee on language resource management (ISO TC 37/SC 4) is working on making the SynAF serialisation ISOTiger an additional part of the standard. This contribution addresses the current state of development of ISOTiger, along with a number of open issues on which we are seeking community feedback in order to ensure that ISOTiger becomes a useful extension to the SynAF reference model.

Communication of stereotypes in the classroom: biased language use of German and Turkish adolescents (2014)

Schoel, Christiane ; Roessel, Janin ; Jacobsen, Anja ; Stahlberg, Dagmar

Little is known about the linguistic transmission and maintenance of mutual stereotypes in interethnic contexts. This field study, therefore, investigated the linguistic expectancy bias (LEB) and the linguistic intergroup bias (LIB) among German and Turkish adolescents (13 to 20 years) in the school context. The LEB refers to the general phenomenon of describing stereotypes more abstractly. The LIB is the tendency to use language abstraction for in-group protective reasons. Results revealed an unmoderated LEB, whereas the LIB only occurred when foreigners were in the numerical majority, the classroom composition was perceived as a learning disadvantage, or the interethnic conflict frequency was high. These findings provide first evidence for the use of both LEB and LIB in an interethnic classroom setting.

A general lexicographic model for a typological variety of dictionaries in African languages (2014)

Faaß, Gertrud ; Bosch, Sonja E. ; Gouws, Rufus H.

So far, there have been few descriptions of how to create structures capable of storing lexicographic data, ISO 24613:2008 being one of the latest. Another is by Spohr (2012), who designs a multifunctional lexical resource that can store data from different types of dictionaries in a user-oriented way. Technically, his design is based on a hierarchical XML/OWL (eXtensible Markup Language/Web Ontology Language) representation model. This article follows another route and describes a model based on entities and the relations between them; MySQL, a relational database system queried with SQL (Structured Query Language), stores data in tables together with definitions of the relations between them. The model was developed in the context of the project "Scientific eLexicography for Africa", and the lexicographic database to be built from it will be implemented in MySQL. The principles of the ISO model and of Spohr's model are adhered to, with one major difference in the implementation strategy: we place not the lemma but the sense description at the centre of attention; all other elements, including the lemma, depend on the sense description. This article also describes the lexicographic data sets it contains and how they have been collected from different sources. As our aim is to compile several prototypical internet dictionaries (a monolingual Northern Sotho dictionary, a bilingual learners' Xhosa–English dictionary and a bilingual Zulu–English dictionary), we describe the necessary microstructural elements for each of them and the principles we adhere to when designing different ways of accessing them. We plan to make the model and the (empty) database, with all graphical user interfaces that have been developed, freely available by mid-2015.
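The sense-centred design described in this abstract can be pictured with a small relational sketch. The following is only an illustration under stated assumptions: the table names `sense` and `lemma`, their columns, the helper functions and the sample entry are all hypothetical and are not taken from the project's actual schema, and Python's built-in `sqlite3` module stands in for MySQL so that the example is self-contained.

```python
import sqlite3


def build_sense_centred_db():
    """Create an in-memory database in which every lemma depends on a sense.

    Hypothetical schema for illustration only; SQLite stands in for MySQL.
    """
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(
        """
        CREATE TABLE sense (
            sense_id    INTEGER PRIMARY KEY,
            description TEXT NOT NULL
        );
        CREATE TABLE lemma (
            lemma_id INTEGER PRIMARY KEY,
            sense_id INTEGER NOT NULL REFERENCES sense(sense_id),
            language TEXT NOT NULL,
            form     TEXT NOT NULL
        );
        """
    )
    return conn


def add_sense(conn, description, lemmas):
    """Insert a sense first, then attach (language, form) lemmas to it."""
    cur = conn.execute(
        "INSERT INTO sense (description) VALUES (?)", (description,)
    )
    sense_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO lemma (sense_id, language, form) VALUES (?, ?, ?)",
        [(sense_id, language, form) for language, form in lemmas],
    )
    return sense_id


def lemmas_for_sense(conn, sense_id):
    """Return all (language, form) pairs that depend on one sense."""
    rows = conn.execute(
        "SELECT language, form FROM lemma WHERE sense_id = ? "
        "ORDER BY language, form",
        (sense_id,),
    )
    return rows.fetchall()


if __name__ == "__main__":
    conn = build_sense_centred_db()
    # Illustrative entry: one sense shared by an English and a Zulu lemma.
    sid = add_sense(conn, "domesticated canine", [("en", "dog"), ("zu", "inja")])
    print(lemmas_for_sense(conn, sid))
```

Because `lemma.sense_id` is a mandatory foreign key, a lemma cannot be stored without a sense, which is one way to express that the lemma depends on the sense description rather than the other way round.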

Towards an integrated E-Dictionary application – The case of an English to Zulu dictionary of possessives (2014)

Faaß, Gertrud ; Bosch, Sonja

This paper describes a first version of an integrated e-dictionary translating possessive constructions from English to Zulu. Zulu possessive constructions are difficult to learn for non-mother tongue speakers. When translating from English into Zulu, a speaker needs to be acquainted with the nominal classification of nouns indicating possession and possessor. Furthermore, (s)he needs to be informed about the morpho-syntactic rules associated with certain combinations of noun classes. Lastly, knowledge of morpho-phonetic changes is also required, because these influence the orthography of the output word forms. Our approach is a novel one in that we combine e-lexicography and natural language processing by developing a (web) interface that supports learners, as well as other users of the dictionary, in producing Zulu possessive constructions. The final dictionary that we intend to develop will contain several thousand nouns which users can combine as they wish. It will also translate single words and frequently used multiword expressions, and allow users to test their own translations. On request, information about the morpho-syntactic and morpho-phonetic rules applied by the system is displayed together with the translation. Our approach follows the function theory: the dictionary supports users in text production, at the same time fulfilling a cognitive function.

The 3-Circle-Model of English world-wide: Can it contribute to understanding the global position of German? (2014)

Marten, Heiko F.

This paper seeks to apply the principles of the famous 3-Circle-Model, devised to describe the ecolinguistic position of English world-wide, to the position of German around the world. On the one hand, the 3-Circle-Model for English with its "Inner", "Outer" and "Extended/Expanding" Circles was invented by Kachru in the 1980s and has since then been adopted, refined and criticised by numerous authors. The situation of German world-wide, on the other hand, has rarely been discussed in the past 20 years. While the global extension of German is obviously far weaker than that of English, there are also a number of noteworthy similarities in terms of historical spread and the current position of these two languages. This paper therefore discusses the analogies of global English and German by establishing three circles for German: the Inner Circle for the core German-speaking area, i.e. Germany, Austria and Switzerland; the Outer Circle including a number of German minority areas (mostly in Europe); and finally the Extended Circle, which may be denoted as "Crumbling" rather than "Expanding". The latter comprises traditional German diaspora communities in different parts of the world which either result from migration or reflect the previous functions of German as a language of culture and as a lingua franca in regions like Eastern Europe. The paper argues that there are some striking structural similarities, but also shows the limits of this comparison.

Nachfeldbesetzung und diskursive Strategien der Hervorhebung (2014)

Antonioli, Giorgio

Post-field syntax and focalization strategies in National Socialist political speech. This paper deals with a syntactic feature of spoken German, i.e. post-field filling, and with its occurrence in one specific discourse type – political speech – throughout one significant period of the history of the German language – National Socialism. It aims to identify the communicative-pragmatic function of right dislocation in NS political speech on the basis of a set of collected examples.

L’évaluation des complétions collaboratives : analyse séquentielle et multimodale de tours de parole co-construits (2014)

Oloff, Florence

While many studies in conversation analysis have examined how speakers co-construct a turn at talk (notably at the syntactic and prosodic levels), the way in which the co-construction is subsequently evaluated has not yet been studied in depth in the interactionist literature. Here we study two practices that allow a speaker to validate a co-construction, namely simple agreement and other-repetition of the completion. Through a sequential and multimodal analysis of several co-construction sequences in French, we show that with these two practices, which at first glance seem to work similarly, speakers carry out very different evaluations: whereas simple agreement validates the proposed completion only as one possible version, other-repetition validates it as a fully adequate completion. This contribution shows that interactants draw on audible as well as visible resources to display whether and in what sense they accept a co-participant's completion of their turn. We stress the importance of studying in detail the different possible formats of turns that evaluate a completion, in order to distinguish different forms of "acceptance" and to reveal how speakers can finely negotiate their position as (co-)author or recipient of a turn at talk.

Analyse multimodale de complétions différées suite à des interventions collaboratives (2014)

Oloff, Florence

This contribution deals with co-constructions of a turn at talk in interaction, and more specifically with how a co-participant's completion of an utterance is subsequently received by the speaker whose turn has been completed. Despite the clear interest that conversation analysis and interactional linguistics have shown in co-enunciation, the first speaker's evaluation of this practice has not been analysed in depth. In what follows, we focus in particular on the interactional practices that allow participants to validate a co-construction. This work stems from the ANR project SPIM (« L’imitation dans la parole »), in which we examined the function of other-repetition (repeating another speaker's utterance or part of it, as opposed to self-repetition) in sequences in which a turn at talk is co-constructed.

Towards automatic quality assessment of component metadata (2014)

Trippel, Thorsten ; Broeder, Daan ; Durco, Matej ; Ohren, Oddrun

Measuring the quality of metadata is only possible by assessing the quality of the underlying schema and the metadata instance. We propose some factors that are measurable automatically for metadata according to the CMD framework, taking into account the variability of schemas that can be defined in this framework. The factors include among others the number of elements, the (re-)use of reusable components, the number of filled in elements. The resulting score can serve as an indicator of the overall quality of the CMD instance, used for feedback to metadata providers or to provide an overview of the overall quality of metadata within a repository. The score is independent of specific schemas and generalizable. An overall assessment of harvested metadata is provided in form of statistical summaries and the distribution, based on a corpus of harvested metadata. The score is implemented in XQuery and can be used in tools, editors and repositories.
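A schema-independent score of the kind described here can be sketched in a few lines. The abstract names the factors (number of elements, reuse of components, number of filled-in elements) but not how they are combined, so the equal weighting below, the `Element` class and the `quality_score` function are assumptions made purely for illustration; the authors' implementation is in XQuery, whereas this sketch uses Python.

```python
from dataclasses import dataclass


@dataclass
class Element:
    """One metadata element of a record (hypothetical, simplified model)."""
    name: str
    value: str = ""
    from_reusable_component: bool = False


def quality_score(elements):
    """Return a score in [0, 1] for one metadata record.

    Assumed combination: the mean of the share of filled-in elements and
    the share of elements that come from reusable components. The number
    of elements serves only as the denominator of both shares.
    """
    if not elements:
        return 0.0
    total = len(elements)
    filled = sum(1 for e in elements if e.value.strip())
    reused = sum(1 for e in elements if e.from_reusable_component)
    return 0.5 * (filled / total) + 0.5 * (reused / total)


if __name__ == "__main__":
    record = [
        Element("title", "Sample corpus", from_reusable_component=True),
        Element("creator", "Example Institute"),
        Element("licence"),
        Element("size"),
    ]
    print(quality_score(record))
```

Working with shares rather than absolute counts keeps the score comparable across records whose schemas define different numbers of elements.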

Lesen auf neuen Medien: Eine empirische Perspektive (2014)

Kretzschmar, Franziska ; Schlesewsky, Matthias

In recent years, reading has become an increasingly digital experience. In addition to various subjective impressions about the quality of reading from digital media, e.g. that it is more effortful than reading conventional books, a number of more scientific questions arise at the interface of reading research and book studies. Here, we summarize several new insights on reading effort and reading behavior on digital media. Part one reviews a study in which young and elderly adults read short texts on three different reading devices (a paper page, an e-reader and a tablet computer) and answered comprehension questions about them while their eye movements and EEG were recorded. Older adults showed shorter mean fixation durations and lower EEG theta band voltage density – known to covary with memory encoding and retrieval – when reading from a tablet computer in comparison to the other devices. Young adults showed comparable fixation durations and theta activity for all three devices. These results can be explained by better text discriminability (higher contrast) of the tablet computer. Older readers may benefit from this enhanced contrast because contrast sensitivity decreases with age. In the second part, we present an explorative study about the influence of font type and typographic alignment (flush left vs. justified) on reading from a tablet computer. Importantly, the eyes do not fall between – increasingly larger – spaces, as expected, but – to the contrary – use these spaces for planning an optimal fixation of the next word. In summary, the perspective presented here provides initial evidence about the fruitfulness of interdisciplinary research between experimental reading, neurocognition and book studies.

Endungsvariation (2014)

Konopka, Marek

Über semantische Konsistenzbedingungen deutscher Matrixprädikate. Teil 2 (2014)

Schwabe, Kerstin ; Fittler, Robert

Previous accounts addressing the question of which semantic properties of a matrix predicate determine the possible clause type of the embedded clause have not provided a general answer (e.g. Grimshaw 1979, Zifonun et al. 1997, Ginzburg & Sag 2000). This paper proposes that clause-embedding predicates fulfill characteristic logical conditions, so-called consistency conditions, which govern the syntactic potential of the matrix clause: for instance, the clause type of the embedded clause (declarative, ob- and/or wh-interrogative) and the correlate type that the matrix predicate can co-occur with (es and/or ProPP). Furthermore, they predict the logical forms of legitimate constructions with embedded ob- or wh-interrogatives, respectively, and how a legitimate optional correlate modifies the meaning of the matrix predicate.

Über semantische Konsistenzbedingungen deutscher Matrixprädikate. Teil 1 (2014)

Schwabe, Kerstin ; Fittler, Robert

Previous accounts addressing the question of which semantic properties of a matrix predicate determine the possible clause type of the embedded clause have not provided a general answer (e.g. Grimshaw 1979, Zifonun et al. 1997, Ginzburg & Sag 2000). This paper proposes that clause-embedding predicates fulfill characteristic logical conditions, so-called consistency conditions, which govern the syntactic potential of the matrix clause: for instance, the clause type of the embedded clause (declarative, ob- and/or wh-interrogative) and the correlate type that the matrix predicate can co-occur with (es and/or ProPP). Furthermore, they predict the logical forms of legitimate constructions with embedded ob- or wh-interrogatives, respectively, and how a legitimate optional correlate modifies the meaning of the matrix predicate.

Vernetzung statt Vereinheitlichung. Digitale Forschungsinfrastrukturen in den Geisteswissenschaften (2014)

Hedeland, Hanna ; Jettka, Daniel ; Lehmberg, Timm

The development of the digital infrastructure at the Hamburger Zentrum für Sprachkorpora (HZSK) can be seen as an example of the evolution from individual, stand-alone technical solutions towards discipline-specific virtual working and research environments that are interlinked within supranational research infrastructures for the digital humanities. In the specific case of the HZSK, the focus is on securing long-term access to research data (multimedia data of spoken language) by developing a virtual research environment that is connected to the centre-based research infrastructure CLARIN-D on the one hand and provides discipline-specific user interfaces on the other.

Das 50-jährige IDS (2014)

Bassola, Péter

Eine Außenperspektive auf das Institut für Deutsche Sprache (2014)

Tiittula, Liisa

Impressionen über das IDS anlässlich seines 50-jährigen Bestehens (2014)

Watanabe, Manabu

Begegnungen mit dem Institut für Deutsche Sprache (2014)

Durrell, Martin

Institut für Deutsche Sprache. Ein zuverlässiger Name - eine zuverlässige Einrichtung (2014)

Foschi Albert, Marina

Aller Anfang ist schwer. Meine ersten Begegnungen mit dem IDS (2014)

Wiesinger, Peter

Verbindung, Freundschaft und Kooperation (2014)

Dalmas, Martine

Grüße aus Barcelona an das IDS (2014)

Siguan, Marisa

Von Mannheim bis in die USA: Eine persönliche Verbindung mit dem Institut für Deutsche Sprache (2014)

Lovik, Thomas

Vom Wissen, Können, Tun und Zurückbleiben am IDS (2014)

Cosma, Ruxandra

Fünfzig Jahre IDS - ein Anlass zur besonderen Freude und Anerkennung (2014)

Djordjević, Miloje

Das Institut für Deutsche Sprache als Ort wissenschaftlichen Austausches und vielfältiger Unterstützung externer Forschung. Ein Rückblick aus norwegischer Sicht (2014)

Leirbukt, Oddleif

Persönliche Erfahrungen, gesammelt am Institut für Deutsche Sprache (IDS) in Mannheim (2014)

Balcı, Yasemin

Die Öffentlichkeitsarbeit am Institut für Deutsche Sprache (2014)

Biere, Bernd Ulrich

Das IDS und die Dudenredaktion - mehr als eine gute Nachbarschaft (2014)

Scholze-Stubenrecht, Werner

Grundstrukturen der deutschen Sprache: Eine Zusammenarbeit zwischen dem Goethe-Institut und dem Institut für Deutsche Sprache (2014)

Götze, Lutz

Geschichte der Gremien des IDS (2014)

Löffler, Heinrich

Das IDS von außen: Betrachtungen eines Stammgasts (2014)

Cirko, Lesław

TOURLEX: erste Bausteine für ein deutsch-italienisches Lexikon der Touristik-Fachsprache (2014)

Flinz, Carolina

Die schönen alten Formen ... Grammatischer Wandel der deutschen Verbalflexion - Verfall oder Reorganisation? (2014)

Dammel, Antje

If one looks at "symptoms of decay" in the verbal system, such as shifts from strong to weak inflection, it turns out that they are neither recent nor a sign of decay. Viewed with diachronic and analytical depth, what emerges is a staggered, systematic reduction of complexity that peaks in Early New High German and is hard to reconcile with the metaphor of decay, which implies passivity and chaos: reorganisation rather than decadence. Developments such as preterite number levelling ('ich sang' – 'wir sungen' > 'ich sang' – 'wir sangen') or the emergence of the simplified ablaut alternation X–o–o are never merely reductions in complexity but always also systematisation; they slow down decay. As a rule, the gain in systematicity is not due to normative authorities; rather, it rests on factors rooted in the language system, in cognition and in frequency.

Wörterbuchbenutzung: Ergebnisse einer Umfrage bei italienischen DaF-Lernern (2014)

Flinz, Carolina

This empirical study reports on a survey of dictionary use among 41 students of the Dipartimento di Filologia, Letteratura e Linguistica at the University of Pisa, the same department in which the German-Italian online dictionary of linguistics DIL was compiled (cf. Flinz 2011). The written survey was carried out in line with Hartmann's fifth hypothesis, "An analysis of users' needs should precede dictionary design" (1989). The main results were of great importance for designing the macro- and microstructural properties of the specialised dictionary. The results of the study and the reflections that follow from them are presented in thematic blocks.

Mehrsprachigkeit: ein Überblick. Konsequenzen für den DaF-Unterricht (2014)

Flinz, Carolina

Plurilingualism is an important and widespread term. There are many definitions of the concept and its related words, and these definitions sometimes overlap and cause confusion. The European Union has highlighted plurilingualism since the Treaties of Maastricht and Amsterdam, and its influence on the teaching of foreign languages – especially German – remains considerable. This article aims to provide an explicit, concrete definition of the term, analysing it in lexicographic products, official EU documents and specific literature. The article will conclude with a review of didactic strategies for increasing this complex competence.

Quantitative and Qualitative Research across Cultures and Languages: Cultural Metrics and their Application (2014)

Wagner, Wolfgang ; Hansen, Karolina ; Kronberger, Nicole

Growing globalisation draws attention to cultural differences between people from different countries or from different cultures within a country. Notwithstanding the diversity of people’s worldviews, current cross-cultural research still faces the challenge of how to avoid ethnocentrism; comparing Western-driven phenomena with like variables across countries without checking their conceptual equivalence is clearly highly problematic. In the present article we argue that simple comparison of measurements (in the quantitative domain) or of semantic interpretations (in the qualitative domain) across cultures easily leads to inadequate results. Questionnaire items or text produced in interviews or via open-ended questions have culturally laden meanings and cannot be mapped onto the same semantic metric. We call the culture-specific space and relationship between variables or meanings a ’cultural metric’, that is, a set of notions that are inter-related and that mutually specify each other’s meaning. We illustrate the problems and their possible solutions with examples from quantitative and qualitative research. The suggested methods make it possible to respect the semantic space of notions in cultures and language groups, so that the resulting similarities or differences between cultures can be better understood and interpreted.

When Actions Speak Louder Than Words: Preventing Discrimination of Nonstandard Speakers (2014)

Hansen, Karolina ; Rakić, Tamara ; Steffens, Melanie C.

Prejudice against a social group may lead to discrimination against members of this group. One very strong cue of group membership is a (non)standard accent in speech. Surprisingly, hardly any interventions against accent-based discrimination have been tested. In the current article, we introduce an intervention in which participants' own experience unobtrusively changes their evaluations of others. In the present experiment, participants in the experimental condition talked to a confederate in a foreign language before the experiment, whereas those in the control condition received no treatment. Replicating previous research, participants in the control condition discriminated against Turkish-accented job candidates. In contrast, those in the experimental condition evaluated Turkish- and standard-accented candidates as similarly competent. We discuss potential mediating and moderating factors of this effect.

Tylko głupcy uśmiechają się do obcych? Analiza różnic kulturowych w postrzeganiu społecznym inteligencji i szczerości osób uśmiechniętych (2014)

Kryś, Kuba ; Hansen, Karolina

Research on social perception indicates that people who smile are perceived more favourably on numerous dimensions than people who do not smile. In the present research, however, we argue that this relationship is not always positive, because the perception of a smile may depend on culture and on cultural dimensions such as individualism-collectivism or assertiveness. An experiment conducted in six countries (Poland, Germany, Norway, Iran, the USA and South Africa) showed that in collectivist and low-assertiveness cultures smiling people may be perceived less favourably than non-smiling people. In Germany, smiling individuals were rated as more intelligent, and in Iran as less intelligent, than non-smiling individuals. Moreover, in all countries except Iran, smiling individuals were perceived as more sincere than non-smiling individuals. We discuss the observed effects in the context of the cultural differences described by House and colleagues (2004) and by Hofstede (2001).

Backlash Over Gender-Fair Language: The Impact of Feminine Job Titles on Men’s and Women’s Perception of Women (2014)

Budziszewska, Magdalena ; Hansen, Karolina ; Bilewicz, Michał

Feminine forms of job titles raise great interest in many countries. However, it is still unknown how they shape stereotypical impressions on warmth and competence dimensions among female and male listeners. In an experiment with fictitious job titles men perceived women described with feminine job titles as significantly less warm and marginally less competent than women with masculine job titles, which led to lower willingness to employ them. No such effects were observed among women.

Zum Verbalkomplex im Märkisch-Brandenburgischen (2014)

Weber, Thilo

A syntactic peculiarity of the Continental West Germanic languages is the formation of clause-final verb clusters (" ... dass sie das Buch gelesen haben muss"), which are characterised by a high degree of cross-linguistic, cross-dialectal and idiolectal variation in verb order. In overviews, the Low German verb cluster is regarded as strictly head-final, although, unlike for Dutch and High German (especially Upper German) dialects, hardly any empirical studies exist so far. The article presents a descriptive analysis of the two-verb cluster in Märkisch-Brandenburgisch, the south-easternmost of the Low German dialect groups. In contrast to Standard German and other Low German dialects such as North Low German, Brandenburgisch shows variation even with only two verbal elements in the right sentence bracket ("dass sie lesen kann/kann lesen"). Using audio recordings from the so far barely explored DDR-Korpus, the following questions are addressed: Which verb-order variants are possible or preferred in which syntagms? What differences exist between main-clause and subordinate-clause clusters? How does the Brandenburgisch verb cluster behave with respect to non-verbal interveners (so-called Verb Projection Raising)? How do modal verbs and other infinitive-governing verbs behave when embedded under the perfect (i.e. in contexts that call for the substitute infinitive, or Ersatzinfinitiv, in Standard German)? The article closes with a first typological classification of the Brandenburgisch verb cluster in comparison with other Continental West Germanic varieties, revealing areally interesting similarities with the East Central German dialects adjoining it to the south.

Ethnografische Dialoganalyse (2014)

Kallmeyer, Werner

Event Mappings for Comparing Frameworks for Narratives (2014)

Fisseni, Bernhard ; Löwe, Benedikt

We present a technique called event mapping that allows us to project text representations into event lists, produce an event table, and derive quantitative conclusions to compare the text representations. The main application of the technique is the case where two classes of text representations have been collected in two different settings (e.g., as annotations in two different formal frameworks) and we can compare the two classes with respect to their systematic differences in the event table. We illustrate how the technique works by applying it to data collected in two experiments (one using annotations in Vladimir Propp’s framework, the other using natural language summaries).
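One possible reading of the pipeline sketched in this abstract (event lists, then an event table, then a quantitative comparison of two classes) is shown below. This is a minimal sketch under assumptions, not the authors' procedure: the functions `build_event_table` and `coverage_by_class`, the 0/1 encoding of the table, the per-class share used for comparison and the sample event names are all hypothetical.

```python
def build_event_table(event_lists):
    """Turn {representation: [events]} into a 0/1 table.

    Returns the sorted list of all events (the columns) and a dict that
    maps each representation to a row of 0/1 flags, one per event.
    """
    events = sorted({e for evs in event_lists.values() for e in evs})
    table = {
        name: [int(e in set(evs)) for e in events]
        for name, evs in event_lists.items()
    }
    return events, table


def coverage_by_class(events, table, classes):
    """For each class, compute the share of its members mentioning each event."""
    result = {}
    for label, members in classes.items():
        result[label] = {
            e: sum(table[m][i] for m in members) / len(members)
            for i, e in enumerate(events)
        }
    return result


if __name__ == "__main__":
    # Hypothetical event lists for two classes of text representations.
    event_lists = {
        "propp_1": ["departure", "struggle"],
        "propp_2": ["departure"],
        "summary_1": ["departure", "wedding"],
        "summary_2": ["wedding"],
    }
    classes = {
        "propp": ["propp_1", "propp_2"],
        "summary": ["summary_1", "summary_2"],
    }
    events, table = build_event_table(event_lists)
    for label, shares in coverage_by_class(events, table, classes).items():
        print(label, shares)
```

Comparing the per-class shares column by column is one simple way to make systematic differences between the two classes visible; other statistics could be derived from the same table.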

Separating Brands from Types: an Investigation of Different Features for the Food Domain (2014)

Wiegand, Michael ; Klakow, Dietrich

We examine the task of separating types from brands in the food domain. Framing the problem as a ranking task, we convert simple textual features extracted from a domain-specific corpus into a ranker without the need for labeled training data. Such a method should rank brands (e.g. sprite) higher than types (e.g. lemonade). Apart from that, we also exploit knowledge induced by semi-supervised graph-based clustering for two different purposes. On the one hand, we produce an auxiliary categorization of food items according to the Food Guide Pyramid, and assume that a food item is a type when it belongs to a category unlikely to contain brands. On the other hand, we directly model the task of brand detection using seeds provided by the output of the textual ranking features. We also harness Wikipedia articles as an additional knowledge source.

Automatic Food Categorization from Large Unlabeled Corpora and Its Impact on Relation Extraction (2014)

Wiegand, Michael ; Roth, Benjamin ; Klakow, Dietrich

We present a weakly-supervised induction method to assign semantic information to food items. We consider two categorization tasks: food-type classification and the distinction of whether a food item is composite or not. The categorizations are induced by a graph-based algorithm applied to a large unlabeled domain-specific corpus. We show that the use of a domain-specific corpus is vital. We not only outperform a manually designed open-domain ontology but also demonstrate the usefulness of these categorizations in relation extraction, outperforming state-of-the-art features that include syntactic information and Brown clustering.

Saarland University’s Participation in the GErman SenTiment AnaLysis shared Task (GESTALT) (2014)

Wiegand, Michael ; Bocionek, Christine ; Conrad, Andreas ; Dembowski, Julia ; Giesen, Jörn ; Linn, Gregor ; Schmeling, Lennart

We report on the two systems we built for Task 1 of the German Sentiment Analysis Shared Task, the task on Source, Subjective Expression and Target Extraction from Political Speeches (STEPS). The first system is a rule-based system relying on a predicate lexicon specifying extraction rules for verbs, nouns and adjectives, while the second is a translation-based system that has been obtained with the help of the (English) MPQA corpus.

IGGSA Shared Tasks on German Sentiment Analysis (GESTALT) (2014)

Ruppenhofer, Josef ; Klinger, Roman ; Struß, Julia Maria ; Sonntag, Jonathan ; Wiegand, Michael

We present the German Sentiment Analysis Shared Task (GESTALT) which consists of two main tasks: Source, Subjective Expression and Target Extraction from Political Speeches (STEPS) and Subjective Phrase and Aspect Extraction from Product Reviews (StAR). Both tasks focused on fine-grained sentiment analysis, extracting aspects and targets with their associated subjective expressions in the German language. STEPS focused on political discussions from a corpus of speeches in the Swiss parliament. StAR fostered the analysis of product reviews as they are available from the website Amazon.de. Each shared task led to one participating submission, providing baselines for future editions of this task and highlighting specific challenges. The shared task homepage can be found at https://sites.google.com/site/iggsasharedtask/.

Annotating with Propp’s Morphology of the Folktale: reproducibility and trainability (2014)

Fisseni, Bernhard ; Kurji, Aadil ; Löwe, Benedikt

We continue the study of the reproducibility of Propp’s annotations from Bod et al. (2012). We present four experiments in which test subjects were taught Propp’s annotation system; we conclude that Propp’s system needs a significant amount of training, but that with sufficient time investment, it can be reliably trained for simple tales.

POS tagset refinement for linguistic analysis and the impact on statistical parsing (2014)

Rehbein, Ines ; Hirschmann, Hagen

The annotation of parts of speech (POS) in linguistically annotated corpora is a fundamental annotation layer which provides the basis for further syntactic analyses, and many NLP tools rely on POS information as input. However, most POS annotation schemes have been developed with written (newspaper) text in mind and thus do not carry over well to text from other domains and genres. Recent discussions have concentrated on the shortcomings of present POS annotation schemes with regard to their applicability to data from domains other than newspaper text.

Sprachwandel und sprachliche Unsicherheit. Der formale und funktionale Wandel des Genitivs seit dem Frühneuhochdeutschen (2014)

Szczepaniak, Renata

From the perspective of language users, the genitive is threatened by language decay. However, no linear decline can be demonstrated in the history of German. Although the short genitive ending -s (from -es) already established itself as the more frequent variant in Early New High German, a complexly governed variation between the two endings developed in the course of subsequent language change. While important functional domains are lost with the decline of the verbal and attributive genitive, recent language history shows an unexpected rise of the genitive as a prepositional case. This contribution argues that the formal and functional development of the genitive has been, and continues to be, strongly influenced by linguistic insecurity, which is a reaction to existing variation. It further argues that the stylistic upgrading of the long genitive form, and of the genitive over the dative, slows down language change or even steers it in a different direction.

Sprachverfall durch internetbasierte Kommunikation? Linguistische Erklärungsansätze - empirische Befunde (2014)

Storrer, Angelika

The contribution situates internet-based forms of communication within a broader framework of language and variety history and shows that the new interaction-oriented forms of writing (chatting, posting, tweeting, skyping, etc.) are establishing themselves in a domain in which communication has so far been predominantly oral. On this basis, it is shown that there is as yet no empirical evidence that the interaction-oriented writing style “rubs off” on text-oriented writing; rather, competent writers, and even adolescents, are quite capable of switching between different writing attitudes and styles as the situation requires. Finally, desiderata for corpus-based research accompanying these developments are formulated, and the challenges are explained that the coexistence of interaction-oriented and text-oriented writing poses for the teaching of language and writing in schools.

Von Kräften der deutschen Sprachkritik (2014)

Schrodt, Richard

Johann Leo Weisgerber's well-known title refers to Humboldt's concept of energeia, that is, to language as an active force. This contribution likewise traces the forces at work, not by positing an essential language faculty, but by attempting to set out the effective motives behind language-critical attitudes, publications and journalistic phenomena against a grid of social-science concepts. Using selected press reports and grammatical examples (changes in the German sequence of tenses), it is first shown that language criticism has often largely detached itself from its object, the German language. With regard to new forms of substandard language (e.g. youth language, jargon, Kiezsprache, etc.), too, it can often be demonstrated that these are in many cases communicatively functional forms of language. To put it as a slogan: there is language criticism without language. The “active forces” of language criticism instead secure the perception of social differences and thereby make visible the structure of different forms of life. They are described here, and thereby also explained, in systems-theoretical terms following Niklas Luhmann's theory of social systems. While the programme of a “critique of language criticism” characteristic of the 1980s aimed at linguistic enlightenment, today it seems that sociological enlightenment is better placed to fulfil this metacritical function. It might also turn out, however, that language criticism contributes to stabilizing social cooperation, provided it is not understood as language criticism in the narrower sense.

Sprachliche Identität und die Dynamik der deutschen Regionalsprachen (2014)

Schmidt, Jürgen Erich

Widely noted recent studies show a significant correlation between current economically relevant behaviour and the traditional dialect areas. The contribution explains this correlation in terms of the dynamics of the modern regional languages. Under the pressure of the omnipresent standard language, the old regional High German (landschaftliches Hochdeutsch) is, on the one hand, reinterpreted and devalued as a regiolect; on the other hand, the regiolect has preserved the old function of the large-scale dialect landscapes in constituting linguistic areas and creating identity. Depending on whether linguistic-cognitive contrasts block or license diffusion, old dialect boundaries coincide with the boundaries of areas of regiolectal innovation. Since speakers cannot recognize the linguistic-cognitive contrasts hidden behind the supposedly minor differences between neighbouring regiolects, which do not impede intelligibility, they evaluate these differences not in linguistic-regional terms but emotionally, at the relational level, and aesthetically. The “mental contrasts” that constitute the perception of space rest on empirically accessible linguistic-cognitive differences. Cultural identity has a direct linguistic basis, at least as far as the modern German regional languages are concerned.

Sprachverfall? Sprachliche Evolution am Beispiel des diachronen Funktionszuwachses des Apostrophs im Deutschen (2014)

Nübling, Damaris

In the emotionally charged debate about language decay, the use of the apostrophe before the genitive and plural -s, popularly known as the Deppen-Apostroph (“idiot's apostrophe”), is singled out for criticism and stigmatized as a supposed borrowing from English. Only recently, with Scherer (2010, 2013), have corpus-based studies become available that allow an adequate interpretation of this graphematic change, which is far older than commonly assumed. In general, it turns out that many language changes perceived as new and threatening were already denounced, mostly just as emotionally, more than a hundred years ago. The contribution deals mainly with the diachronic development of the phonographic apostrophe into a morphographic one, whose function is no longer to mark unarticulated sounds but to mark morphological boundaries (Uschis, Joseph K.’s, CD’s). It becomes clear that the apostrophe serves to preserve the shape of complex bases, the bulk of which are proper names. A shorter section then examines the origin and nature of these s-inflections themselves. They are in turn the result of restructuring in inflectional morphology and guarantee that the word body is kept maximally constant. Finally, the most recent development is touched upon, namely the deflexion of these very s-inflections, which again is most evident in proper names. These must be regarded as the source of all these developments (cf. des Irak, des Helmut Kohl, also des Perfekt, des LKW, des Gegenüber). Overall, it can be stated that not only the apostrophe before s-inflections but also the s-inflections themselves, as well as their current loss, serve one and the same function: protecting marked word bodies by keeping them constant. These are mostly proper names, but also foreign words, short forms and conversions. It is thus proper names that form the starting point and cause of far-reaching change in inflectional morphology and graphematics.

Sprachvariation und Sprachwandel aus der Perspektive von Deutschlehrerinnen und Deutschlehrern. Einstellungsdaten aus Österreich, Deutschland und der Schweiz (2014)

Lenz, Alexandra N.

The contribution focuses on the language attitudes of teachers of German at secondary schools in Austria, Germany and Switzerland. On the basis of a recent large-scale empirical study, it investigates what attitudes teachers in the three countries hold towards variation and change in German and its varieties. Besides the quantitative and qualitative analysis of selected individual results, the contribution aims to use cluster analysis as a classification procedure to identify inter-individual attitude patterns and, in a second step, to analyse these in terms of their sociodemographic composition.

Gibt es einen Kodex für die Grammatik des Neuhochdeutschen und, wenn ja, wie viele? Oder: Ein Plädoyer für Sprachkodexforschung (2014)

Klein, Wolf Peter

In public debate about language, language decay is not infrequently associated with ignorance or disregard of linguistic rules. Language codices can be seen as the instances in which (explicit) language rules are embodied in a socially relevant way. Against this background, the text specifies the concept of the language codex along several dimensions and proposes a subclassification into core codex and paracodex. This is followed by a plea for language codex research in which the traditionally rather marginal perspectives on language codices are to be broadened and systematized.

Lexikonstatistik 2.0 (2014)

Jäger, Gerhard

In the mid-20th century there were various attempts to automate the classification of languages using word lists drawn from the basic vocabulary of the languages concerned. These methods have been, and still are, generally viewed critically in historical linguistics, since the results obtained often proved to be erroneous. In recent years we have seen a revival of lexicostatistical and glottochronological approaches. Their prospects of success are considerably better today than half a century ago, since large amounts of comparative linguistic data are now available in electronic form and computational linguistics and bioinformatics provide powerful tools for analysing these data statistically. This article presents a case study illustrating the potential of lexicostatistical methods in the 21st century.

Mit der Sprache ging es immer schon bergab. Dynamik, Wandel und Variation aus sprachhistorischer Perspektive (2014)

Durrell, Martin

The notion of a decline of the German language can be traced back at least to the 16th century, when schoolmasters complained that, because of spreading variation, their pupils no longer knew what correct German was. Similar notions appear at about the same time in other European countries and can perhaps be linked to the gradual replacement of Latin as the dominant language of writing and education. They rest on widespread erroneous assumptions about the nature of language, in particular that the underlying form of every language is homogeneous and unchangeable and has existed in this form for a very long time, possibly since Babel. Following Watts (2011), these assumptions must be regarded as myths; they are, however, very persistent, and in the early modern period they served as the basis for the creation of today's standard German, which for this reason, just like all other European cultural or standard languages, should really be regarded as a recent cultural artefact. Using material from a new electronic corpus of 17th- and 18th-century German, this contribution shows how the standard language emerged as a result of these assumptions and of the idea that only in this way could the German language be saved from final decay. In the course of this process, variation was eliminated from the written language wherever possible, and linguistic variants were stigmatized that are still frequent today, even though they are considered “substandard”, “incorrect” or “not standard language”. Rules of “good” High German usage were also laid down (or invented) that native speakers still hardly observe in spontaneous conversation.
But the history of the language teaches that variation and change do not lead to the decay of language; rather, they ensure the dynamic flexibility that language needs if it is to meet all the socially and culturally required needs of human communication.

Die Sprachnormfrage im Deutschunterricht: das Dilemma der Lehrenden (2014)

Davies, Winifred

It is usually claimed and expected that standard German serves at least as the target language, if not as the language of instruction, in German classes. Research in German sociolinguistics and language teaching research shows, however, that there is by no means agreement on what ‘standard German’ actually is, whether it includes variation and, if so, how much, and how deviations from the norm on the part of pupils should be dealt with. Our contribution looks at the role of teachers of German, both at German-speaking schools and in German as a foreign language (DaF) teaching at British universities, in order to discuss what expectations they have regarding their pupils' conformity to linguistic norms and what practical problems they encounter. Supported by historical evidence from everyday school life in the 19th century, we discuss continuities and innovations in how teachers of German and of DaF assess their own role as transmitters of language norms, and ask how large their role actually is.

Einführung (2014)

Engel, Ulrich

Sachen charakterisieren (2014)

Gołębiowski, Adam ; Engel, Ulrich

Das Politische als „konstitutives Außen“ des Ökonomischen. Grenzziehungen zwischen ‚Wirtschaft‘ und ‚Politik‘ in historischer Perspektive (2014)

Scholl, Stefan

Selbstvergewisserung. Die Evidenz des Ökonomischen als Effekt der Grenzziehung zwischen 'Wirtschaft', 'Wissenschaft' und 'Politik' (2014)

Scholl, Stefan

Heinz Vater: Referenz. Bezüge zwischen Sprache und Welt [Rezension] (2014)

Wimmer, Rainer

D´Audo, d´Keffer, d´Kuchine: alemannische Substantivmorphologie am Beispiel des Schuttertäler Ortsdialekts (2014)

Kopf, Kristin

Deutsche Akademie für Sprache und Dichtung, Union der deutschen Akademien der Wissenschaften (Hg.). 2013. Reichtum und Armut der deutschen Sprache. Erster Bericht zur Lage der deutschen Sprache. Berlin, Boston: De Gruyter. 233 S. Teil II. Der Bericht zur Lage der deutschen Sprache im Kontext sprachwissenschaftlicher Öffentlichkeitsarbeit [Rezension] (2014)

Stefanowitsch, Anatol ; Kopf, Kristin ; Flach, Susanne

Deutsche Akademie für Sprache und Dichtung, Union der deutschen Akademien der Wissenschaften (Hg.). 2013. Reichtum und Armut der deutschen Sprache. Erster Bericht zur Lage der deutschen Sprache. Berlin, Boston: De Gruyter. 233 S. Teil I. Anspruch und Ziele des Berichts [Rezension] (2014)

Kopf, Kristin ; Flach, Susanne ; Stefanowitsch, Anatol

Digitale Gesellschaft - Partizipationskulturen im Netz. Zur Einleitung (2014)

Einspänner-Pflock, Jessica ; Dang-Anh, Mark ; Thimm, Caja

Using a GIS for search and visualization of literary works in the digital humanities (2014)

Schiller, Ines ; Entrup, Bastian ; Binder, Frank ; Schaarschmidt, Sandra ; Lobin, Henning

This paper presents challenges and opportunities resulting from the application of geographical information systems (GIS) in the (digital) humanities. First, we provide an overview of the intersection and interaction between geography (and cartography), and the humanities. Second, the “GeoBib” project is used as a case study to exemplify challenges for such collaborative, interdisciplinary projects, both for the humanists and the geoscientists. Finally, we conclude with an outlook on further applications of GIS in the humanities, and the potential scientific benefit for both sides, humanities and geosciences.

Posterbeitrag: GeoBib - Visualisierung von historischen Karten und Werken in einem WebGIS (2014)

Schiller, Ines ; Schaarschmidt, Sandra ; Entrup, Bastian ; Lobin, Henning

Schreiben nach Engelbart (2014)

Lobin, Henning

In 1968, with his On-Line System, Douglas Engelbart showed for the first time how a computer can be used as an interactive writing tool. The contribution retraces this primal scene of word processing, describes the main lines of development that digital writing has followed since then, and explains the central concepts that increasingly shape it: hybridity, multimediality and sociality. The article is an edited excerpt from Henning Lobin's “Engelbarts Traum. Wie der Computer uns Lesen und Schreiben abnimmt”, Frankfurt am Main / New York: Campus, 2014.

GeoBib – Visualisierung von historischen Karten in einem WebGIS (2014)

Schiller, Ines ; Schaarschmidt, Sandra ; Lobin, Henning

This article gives an insight into the GeoBib project and the problems of using historical maps and the geodata derived from them in a WebGIS. The GeoBib project aims to provide an annotated and georeferenced online bibliography of early German- and Polish-language Holocaust and camp literature from 1933 to 1949. For this period, historical maps and geodata are collected, prepared and visualized in the associated WebGIS of the GeoBib portal. A particular feature is the laborious search for geodata and map material for the period between 1933 and 1949. The problems of researching and later visualizing historical geodata and map material are a main focus of this article. In addition, concepts for visualizing incomplete historical map material are presented, and a possible solution to the existing challenges is outlined.

Uncertain about Uncertainty: Different ways of processing fuzziness in digital humanities data (2014)

Binder, Frank ; Entrup, Bastian ; Schiller, Ines ; Lobin, Henning

The GeoBib project is constructing a georeferenced online bibliography of early Holocaust and camp literature published between 1933 and 1949 (Entrup et al. 2013a). Our immediate objectives include identifying the texts of interest in the first place, composing abstracts for them, researching their history, and annotating relevant places and times. Relations between persons, texts, and places will be visualized using digital maps and GIS software as an integral part of the resulting GeoBib information portal. The combination of diverse data from varying sources not only enriches our knowledge of these otherwise mostly forgotten texts; it also confronts us with vague, uncertain or even conflicting information. This situation yields challenges for all researchers involved – historians, literary scholars, geographers and computer scientists alike. While the project operates at the intersection of historical and literary studies, the involved computer scientists are in charge of providing a working environment (Entrup et al. 2013b) and processing the collected information in a way that is formalized yet capable of dealing with inevitable vagueness, uncertainty and contradictions. In this paper we focus on the problems and opportunities of encoding and processing fuzzy data.

Computer-Assisted Content Analysis of Twitter Data (2014)

Einspänner, Jessica ; Dang-Anh, Mark ; Thimm, Caja

Content analysis provides a useful and multifaceted methodological framework for Twitter analysis. CAQDAS tools support the structuring of textual data by enabling categorising and coding. Depending on the research objective, it may be appropriate to choose a mixed-methods approach that combines quantitative and qualitative elements of analysis and exploits their respective advantages to the greatest possible extent while minimising their shortcomings. In this chapter, we will discuss CAQDAS speech act analysis of tweets as an example of software-assisted content analysis. We start with some elementary thoughts on the challenges of the collection and evaluation of Twitter data before we give a brief description of the potentials and limitations of using the software QDA Miner (as one typical example of possible analysis programmes). Our focus will lie on analytical features that can be particularly helpful in speech act analysis of tweets.

Twitter Analytics (2014)

Bürger, Tobias ; Dang-Anh, Mark

In recent years, online research has increasingly engaged with microblogs, in particular Twitter, the most popular provider worldwide. A wide range of disciplines examine communicative processes and structures on Twitter from their respective perspectives, using a variety of methodological approaches. This article first sets out the basic functions, the options for accessing the data structure, and methods of data collection and analysis. Approaches from various disciplines are then presented.

Repräsentierendes Debattieren. Zur Mediatisierung von innerparteilicher Demokratie (2014)

Scheffer, Thomas ; Dang-Anh, Mark ; Laube, Stefan ; Thimm, Caja

Claudia Fraas / Stefan Meier / Christian Pentzold (Hrsg.): Online-Diskurse. Theorien und Methoden transmedialer Online-Diskursforschung [Rezension] (2014)

Dang-Anh, Mark

Mediatized Politics - Structures and Strategies of Discursive Participation and Online Deliberation on Twitter (2014)

Thimm, Caja ; Dang-Anh, Mark ; Einspänner, Jessica

„Nackt“ im Netz? Über Datenspuren und selektive Distribution in digitalen Medien (2014)

Dang-Anh, Mark

How autonomously can we use the internet? How much do we know about which digital traces we leave and who tracks them afterwards? How are the data generated while surfing reused by third parties, with and without our knowledge? And is the felt nakedness in times of digitally spyable, apparent transparency really acute, or is it shaped by traditional analogue structures of thought and experience?

Die dependenzielle Verbgrammatik (DVG) (2014)

Engel, Ulrich

Die Graduiertenplattform des Forschungsnetzwerks "Sprache und Wissen" (2014)

Jacob, Katharina ; Schedl, Evi ; Müller, Marcus

Interdisziplinäre Forschungsarbeit im Netzwerk. Brücken bauen. Ein Interview mit Prof. Dr. Thomas Spranz-Fogasy und PD Dr. med. Christoph Nikendei, MME, geführt von Maria Becker und Evi Schedl. "Jeder hat seine Kontexte und Erlebenswelten - wir müssen Brücken zwischen diesen bauen." (2014)

Becker, Maria ; Schedl, Evi

Diskurszukünfte. 10. Jahrestagung des Forschungsnetzwerks "Sprache und Wissen". Jubiläums- und Programmzeitschrift (2014)

Einleitung (2014)

Kämper, Heidrun

“My Curiosity was Satisfied, but not in a Good Way”: Predicting User Ratings for Online Recipes (2014)

Liu, Can ; Guo, Chun ; Dakota, Daniel ; Rajagopalan, Sridhar ; Li, Wen ; Kübler, Sandra ; Yu, Ning

In this paper, we develop an approach to automatically predict user ratings for recipes at Epicurious.com, based on the recipes’ reviews. We investigate two distributional methods for feature selection, Information Gain and Bi-Normal Separation; we also compare distributionally selected features to linguistically motivated features and two types of frameworks: a one-layer system where we aggregate all reviews and predict the rating vs. a two-layer system where ratings of individual reviews are predicted and then aggregated. We obtain our best results by using the two-layer architecture, in combination with 5 000 features selected by Information Gain. This setup reaches an overall accuracy of 65.60%, given an upper bound of 82.57%.
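
As an illustration of the feature selection step mentioned in this abstract, the following is a minimal sketch of Information Gain scoring over bag-of-words reviews. It is not the authors' implementation: the function names (`entropy`, `information_gain`, `top_k_terms`), the representation of each review as a set of tokens, and the toy data are assumptions made for this example.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(docs, labels, term):
    """Reduction in label entropy from knowing whether `term` occurs.

    `docs` is a list of token sets (an assumed representation),
    `labels` the matching list of rating classes.
    """
    with_term = [y for d, y in zip(docs, labels) if term in d]
    without_term = [y for d, y in zip(docs, labels) if term not in d]
    n = len(labels)
    conditional = 0.0
    for part in (with_term, without_term):
        if part:  # skip empty partitions to avoid dividing by zero
            conditional += len(part) / n * entropy(part)
    return entropy(labels) - conditional


def top_k_terms(docs, labels, k):
    """Return the k terms with the highest Information Gain (ties: alphabetical)."""
    vocab = sorted(set().union(*docs))
    ranked = sorted(vocab, key=lambda t: (-information_gain(docs, labels, t), t))
    return ranked[:k]


# Hypothetical toy reviews, not data from the paper.
docs = [{"great", "tasty"}, {"bland", "awful"}, {"great", "easy"}, {"awful", "dry"}]
labels = ["pos", "neg", "pos", "neg"]
print(top_k_terms(docs, labels, 2))
```

In a setup like the one described, the selected terms would then serve as features for a classifier; the abstract reports using 5 000 such features.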

Parsing German: How Much Morphology Do We Need? (2014)

Maier, Wolfgang ; Kübler, Sandra ; Dakota, Daniel ; Whyatt, Daniel

We investigate how the granularity of POS tags influences POS tagging, and furthermore, how POS tagging performance relates to parsing results. For this, we use the standard “pipeline” approach, in which a parser builds its output on previously tagged input. The experiments are performed on two German treebanks, using three POS tagsets of different granularity, and six different POS taggers, together with the Berkeley parser. Our findings show that less granularity of the POS tagset leads to better tagging results. However, both too coarse-grained and too fine-grained distinctions on POS level decrease parsing performance.

Literatur zur Medizinischen Kommunikation [Bibliografie] (2014)

Spranz-Fogasy, Thomas ; Becker, Maria ; Menz, Florian ; Nowak, Peter

POS error detection in automatically annotated corpora (2014)

Rehbein, Ines

Recent work on error detection has shown that the quality of manually annotated corpora can be substantially improved by applying consistency checks to the data and automatically identifying incorrectly labelled instances. These methods, however, cannot be used for automatically annotated corpora where errors are systematic and cannot easily be identified by looking at the variance in the data. This paper targets the detection of POS errors in automatically annotated corpora, so-called silver standards, showing that by combining different measures sensitive to annotation quality we can identify a large part of the errors and obtain a substantial increase in accuracy.

Cleaning the Europarl Corpus for Linguistic Applications (2014)

Graën, Johannes ; Batinić, Dolores ; Volk, Martin

We discovered several recurring errors in the current version of the Europarl Corpus originating both from the web site of the European Parliament and the corpus compilation based thereon. The most frequent error was incompletely extracted metadata leaving non-textual fragments within the textual parts of the corpus files. This is, on average, the case for every second speaker change. We not only cleaned the Europarl Corpus by correcting several kinds of errors, but also aligned the speakers’ contributions of all available languages and compiled everything into a new XML-structured corpus. This facilitates a more sophisticated selection of data, e.g. querying the corpus for speeches by speakers of a particular political group or in particular language combinations.

Comparison of Pitch Range and Pitch Variation in Slavic and Germanic Languages (2014)

Andreeva, Bistra ; Demenko, Grazyna ; Wolska, Magdalena ; Möbius, Bernd ; Zimmerer, Frank ; Jügler, Jeanin ; Oleskowicz-Popiel, Magdalena ; Trouvain, Jürgen

This study presents the results of a large-scale comparison of various measures of pitch range and pitch variation in two Slavic (Bulgarian and Polish) and two Germanic (German and British English) languages. The productions of twenty-two speakers per language (eleven male and eleven female) in two different tasks (read passages and number sets) are compared. Significant differences between the language groups are found: German and English speakers use lower pitch maxima, narrower pitch span, and generally less variable pitch than Bulgarian and Polish speakers. These findings support the hypothesis that linguistic communities tend to be characterized by particular pitch profiles.

Too cautious to vary more? A comparison of pitch variation in native and non-native productions of French and German speakers (2014)

Zimmerer, Frank ; Jügler, Jeanin ; Andreeva, Bistra ; Möbius, Bernd ; Trouvain, Jürgen

This article presents preliminary results indicating that speakers have a different pitch range when they speak a foreign language compared to the pitch variation that occurs when they speak their native language. To this end, a learner corpus with French and German speakers was analyzed. Results suggest that speakers indeed produce a smaller pitch range in the respective L2. This is true for both groups of native speakers. A possible explanation for this finding is that speakers are less confident in their productions and therefore concentrate more on segments and words, refraining from realizing a more native-like pitch range. For language teaching, the results suggest that learners should be trained extensively on the more pronounced use of pitch in the foreign language.

Designing a Bilingual Speech Corpus for French and German Language Learners: a Two-Step Process (2014)

Fauth, Camille ; Bonneau, Anne ; Zimmerer, Frank ; Trouvain, Jürgen ; Andreeva, Bistra ; Colotte, Vincent ; Fohr, Dominique ; Jouvet, Denis ; Jügler, Jeanin ; Laprie, Yves ; Mella, Odile ; Möbius, Bernd

We present the design of a corpus of native and non-native speech for the language pair French-German, with a special emphasis on phonetic and prosodic aspects. To our knowledge there is no suitable corpus, in terms of size and coverage, currently available for the target language pair. To select the target L1-L2 interference phenomena we prepare a small preliminary corpus (corpus1), which is analyzed for coverage and cross-checked jointly by French and German experts. Based on this analysis, target phenomena on the phonetic and phonological level are selected on the basis of the expected degree of deviation from the native performance and the frequency of occurrence. 14 speakers performed both L2 (either French or German) and L1 material (either German or French). This allowed us to test the recording duration, the recording material, and the performance of our automatic aligner software. Then, we built corpus2, taking into account what we learned from corpus1. The aims are the same, but we adapted the speech material to avoid overly long recording sessions. 100 speakers will be recorded. The corpus (corpus1 and corpus2) will be prepared as a searchable database, available to the scientific community after completion of the project.

Differences of Pitch Profiles in Germanic and Slavic Languages (2014)

Andreeva, Bistra ; Demenko, Grazyna ; Möbius, Bernd ; Zimmerer, Frank ; Jügler, Jeanin ; Oleskowicz-Popiel, Magdalena

This study investigates cross-language differences in pitch range and variation in four languages from two language groups: English and German (Germanic) and Bulgarian and Polish (Slavic). The analysis is based on large multi-speaker corpora (48 speakers for Polish, 60 for each of the other three languages). Linear mixed models were computed that include various distributional measures of pitch level, span and variation, revealing characteristic differences across languages and between language groups. A classification experiment based on the relevant parameter measures (span, kurtosis and skewness values for pitch distributions for each speaker) succeeded in separating the language groups.

Dimensions of Metaphorical Meaning (2014)

Gargett, Andrew ; Ruppenhofer, Josef ; Barnden, John

Recent work suggests that concreteness and imageability play an important role in the meanings of figurative expressions. We investigate this idea in several ways. First, we try to define more precisely the context within which a figurative expression may occur, by parsing a corpus annotated for metaphor. Next, we add both concreteness and imageability as “features” to the parsed metaphor corpus, by marking up words in this corpus using a psycholinguistic database of scores for concreteness and imageability. Finally, we carry out detailed statistical analyses of the augmented version of the original metaphor corpus, cross-matching the features of concreteness and imageability with others in the corpus such as parts of speech and dependency relations, in order to investigate in detail the use of such features in predicting whether a given expression is metaphorical or not.

Vorwort (2014)

Stickel, Gerhard

Introduction/Einführung/Įžanga (2014)

Stickel, Gerhard

Following a welcome in Lithuanian and English to the guests and members on the occasion of the 10th anniversary of EFNIL, the history of this European language organization is sketched. A brief survey of the sociolinguistic themes treated at previous conferences and the state of the major projects is given, followed by an introduction (in German) to the general topic of the present conference. The importance that translation and interpretation have for European language diversity and the individual national languages, beside foreign language education of all Europeans, is stressed.

Körper(-Darstellungen) im Reality-TV. Herstellung von Wirklichkeit im und über das Fernsehen hinaus (2014)

Klug, Daniel ; Schmidt, Axel

Die fremdsprachige Produktionssituation im Fokus eines onomasiologisch-konzeptuell orientierten, zweisprachig-bilateralen Wörterbuches für das Sprachenpaar Deutsch - Spanisch: Theoretische und methodologische Grundlagen von DICONALE (2014)

Meliss, Meike

This article examines the various search, retrieval and selection processes that foreign-language production requires and that DICONALE-online, an onomasiologically and conceptually oriented, bilingual and bilateral verb dictionary of contemporary Spanish and German, takes into particular account. The starting point for DICONALE is the unsatisfactory range of information that existing monolingual and bilingual learner's dictionaries offer for L2 output, which confirms the project team's view that a novel, user- and situation-defined online reference work is needed. Two frames of reference form the basis for a complex, conceptually and frame-guided access path that is meant to help users search for and select possible expressions and apply them appropriately. The novelty of this dictionary project lies mainly in making an onomasiological-conceptual perspective usable for the foreign-language production process and combining it with semasiological access, which makes it possible to highlight the inter- and intralingual differences between the lexemes of a lexical-semantic (sub)paradigm. The aim of the article is therefore to put up for discussion the starting point and the theoretical and methodological foundations of DICONALE-online from the specific perspective of user and situation orientation, to introduce the individual access paths for the search and retrieval process, and to present what the dictionary offers for selection and appropriate use from an inter- and intralingual perspective.
