The topic of connectives has recently attracted great interest in both functional and formal work. The main concern of the present volume is to offer a broad perspective, both with regard to the theoretical frameworks in which the individual contributions were written and with regard to the choice of focal points: it combines broadly conceived theoretical contributions with others that deal primarily with a single semantic group of connectives. Causal connectives form the main focus. Beyond that, individual contributions examine temporal connectives and adverbial connectives from various perspectives, or investigate connectives cross-linguistically in different contexts.
Despite their differing theoretical approaches, all contributions aim to analyse the various classes of connectives in a way that can be applied as an operationalisable method, for instance in teaching German as a foreign language (DaF).
The work presents the most important syntactic, prosodic, semantic and pragmatic properties of causal and conditional connectives in spoken German.
The study formulates the necessary theoretical foundations and shows the complex interaction of several factors that affect the interpretation of an utterance. Empirical data show that the contextual and pragmatic interpretation of the relations under investigation correlates strongly with their syntactic and prosodic patterns. However, this is not a one-to-one relationship, since the same readings can be marked differently for causal and conditional relations. On the basis of these results, the relationship between conditionality and causality is discussed.
This contribution presents the methodology and aims of a project that seeks to capture the variation in the grammar of written standard German on the basis of a broad corpus of texts from all countries and regions of the contiguous German-speaking area, to document it in a handbook, and thereby to create a basis both for grammars and for further grammatical investigations. After introductory remarks on the project and on the question of how the planned "Variantengrammatik des Standarddeutschen" relates to the already available "Variantenwörterbuch des Deutschen" by Ammon et al. (2004), a survey of research on grammatical variation in the standard language follows. Examples of grammatical variability in various domains of phenomena are then given, and two case studies show what a grammatical description of these phenomena can look like. In order to make statements about the areal distribution of grammatical variants, the analyses are based on a corpus that is restricted to the written standard, understood here as the usage of the press. The contribution briefly presents this corpus, which serves as the basis for compiling the planned variant grammar, and explains the aims pursued with such a grammar.
This study explores the interdependence of qualitative and quantitative analysis in articulating empirically plausible and theoretically coherent generalizations about grammatical structure. I will show that the use of large electronic corpora is indispensable to the grammarian's work, serving as a rich source of semantic and contextual information, which turns out to be crucial in categorizing and explaining grammatical forms. These general concerns are illustrated by the patterns of use of Czech relative clauses (RCs) with the non-declinable relativizer co, by taking a set of existing claims about these RCs and testing their accuracy on corpus material. The relevant analytic categories revolve around the referential type of the relativized noun, the interaction between relativization and deixis, and the semantic relationship between the relativized noun and the proposition expressed by the RC. The analysis demonstrates that some of the existing claims are fully invalid in the face of regularly attested semantic distinctions, while others are more or less on the right track but often not comprehensive or precise enough to capture the full richness of the facts.
Conversation is usually considered to be grammatically simple, while academic writing is often claimed to be structurally complex, associated primarily with a greater use of dependent clauses. Our goal in the present paper is to challenge these stereotypes, based on the results of large-scale corpus investigations. We argue that both conversation and professional academic writing are grammatically complex but that their complexities are dramatically different. Surprisingly, the traditional view that complexity is realized through extensive clausal embedding leads to the conclusion that conversation is more complex than academic writing. In contrast, written academic discourse is actually much more ‘compressed’ than elaborated, and the complexities of academic writing are realized mostly as phrasal embedding rather than embedded clauses.
The present contribution aims to subject three of the most important pragmatic aspects of the speech genre of admonition in German and Ukrainian to a detailed analysis, focusing above all on the following aspects:
- the communicative goal of the speaker
- the model of the speaker
- the model of the addressee.
To determine how speakers themselves identify and use admonitions, an associative experiment was conducted with 120 German students of German studies at Friedrich-Alexander-Universität Erlangen-Nürnberg (Germany) and 120 students of Ukrainian studies at the Ivan Franko National University of Lviv (Ukraine). Above all, the experiment attempted to determine the communicative goal of the speaker, as well as the models of the speaker and the addressee, in the various situations in which admonitions are realised.
This article looks at Latgalian from the perspective of a classification of languages. It starts by discussing relevant terms relating to sociolinguistic language types. It argues that Latgalian and its speakers show considerable similarities with many languages in Europe that are considered regional languages; hence Latgalian, too, should be classified as such. In a second part, the article uses sociolinguistic data to show that the perceptions of speakers confirm this classification. Latgalian should therefore also officially be treated with the respect that other regional languages in Europe enjoy.
Since the 1980s, the academic discussion of Global English has established a model that divides the states in which English is spoken into three idealised circles: the Inner Circle, in which English is the most important language of society and the L1 of a large part of the population; the Outer Circle, where English is an L2 and one important language among several; and the Extended or Expanding Circle, in which English dominates as a foreign language and as a lingua franca (Kachru, 1985). Drawing on a survey of the societal functions of German worldwide, this contribution shows that the model can also be transferred to German. German, however, differs from English in some substantial respects: the Inner Circle comprises the countries of the German-speaking core area; the Outer Circle comprises countries in which German is a recognised minority language; and the Extended (or, in the case of German, rather Crumbling) Circle comprises countries with individual German-speaking language islands or a German-speaking diaspora, the latter of which may also have arisen only recently. Finally, the essay discusses the position of the Baltic states within this model.
An interactive, dynamic electronic dictionary aimed at text production should guide the user in innovative ways, especially in respect of difficult, complicated or confusing issues. This paper proposes a design for bilingual dictionaries intended to guide users in text production; we focus on complex phenomena of the interaction between lexis and grammar. It will be argued that a dictionary aimed at guiding the user in lexical selection should implement a type of “decision algorithm”. In addition, it should flag incorrect solutions and should warn against possible wrong generalisations of (foreign) language learners. Our proposals will be illustrated with examples from several languages, as the design principles are generally applicable. The copulative construction which is regarded as the most complicated grammatical structure in Northern Sotho will be analyzed in more detail and presented as a case in point.
Between classical symbolic word sense disambiguation (WSD), which uses explicit deep semantic representations of sentences and texts, and statistical WSD, which uses word co-occurrence information, there is a recent tendency towards mediating methods. Similar to so-called lightweight semantics (Marek, 2009), we suggest making only sparse use of semantic information. We describe an approximation model based upon flat underspecified discourse representation structures (FUDRSs, cf. Eberle, 2004) that weighs knowledge about context structure, lexical semantic restrictions and interpretation preferences. We give a catalogue of guidelines for the human annotation of texts with the corresponding indicators. Using this, the reliability of an analysis tool that implements the model can be tested with respect to annotation precision and disambiguation prediction, and it can be assessed how both can be improved by bootstrapping the knowledge of the system using corpus information. For the balanced test corpus considered, the recognition rate of the preferred reading is 80-90% (depending on the smoothing of parse errors).
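The weighting idea behind such mediating approaches can be sketched, in very reduced form, as a weighted vote of context indicators for competing readings. The indicator table, the readings and the weights below are invented for illustration and are not part of the FUDRS model itself:

```python
# Hypothetical indicator weights: (ambiguous word, context indicator)
# maps to a (reading, weight) pair. All entries are invented examples.
WEIGHTS = {
    ("bank", "money"): ("institution", 2.0),
    ("bank", "loan"):  ("institution", 1.5),
    ("bank", "river"): ("shore", 2.5),
}

def prefer_reading(target, context_tokens):
    """Score each reading by the summed weight of its indicators found
    in the context; return the best-scoring reading, or None."""
    scores = {}
    for (word, indicator), (reading, w) in WEIGHTS.items():
        if word == target and indicator in context_tokens:
            scores[reading] = scores.get(reading, 0.0) + w
    return max(scores, key=scores.get) if scores else None
```

In a bootstrapping setting, corpus evidence would be used to grow and re-weight such a table rather than hand-code it.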
The article aims to examine grammatical features and pragmatic concerns of communication in the sciences. In research on certain languages, it became common to explain grammatical features such as the use of the passive voice and of nominal structures by communication requirements such as objectivity and precision. On the assumption that communication in science is designed to help gain and spread new insight, the authors try to integrate several approaches to the pragmatic and grammatical features of such communication. By discussing the relationship between the grammar of certain languages and that of the corresponding common language, the article also places the subject of communication in the sciences within the discipline of language variation.
The Lyon team's research task is to study the way in which multilingual resources are mobilised in teamwork within collaborative activities: how they are exploited in specific ways in order both to enhance collaboration and to respect the specific linguistic competences and practices of the members of the team. Central to our analytical work, which is inspired by ethnomethodological conversation analysis, is the relationship between multilingual resources and the situated organisation of linguistic uses and of social practices.
This paper aims at contributing to the analysis of overlaps in turns-at-talk from both a sequential and a multimodal perspective. Overlaps have been studied within Conversation Analysis by focusing mainly on verbal and vocal resources; taking into account multimodal resources such as gesture, bodily posture, and gaze contributes to a better understanding of participants’ orientations to the sequential organization of overlapping talk and their management of speakership. First, we introduce the way in which overlaps have been studied in Conversation Analysis, mainly by Jefferson (1973, 1983, 2004) and Schegloff (2000); then we propose possible implications of their multimodal analysis. In order to demonstrate that speakers systematically orient to overlap onset and resolution, we analyze the multimodal conduct of overlapped speakers. Findings show methodical variations in the trajectories of overlap resolution: speakers’ gestures during overlap display whether they are maintaining or withdrawing their turn, thereby exhibiting how speakership is achieved and negotiated during overlap.
This paper offers a detailed analysis of the opening of an international meeting, in which English as a lingua franca, the official language of the meeting, is actively discussed and negotiated by the participants. The analysis highlights the issues identified by the participants themselves in choosing a linguistic regime for their professional exchanges. The English-as-lingua-franca regime is aimed at facilitating the participation of some of the participants, but it creates problems for others. The chairman deals with this situation in an embodied way (through his gaze, gesture, bodily postures, and the way in which he walks through the room), displaying that he orients to different member categories (such as 'anglophone', 'anglophone who can understand French', 'francophile', etc.) as benefitting from, or resisting, the definitive language choice.
Linguistics faces the same challenge as many other sciences as it continues to grow into increasingly complex subfields, each with its own separate or overarching branches. While linguists are certainly aware of the overall structure of the research field, they cannot follow developments outside their own subfields. It is thus important to help specialists and newcomers alike to bushwhack through evolved or unknown territory of linguistic data. A considerable amount of research data in linguistics is described with metadata. While studies described and published in archived journals and conference proceedings receive a quite homogeneous set of metadata tags — e.g., author, title, publisher — this does not hold for the empirical data and analyses that underlie such studies. Moreover, lexicons, grammars, experimental data, and other types of resources come in different forms; and to make things worse, their description in terms of metadata is also not uniform, if it exists at all. These problems are well known, and there are now a number of international initiatives — e.g., CLARIN, FlareNet, MetaNet, DARIAH — to build infrastructures for managing linguistic resources. The NaLiDa project, funded by the German Research Foundation, aims at facilitating the management of and access to linguistic resources originating from German research institutions. In cooperation with the German SFB 833 research center, we are developing a combination of faceted and full-text search to give integrated access across heterogeneous metadata sets. Our approach is supported by a central registry for metadata field descriptors and a component repository for structured groups of data categories as larger building blocks.
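The combination of faceted and full-text search over metadata records can be sketched as follows. This is a minimal toy model, not NaLiDa code; the record fields and values are invented:

```python
from collections import Counter

# Toy metadata records; field names and values are invented.
RECORDS = [
    {"type": "corpus",  "language": "German", "title": "Spoken German corpus"},
    {"type": "lexicon", "language": "German", "title": "German valency lexicon"},
    {"type": "corpus",  "language": "Czech",  "title": "Czech relative clauses"},
]

def facet_counts(records, field):
    """Count how many records carry each value of a metadata field."""
    return Counter(r[field] for r in records if field in r)

def search(records, text="", **facets):
    """Full-text filter on the title combined with exact facet filters."""
    hits = [r for r in records if text.lower() in r["title"].lower()]
    for field, value in facets.items():
        hits = [r for r in hits if r.get(field) == value]
    return hits
```

Facet counts are recomputed on each filtered result set, which is what lets a user drill down step by step.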
This paper takes a devil's advocate position to highlight the benefits of metadata creation for linguistic resources. It provides an overview of the required metadata infrastructure and shows that this infrastructure has in the meantime been developed by various projects and can hence be deployed by those working with linguistic resources and archiving. Possible caveats of metadata creation are discussed, starting with user requirements and backgrounds, the contribution to researchers' academic merits, and standardisation. These are answered with existing technologies and procedures, referring to the Component Metadata Infrastructure (CMDI). CMDI provides an infrastructure and methods for adapting metadata to the requirements of specific classes of resources, using central registries for data categories and metadata schemas. These registries allow for the definition of metadata schemas per resource type while reusing groups of data categories also used by other schemas. In summary, rules of best practice for the creation of metadata are given.
If one wants to describe heterogeneous research data in terms of content via metadata, bibliographic information alone is not sufficient. Rather, additional descriptive means are needed that do justice to the nature and complexity of the research resources in question. Different kinds of research data require different metadata profiles, which are defined via shared components. Such research data can be collected (e.g. via OAI-PMH harvesting) and explored through a uniform interface using faceted search. The application context described here can be generalised beyond linguistic data.
XML has been designed for creating structured documents, but the information that is encoded in these structures is, by definition, out of scope for XML. Additional sources that are normally not easily interpretable by computers, such as documentation, are needed to determine the intention of specific tags in a tag set. The Component Metadata Infrastructure (CMDI) takes a rather pragmatic approach to fostering interoperability between XML instances in the domain of metadata descriptions for language resources. This paper gives an overview of this approach.
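The point that XML structure alone carries no interpretable semantics can be illustrated with a minimal sketch: without a schema or concept registry, a machine can only recover tag paths and text values, not what the tags mean. The element names below are invented, not an actual CMDI profile:

```python
import xml.etree.ElementTree as ET

# A made-up metadata instance; the tag names are illustrative only.
DOC = "<resource><title>Map Task Corpus</title><lang>de</lang></resource>"

def flatten(xml_text):
    """Flatten an XML instance into (tag-path, text) pairs, which is all
    a machine can recover without external documentation of the tags."""
    root = ET.fromstring(xml_text)
    pairs = []

    def walk(node, path):
        p = path + "/" + node.tag
        if node.text and node.text.strip():
            pairs.append((p, node.text.strip()))
        for child in node:
            walk(child, p)

    walk(root, "")
    return pairs
```

CMDI's component registries supply exactly the missing layer: a shared definition of what a path such as `/resource/lang` is intended to mean.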
This chapter focuses on the contributions of German scholars to two of the three main research questions that have defined EU studies. Leaving aside the debate on the drivers of European integration, i.e. European integration theory, we will discuss the «governance turn» Fritz Scharpf, Beate Kohler-Koch, Arthur Benz, Ingeborg Tömmel and others promoted in studying EU institutions, as well as the more policy-oriented approaches of Adrienne Héritier, again Fritz Scharpf, and their students. We will then address the ever-growing literature on Europeanization, i.e. on how EU policies, institutions and political processes have been affecting the domestic structures of member states, membership candidates, and neighborhood and third countries. In this context, German scholars also contributed to EU studies in what could be termed methodological rather than substantive respects. Whereas Thomas König, Gerald Schneider, and others promoted the application of quantitative approaches, scientists like Bernhard Ebbinghaus and Markus Haverland dealt with general questions of research design such as case selection and causal inference. Finally, we will also discuss German contributions to diffusion research. The European Union, as a most likely case for the diffusion of policies, has attracted considerable attention from scholars dealing with the question of when and how policies spread across time and space. So it comes as no surprise that EU studies and diffusion research have mutually benefitted from each other. In this regard, German scientists like Katharina Holzinger, Christoph Knill, Tanja Börzel, Thomas Plümper, Thomas Risse and others played a prominent role, too.
Mechanism-based thinking on policy diffusion. A review of current approaches in political science
(2011)
Despite theoretical and methodological progress in what is now called the third generation of diffusion studies, explicit treatment and comparative analysis of the causal mechanisms underlying diffusion processes is only of recent date. As a matter of fact, diffusion research has ended up with a diverse and often unconnected array of theoretical assumptions relying on both rational and constructivist reasoning, a circumstance calling for more theoretical coherence and consistency. Against this backdrop, this paper reviews and streamlines the diffusion literature in political science. Diffusion mechanisms largely cluster around two causal arguments determining the desires and preferences of actors for choosing alternative policies. First, existing accounts of diffusion mechanisms can be grouped according to the rationality of policy adoption, i.e. whether government behavior is based on the instrumental considerations of actors or on constructivist arguments such as norms and rule-driven action. Second, diffusion mechanisms can either directly impact the beliefs of actors or influence the structural conditions for decision-making. Following this logic, four basic diffusion mechanisms can be identified in mechanism-based thinking on policy diffusion: emulation, socialization, learning, and externalities.
This paper demonstrates systematic cross-linguistic differences in the electrophysiological correlates of conflicts between form and meaning (“semantic reversal anomalies”). These engender P600 effects in English and Dutch (e.g. Kolk et al., 2003, Kuperberg et al., 2003), but a biphasic N400 – late positivity pattern in German (Schlesewsky and Bornkessel-Schlesewsky, 2009), and monophasic N400 effects in Turkish (Experiment 1) and Mandarin Chinese (Experiment 2). Experiment 3 revealed that, in Icelandic, semantic reversal anomalies show the English pattern with verbs requiring a position-based identification of argument roles, but the German pattern with verbs requiring a case-based identification of argument roles. The overall pattern of results reveals two separate dimensions of cross-linguistic variation: (i) the presence vs. absence of an N400, which we attribute to cross-linguistic differences with regard to the sequence-dependence of the form-to-meaning mapping and (ii) the presence vs. absence of a late positivity, which we interpret as an instance of a categorisation-related late P300, and which is observable when the language under consideration allows for a binary well-formedness categorisation of reversal anomalies. We conclude that, rather than reflecting linguistic domains such as syntax and semantics, the late positivity vs. N400 distinction is better understood in terms of the strategies that serve to optimise the form-to-meaning mapping in a given language.
This paper is concerned with relative constructions in non-standard varieties of European languages, which will be analyzed on the basis of three typological parameters (word order, relative element, syntactic role of the relativized item). The validity of claims raised in studies on the areal distribution of relative constructions in Europe will be checked against the results of the analysis, so as to ascertain whether they still hold when non-standard varieties are examined.
The Order of the Public Discourse on the Economic Crisis and the (Dis)Order of What Is Left Out
(2011)
Collocations represent a subfield of phraseology that has so far received too little attention. They are insufficiently covered in dictionaries and are neither systematically taught nor learned. Two types of collocation can be distinguished, drawing on both structural and statistical derivation; both matter for unobtrusive, competent everyday language production. Given the large number of collocations that can be found, differentiation and weighting are necessary: a) lexicographically, support for language production comes first; b) the basic or core vocabulary takes priority; and c) typical word combinations must be distinguished from merely common ones ('den Hund loslassen' vs. 'den Hund anleinen/an die Leine nehmen').
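The statistical side of collocation identification is commonly operationalised with association measures over corpus counts. A minimal sketch using pointwise mutual information (PMI), one standard measure among several; the counts one would feed in come from a corpus:

```python
import math

def pmi(pair_count, w1_count, w2_count, n):
    """Pointwise mutual information of a word pair, given the pair count,
    the two word counts, and the total count n from a corpus.

    PMI = log2( P(w1, w2) / (P(w1) * P(w2)) ); values well above zero
    indicate that the words co-occur more often than chance predicts."""
    p_xy = pair_count / n
    p_x = w1_count / n
    p_y = w2_count / n
    return math.log2(p_xy / (p_x * p_y))
```

A purely statistical ranking like this would still have to be filtered structurally (e.g. by syntactic pattern) to separate genuine collocations from frequent but free combinations.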
This paper discusses the technological and methodological challenges in creating and sharing HAMATAC, the Hamburg Map Task Corpus. The first version of the corpus, consisting of 24 recordings with orthographic transcriptions and metadata, is publicly available. A second version featuring different types of linguistic annotation is in progress. I will describe how the various software tools and data formats of the EXMARaLDA system were used for transcription and multi-level annotation, to compile recordings and transcriptions into a corpus and manage metadata, to publish the corpus, and how they can be used for carrying out corpus queries (KWIC) and analyses. Some recurrent issues in corpus building and sharing and the interaction of technological and methodological aspects will be illustrated using HAMATAC.
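A KWIC (keyword-in-context) query of the kind mentioned above can be sketched in a few lines; the tokenised input in the example is invented, not HAMATAC data:

```python
def kwic(tokens, keyword, width=3):
    """Keyword-in-context: for each hit, return the left context window,
    the matched token, and the right context window."""
    hits = []
    for i, tok in enumerate(tokens):
        if tok.lower() == keyword.lower():
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            hits.append((left, tok, right))
    return hits
```

Corpus tools such as those of the EXMARaLDA system additionally align such hits with annotation tiers and metadata, which a token-list sketch like this leaves out.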
Only recently have tourist text types, genres that play a very important role in specialised communication, become an object of interest for linguistic studies. The article presents the results of a contrastive (German-Italian) analysis of tourist catalogues from a microstructural point of view, with particular attention to syntax, lexis and the most frequently used stylistic devices. The investigation shows that catalogues are a text type that lends itself to multiple applications in both didactics and lexicography.
The planning of a dictionary should consider both theoretical and empirical aspects, for its macrostructure as well as its microstructure: this also holds for online specialised dictionaries of linguistics. In particular, the microstructure should be standardised and structured so as to fit the primary and secondary functions of a dictionary. Unfortunately, empirical studies that investigate online specialised dictionaries of linguistics are rare, making it unclear which microstructural elements are obligatory and which are facultative. This article presents and comments upon the results of an investigation into a corpus of online specialised dictionaries of linguistics, focusing attention on these aspects as well as on the most important theoretical issues. An example taken from DIL, a German-Italian online dictionary of linguistics, concludes the article.
DIL is a German-Italian online specialised dictionary of linguistics. It is an open dictionary, and this contribution argues for possible collaboration on it. DIL is still under construction; at present only the DaF (German as a foreign language) section has been published in full, though other sections are in progress. The LEX (lexicography) section, which is due for publication, is presented here together with the dictionary's most important features.
Sentiment analysis is the task of extracting and classifying opinionated content in natural language texts. Common subtasks are the distinction between opinionated and factual texts, the classification of polarity in opinionated texts, and the extraction of the participating entities of an opinion (or opinion event), i.e. the source from which an opinion emanates and the target towards which it is directed. With the emergence of Web 2.0, which describes the shift towards a highly user-interactive communication medium, the amount of subjective content on the World Wide Web is steadily increasing. Thus, there is a growing need for the automatic processing of this type of content, which is what sentiment analysis provides. Both natural language processing, i.e. the task of providing computational methods for the analysis and representation of natural language, and machine learning, i.e. the task of building task-specific classification models on the basis of empirical data, may be instrumental in mastering the challenges of the automatic sentiment analysis of written text. Many problems in sentiment analysis have been addressed with machine learning methods using an exclusively low-level feature design, such as bag of words, which contains little linguistic information. In this thesis, we examine the effectiveness of linguistic features in various subtasks of sentiment analysis, drawing heavily on the insights gained in natural language processing. Linguistic features can be applied in various classification methods, be it in rule-based classification, where the linguistic features are directly encoded as a classifier; in supervised machine learning, where these features complement basic low-level features; or in bootstrapping methods, where these features form a rule-based classifier generating a labeled training set from which a supervised classifier can be trained.
In this thesis, we focus in particular on scenarios in which the combination of linguistic features and machine learning methods is effective. We look at common text classification tasks, both coarse-grained and fine-grained, and at extraction tasks.
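The contrast between low-level bag-of-words features and a simple linguistically informed rule can be sketched as follows. The tiny polarity lexicon is invented for illustration; real lexicons and classifiers are of course far larger:

```python
from collections import Counter

# A made-up polarity lexicon (linguistic knowledge the bag of words lacks).
POLARITY = {"good": 1, "great": 1, "bad": -1, "awful": -1}

def bow_features(text):
    """Plain bag-of-words features: raw token counts, no linguistic info."""
    return Counter(text.lower().split())

def polarity(text):
    """A rule-based classifier: sum lexicon scores over the tokens."""
    score = sum(POLARITY.get(tok, 0) for tok in text.lower().split())
    return "positive" if score > 0 else "negative" if score < 0 else "neutral"
```

In a supervised setting, lexicon-derived scores like these would be added as features alongside the token counts rather than used as the classifier itself.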
The study empirically examines the interpretation of focus accents in German. To this end, a methodology is developed, and it is discussed how experimental investigation can proceed given the current state of focus theory. Methodologically, experiments that directly measure interpretation provide an alternative to the widespread practice of using only empirical preference and production data to investigate the interpretation of stimuli, and it is shown why such an alternative is necessary.
The empirical results show that theories assuming an association of free focus with scalar implicature (exhaustivity) or with question-answer congruence must be extended and restricted as follows. On the one hand, situational factors in interpretation must be taken into account to a greater extent than before, especially their interaction with 'physical' properties of the speech signal (focus marking). On the other hand, a prototypical definition of focus is called for, one which connects the major concepts of focus on the phonetic-phonological, semantic and information-structural levels and takes their prototypical coincidence to be the basis of focus interpretation and of the corresponding intuitions.
In this paper, we explore different linguistic structures encoded as convolution kernels for the detection of subjective expressions. The advantage of convolution kernels is that complex structures can be directly provided to a classifier without deriving explicit features. The feature design for the detection of subjective expressions is fairly difficult and there currently exists no commonly accepted feature set. We consider various structures, such as constituency parse structures, dependency parse structures, and predicate-argument structures. In order to generalize from lexical information, we additionally augment these structures with clustering information and the task-specific knowledge of subjective words. The convolution kernels will be compared with a standard vector kernel.
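The fragment-counting idea behind tree convolution kernels can be sketched with a kernel in the style of the classic subset-tree kernel of Collins and Duffy: the similarity of two parses is the number of tree fragments they share, computed recursively without ever enumerating the fragments. The toy parses are invented:

```python
def tree_kernel(t1, t2):
    """Subset-tree convolution kernel: counts the tree fragments two
    parses share. Trees are nested tuples (label, child, ...); leaves
    are bare strings."""

    def label(n):
        return n if isinstance(n, str) else n[0]

    def production(n):
        # A node's production: its label plus the labels of its children.
        return (n[0],) + tuple(label(c) for c in n[1:])

    def internal(t):
        if isinstance(t, str):
            return []
        return [t] + [n for c in t[1:] for n in internal(c)]

    def delta(a, b):
        # Number of shared fragments rooted at this node pair; zero
        # unless both nodes expand with the same production.
        if production(a) != production(b):
            return 0
        out = 1
        for ca, cb in zip(a[1:], b[1:]):
            if not isinstance(ca, str):
                out *= 1 + delta(ca, cb)
        return out

    return sum(delta(a, b) for a in internal(t1) for b in internal(t2))

# Two toy constituency parses.
T1 = ("S", ("NP", "I"), ("VP", "run"))
T2 = ("S", ("NP", "I"), ("VP", "sleep"))
```

A classifier such as an SVM can consume this kernel directly, which is what makes explicit feature design for such structures unnecessary.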
This article presents a revised version of GAT, a transcription system first developed by a group of German conversation analysts and interactional linguists in 1998. GAT tries to follow as many principles and conventions as possible of the Jefferson-style transcription used in Conversation Analysis, yet proposes some conventions which are more compatible with linguistic and phonetic analyses of spoken language, especially for the representation of prosody in talk-in-interaction. After ten years of use by researchers in conversation and discourse analysis, the original GAT has been revised against the background of past experience and in light of new necessities for the transcription of corpora arising from technological advances and methodological developments over recent years. The present text makes GAT accessible to the English-speaking community. It presents the GAT 2 transcription system with all its conventions and gives detailed instructions on how to transcribe spoken interaction at three levels of delicacy: minimal, basic and fine. In addition, it briefly introduces some tools that may be helpful for the user: the German online tutorial GAT-TO and the transcription editing software FOLKER.
In order to automatically extract opinion holders, we propose to harness the contexts of prototypical opinion holders, i.e. common nouns such as experts or analysts that describe particular groups of people whose profession or occupation is to form and express opinions towards specific items. We assess their effectiveness in supervised learning, where these contexts are regarded as labelled training data, and in rule-based classification, which uses predicates that frequently co-occur with mentions of prototypical opinion holders. Finally, we also examine to what extent knowledge gained from these contexts can compensate for the lack of large amounts of labelled training data in supervised learning by considering various amounts of actually labelled training data.
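The rule-based side of such an approach can be sketched minimally (this is not the authors' system): a hypothetical list of opinion predicates stands in for the predicates harvested from prototypical-holder contexts, and a hand-built pre-parsed sentence stands in for real dependency-parser output. A mention is proposed as an opinion holder when it is the subject of one of those predicates.

```python
# Hypothetical predicates assumed to co-occur with prototypical opinion
# holders such as "experts" or "analysts"; a real list would be harvested
# from corpus contexts, not hand-written.
OPINION_PREDICATES = {"criticize", "praise", "argue", "believe", "claim"}

def extract_holders(tagged_sentence):
    """tagged_sentence: list of (token, lemma, deprel, head_index) tuples.
    Returns tokens whose dependency relation is 'nsubj' and whose head
    is one of the opinion predicates."""
    holders = []
    for token, lemma, deprel, head in tagged_sentence:
        if deprel == "nsubj" and tagged_sentence[head][1] in OPINION_PREDICATES:
            holders.append(token)
    return holders

# Stand-in for dependency-parsed input: "Analysts criticized the plan."
sent = [
    ("Analysts", "analyst", "nsubj", 1),
    ("criticized", "criticize", "root", 1),
    ("the", "the", "det", 3),
    ("plan", "plan", "obj", 1),
]
print(extract_holders(sent))   # ['Analysts']
```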
In this paper, we investigate the role of predicates in opinion holder extraction. We examine the shape of these predicates, investigate what relationship they bear towards opinion holders, determine what resources are potentially useful for acquiring them, and point out limitations of an opinion holder extraction system based on these predicates. For this study, we carry out an evaluation on a corpus annotated with opinion holders. Our insights are particularly important for situations in which no labelled training data are available and only rule-based methods can be applied.
Phenomena in the domain of valency, argument structure, diatheses, collocations, and phrasemes have long served to delimit the interface between lexicon and grammar. By now, however, fundamental doubts have arisen about the legitimacy of the theoretical division of language into lexicon and grammar, not least because developments in empirical methodology afford an increasingly fine-grained view of the differentiated nature of linguistic knowledge and confront us with semi-productive processes, gradual category membership, unstable linguistic patterns, and frequency-driven conventionalization of structures that are in principle rule-governed. The strict boundary between grammar as the locus of what is syntactically and semantically regular and the lexicon as the repository of what is syntactically and semantically idiosyncratic is thereby called into question. The contributions to this volume examine the area where the regular and the idiosyncratic are interwoven; they pursue controversies about the status of constructions and the relation between lexicon and grammar, and they show how empirical methods from corpus linguistics, psycho- and neurolinguistics, and language acquisition research can help resolve these controversies.
Wörter und Unwörter
(2011)
"No fear of Anglicisms," says Gerhard Stickel, showing that, and how, the vocabulary of German constantly renews itself, not only through the borrowing of words from other languages but even more through the formation of new words from existing native words and word parts. Just as the old fondness for words from French has run its course, many Anglicisms have already disappeared again, and their excessive use says less about the language than about its speakers; quite apart from that, German not only takes in words from other languages but also passes words on to other languages in turn. In this lecture by the former director of the Institut für Deutsche Sprache, the German language becomes an object of fascination.
This contribution proposes an analysis of event-related adverbial modifiers in the German stative passive (Zustandspassiv) that reconciles their occurrence in verbal environments with the adjectival nature of the stative passive. The basis for this is an empirically well-supported argument for a particularly close structural relation between modifier and participle, which form, as it were, a compact unit in the transition zone between word and phrase. This view clears the way for a strictly compositional semantics of the stative passive together with its adverbial modifiers.
We introduce a system that learns the participants of arbitrary given scripts. This system processes data from web experiments, in which each participant can be realized with different expressions. It computes participants by encoding semantic similarity and global structural information into an Integer Linear Program. An evaluation against a gold standard shows that we significantly outperform two informed baselines.
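The optimization idea behind grouping different surface expressions into participant slots can be sketched on a toy instance, under loud assumptions: the learned semantic similarities are replaced by a hand-set table, and the Integer Linear Program is replaced by brute-force search over four mentions, which maximizes the same kind of objective (assign mentions to slots so that total intra-slot similarity is maximal). All names and scores below are illustrative.

```python
from itertools import product

# Toy mentions of script participants from different web-experiment answers.
mentions = ["customer", "client", "waiter", "server"]

# Hand-set stand-in for a learned semantic similarity; unlisted pairs get a
# negative default, acting as a dissimilarity penalty (as in correlation
# clustering), so lumping everything together is not optimal.
SIM = {("customer", "client"): 0.9, ("waiter", "server"): 0.8}

def sim(a, b):
    return SIM.get((a, b), SIM.get((b, a), -0.5))

def best_grouping(items, n_slots=2):
    """Exhaustively assign each mention to a participant slot, maximizing
    total intra-slot similarity: the objective an ILP solver would optimize."""
    best, best_score = None, float("-inf")
    for assignment in product(range(n_slots), repeat=len(items)):
        score = sum(sim(a, b)
                    for i, a in enumerate(items)
                    for j, b in enumerate(items)
                    if i < j and assignment[i] == assignment[j])
        if score > best_score:
            best, best_score = assignment, score
    return best, best_score

assignment, score = best_grouping(mentions)
print(assignment, score)
```

The brute force recovers the intuitive grouping (customer with client, waiter with server); an ILP formulation scales this search to realistic numbers of mentions and adds global structural constraints.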
Following the appeals of Tsarina Catherine II and her successors, many people "from the German lands", from Hesse and Baden, from the Palatinate and Württemberg, from Bavaria, and from central and northern Germany, set out for Russia in the 18th and later the 19th century. They could not take much with them, apart from their home dialects. These they preserved not only through the first decades but for many generations and centuries thereafter.
From the tsarist empire to the Putin era, the author follows the fate of the Russian-German dialects. She travels to the remotest corners of the former Soviet Union, to the small and large language islands, visits Volhynian Germans and Mennonites in the north, Swabians in Kazakhstan, Bavarians and Palatines in the Altai region, and everywhere discovers thoroughly lively dialects: a rich, varied dialect landscape, still largely closed to the outside world, whose particular appeal lies in the coexistence and interplay of what was originally brought along and what newly developed and was added in the Russian expanses. Nina Berend's well-illustrated book offers a general and at the same time detailed view of Russia's now largely vanished German language-island regions and their dialects.
Semantic argument structures are often incomplete in that core arguments are not locally instantiated. However, many of these implicit arguments can be linked to referents in the wider context. In this paper we explore a number of linguistically motivated strategies for identifying and resolving such null instantiations (NIs). We show that a more sophisticated model for identifying definite NIs can lead to noticeable performance gains over the state-of-the-art for NI resolution.