This article looks at Latgalian from the perspective of language classification. It starts by discussing relevant terms relating to sociolinguistic language types. It argues that Latgalian and its speakers show considerable similarities with many languages in Europe which are considered to be regional languages; hence, Latgalian should also be classified as such. In a second part, the article uses sociolinguistic data to indicate that the perceptions of speakers confirm this classification. Therefore, Latgalian should also officially be treated with the respect that other regional languages in Europe enjoy.
An interactive, dynamic electronic dictionary aimed at text production should guide the user in innovative ways, especially with respect to difficult, complicated or confusing issues. This paper proposes a design for bilingual dictionaries intended to guide users in text production; we focus on complex phenomena of the interaction between lexis and grammar. It will be argued that a dictionary aimed at guiding the user in lexical selection should implement a type of “decision algorithm”. In addition, it should flag incorrect solutions and should warn against possible wrong generalisations by (foreign) language learners. Our proposals will be illustrated with examples from several languages, as the design principles are generally applicable. The copulative construction, which is regarded as the most complicated grammatical structure in Northern Sotho, will be analyzed in more detail and presented as a case in point.
Between classical symbolic word sense disambiguation (WSD), which uses explicit deep semantic representations of sentences and texts, and statistical WSD, which uses word co-occurrence information, there is a recent tendency towards mediating methods. Similar to so-called lightweight semantics (Marek, 2009), we suggest making only sparse use of semantic information. We describe an approximation model based upon flat underspecified discourse representation structures (FUDRSs, cf. Eberle, 2004) that weighs knowledge about context structure, lexical semantic restrictions and interpretation preferences. We give a catalogue of guidelines for human annotation of texts with corresponding indicators. Using these guidelines, the reliability of an analysis tool that implements the model can be tested with respect to annotation precision and disambiguation prediction, as can the extent to which both are improved by bootstrapping the knowledge of the system using corpus information. For the balanced test corpus considered, the recognition rate of the preferred reading is 80-90% (depending on the smoothing of parse errors).
The article examines grammatical features and pragmatic concerns of communicating in the Sciences. In research on certain languages, it has become common to explain grammatical features such as the use of the passive voice and nominal structures by communication requirements such as objectivity and precision. Assuming that communication in Science is designed to help gain and spread new insight, the authors try to integrate several approaches to pragmatic and grammatical features of communication. By discussing the relationship between the grammar of certain languages and that of the corresponding common language, the article also places the subject of communication in the Sciences in the discipline of language variation.
The Lyon team’s research task is to study how multilingual resources are mobilized in team work within collaborative activities, and how they are exploited in a specific way in order both to enhance collaboration and to respect the specificities of the members’ linguistic competences and practices within the team. Central to our analytical work, which is inspired by ethnomethodological conversation analysis, is the relationship between multilingual resources and the situated organization of linguistic uses and of social practices.
This paper aims to contribute to the analysis of overlaps in turns-at-talk from both a sequential and a multimodal perspective. Overlaps have been studied within Conversation Analysis by focusing mainly on verbal and vocal resources; taking into account multimodal resources such as gesture, bodily posture, and gaze contributes to a better understanding of participants’ orientations to the sequential organization of overlapping talk and their management of speakership. First, we introduce the way in which overlaps have been studied in Conversation Analysis, mainly by Jefferson (1973, 1983, 2004) and Schegloff (2000); then we propose possible implications of their multimodal analysis. In order to demonstrate that speakers systematically orient to the overlap onset and resolution, we analyze the multimodal conduct of overlapped speakers. Findings show methodical variations in trajectories of overlap resolution: speakers’ gestures in overlap display them as maintaining or withdrawing their turn, thereby exhibiting the speakership achieved and negotiated during overlap.
This paper offers a detailed analysis of the opening of an international meeting. English Lingua Franca, the official language of the meeting, is actively discussed and negotiated by the participants. The analysis highlights the issues identified by the participants themselves in choosing a linguistic regime for their professional exchanges. The English Lingua Franca regime is aimed at facilitating the participation of some of the participants, but it also creates problems for others. The chairman deals with this situation in an embodied way (through his gaze, gesture, bodily postures, and by the way in which he walks through the room), displaying that he orients to different member categories (such as 'anglophone', 'anglophone who can understand French', 'francophile', etc.) as benefitting from or resisting the definitive language choice.
Linguistics faces the same challenge as many other sciences as it continues to grow into increasingly complex subfields, each with its own separate or overarching branches. While linguists are certainly aware of the overall structure of the research field, they cannot follow all developments other than those of their subfields. It is thus important to help specialists and newcomers alike to bushwhack through evolved or unknown territory of linguistic data. A considerable amount of research data in linguistics is described with metadata. While studies described and published in archived journals and conference proceedings receive a quite homogeneous set of metadata tags (e.g., author, title, publisher), this does not hold for the empirical data and analyses that underlie such studies. Moreover, lexicons, grammars, experimental data, and other types of resources come in different forms; to make things worse, their description in terms of metadata is also not uniform, if it exists at all. These problems are well known, and there are now a number of international initiatives (e.g., CLARIN, FlareNet, MetaNet, DARIAH) to build infrastructures for managing linguistic resources. The NaLiDa project, funded by the German Research Foundation, aims at facilitating the management of and access to linguistic resources originating from German research institutions. In cooperation with the German SFB 833 research center, we are developing a combination of faceted and full-text search to give integrated access through heterogeneous metadata sets. Our approach is supported by a central registry for metadata field descriptors, and a component repository for structured groups of data categories as larger building blocks.
This paper uses a devil’s advocate position to highlight the benefits of metadata creation for linguistic resources. It provides an overview of the required metadata infrastructure and shows that this infrastructure has meanwhile been developed by various projects and hence can be deployed by those working with linguistic resources and archiving. Possible caveats of metadata creation are mentioned, starting with user requirements and backgrounds, contribution to the academic merits of researchers, and standardisation. These are answered with existing technologies and procedures, referring to the Component Metadata Infrastructure (CMDI). CMDI provides an infrastructure and methods for adapting metadata to the requirements of specific classes of resources, using central registries for data categories and metadata schemas. These registries allow for the definition of metadata schemas per resource type while reusing groups of data categories also used by other schemas. In summary, rules of best practice for the creation of metadata are given.
If one wants to describe the content of diverse kinds of research data through metadata, bibliographic information alone is not sufficient. Rather, additional means of description are needed that take account of the nature and complexity of the given research resources. Different kinds of research data require different metadata profiles, which are defined via shared components. Such research data can be collected (e.g. via OAI-PMH harvesting) and explored through a uniform interface by means of faceted search. The application context described can be generalized beyond linguistic data.
XML has been designed for creating structured documents, but the information that is encoded in these structures is, by definition, out of scope for XML. Additional sources that are normally not easily interpretable by computers, such as documentation, are needed to determine the intention of specific tags in a tag set. The Component Metadata Infrastructure (CMDI) takes a rather pragmatic approach to foster interoperability between XML instances in the domain of metadata descriptions for language resources. This paper gives an overview of this approach.
Mechanism-based thinking on policy diffusion. A review of current approaches in political science
(2011)
Despite theoretical and methodological progress in what is now termed the third generation of diffusion studies, explicitly dealing with the causal mechanisms underlying diffusion processes and comparatively analyzing them is only a recent development. As a matter of fact, diffusion research has ended up in a diverse and often unconnected array of theoretical assumptions relying on both rational and constructivist reasoning, a circumstance calling for more theoretical coherence and consistency. Against this backdrop, this paper reviews and streamlines the diffusion literature in political science. Diffusion mechanisms largely cluster around two causal arguments determining the desires and preferences of actors for choosing alternative policies. First, existing accounts of diffusion mechanisms can be grouped according to the rationality for policy adoption; that is, government behavior is based either on the instrumental considerations of actors or on constructivist arguments like norms and rule-driven actors. Second, diffusion mechanisms can either directly impact on the beliefs of actors or they might influence the structural conditions for decision-making. Following this logic, four basic diffusion mechanisms can be identified in mechanism-based thinking on policy diffusion: emulation, socialization, learning, and externalities.
This paper is concerned with relative constructions in non-standard varieties of European languages, which will be analyzed on the basis of three typological parameters (word order, relative element, syntactic role of the relativized item). The validity of claims raised in studies on the areal distribution of relative constructions in Europe will be checked against the results of the analysis, so as to ascertain whether they still hold when non-standard varieties are examined.
Die Ordnung des öffentlichen Diskurses der Wirtschaftskrise und die (Un-)Ordnung des Ausgeblendeten [The order of the public discourse on the economic crisis and the (dis)order of what is left out]
(2011)
In this paper, we explore different linguistic structures encoded as convolution kernels for the detection of subjective expressions. The advantage of convolution kernels is that complex structures can be directly provided to a classifier without deriving explicit features. The feature design for the detection of subjective expressions is fairly difficult and there currently exists no commonly accepted feature set. We consider various structures, such as constituency parse structures, dependency parse structures, and predicate-argument structures. In order to generalize from lexical information, we additionally augment these structures with clustering information and the task-specific knowledge of subjective words. The convolution kernels will be compared with a standard vector kernel.
In order to automatically extract opinion holders, we propose to harness the contexts of prototypical opinion holders, i.e. common nouns, such as experts or analysts, that describe particular groups of people whose profession or occupation is to form and express opinions towards specific items. We assess their effectiveness in supervised learning, where these contexts are regarded as labeled training data, and in rule-based classification, which uses predicates that frequently co-occur with mentions of the prototypical opinion holders. Finally, we also examine to what extent knowledge gained from these contexts can compensate for the lack of large amounts of labeled training data in supervised learning by considering actually labeled training sets of various sizes.
Phenomena in the areas of valency, argument structure, diatheses, collocations and phrasemes have always served to define the interface between lexicon and grammar. In the meantime, however, fundamental doubts have arisen about the legitimacy of the theoretical division into lexicon and grammar, not least because developments in empirical methodology afford increasingly better insight into the differentiated nature of linguistic knowledge and confront us with semi-productive processes, gradient category assignments, unstable linguistic patterns and frequency-driven conventionalizations of structures that are in fact rule-governed. The strict boundary between grammar as the locus of the syntactically and semantically regular and the lexicon as the repository of the syntactically and semantically idiosyncratic is thereby called into question. The contributions to this volume examine the area where the regular and the idiosyncratic are interwoven; they debate the status of constructions and the relationship between lexicon and grammar; and they show how empirical methods from corpus linguistics, psycho- and neurolinguistics and language acquisition research contribute to resolving these controversies.