The changes caused by the growing automatisation of processes in the lexicographer's workstation and in lexicographic work, together with the ensuing needs of lexicographers and their demands for adequately targeted software, have not been discussed sufficiently in meta-lexicographic research. The aim of this paper is therefore to fill this gap, with a focus on academic non-commercial lexicography. After an introduction to the general functionalities of specific dictionary writing software, we use a real-life example to discuss the lexicographic working environment, the new specific demands on lexicographic software, and different tools. The final aim is to propose some recommendations for how to structure the lexicographic working environment to meet specific project requirements.
A frequently replicated finding is that higher frequency words tend to be shorter and contain more strongly reduced vowels. However, little is known about potential differences in the articulatory gestures for high vs. low frequency words. The present study made use of electromagnetic articulography to investigate the production of two German vowels, [i] and [a], embedded in high and low frequency words. We found that word frequency differently affected the production of [i] and [a] at the temporal as well as the gestural level. Higher frequency of use predicted greater acoustic durations for long vowels; reduced durations for short vowels; articulatory trajectories with greater tongue height for [i] and more pronounced downward articulatory trajectories for [a]. These results show that the phonological contrast between short and long vowels is learned better with experience, and challenge both the Smooth Signal Redundancy Hypothesis and current theories of German phonology.
The present article describes the first stage of the KorAP project, launched recently at the Institut für Deutsche Sprache (IDS) in Mannheim, Germany. The aim of this project is to develop an innovative corpus analysis platform to tackle the increasing demands of modern linguistic research. The platform will facilitate new linguistic findings by making it possible to manage and analyse primary data and annotations in the petabyte range, while at the same time allowing an undistorted view of the primary linguistic data, and thus fully satisfying the demands of a scientific tool. An additional important aim of the project is to make corpus data as openly accessible as possible in light of unavoidable legal restrictions, for instance through support for distributed virtual corpora, user-defined annotations and adaptable user interfaces, as well as interfaces and sandboxes for user-supplied analysis applications. We discuss our motivation for undertaking this endeavour and the challenges that face it. Next, we outline our software implementation plan and describe development to date.
The present contribution addresses an infrastructural issue of universal relevance, considered here in the specific context of the TEI. We describe a combination of open-source tools and an open-access approach to creating knowledge repositories that have been employed in building a bibliographic reference library for the “TEI for Linguists” special interest group (LingSIG). The authors argue that, for an initiative such as the TEI, it is important to choose open, freely available solutions. If these solutions have the advantage of attracting new users and promoting the initiative itself, so much the better, especially if it is done in a non-committal way: no one using the LingSIG bibliographic repository has to be a member of the LingSIG or a “TEI-er” in general.
The paper presents an XML schema for the representation of genres of computer-mediated communication (CMC) that is compliant with the encoding framework defined by the TEI. It was designed for the annotation of CMC documents in the project Deutsches Referenzkorpus zur internetbasierten Kommunikation (DeRiK), which aims at building a corpus on language use in the most popular CMC genres on the German-speaking Internet. The focus of the schema is on those CMC genres which are written and dialogic, such as forums, bulletin boards, chats, instant messaging, wiki and weblog discussions, microblogging on Twitter, and conversation on “social network” sites.
The schema provides a representation format for the main structural features of CMC discourse as well as elements for the annotation of those units regarded as “typical” for language use on the Internet. The schema introduces an element <posting>, which describes stretches of text that are sent to the server by a user at a certain point in time. Postings are the main constituting elements of threads and logfiles, which, in our schema, are the two main types of CMC macrostructures. For the microlevel of CMC documents (that is, the structure of the <posting> content), the schema introduces elements for selected features of Internet jargon such as emoticons, interaction words and addressing terms. It allows for easy anonymization of CMC data for purposes in which the annotated data are made publicly available and includes metadata which are necessary for referencing random excerpts from the data as references in dictionary entries or as results of corpus queries.
Documentation of the schema as well as encoding examples can be retrieved from the web at http://www.empirikom.net/bin/view/Themen/CmcTEI. The schema is meant to be a core model for representing CMC that can be modified and extended by others according to their own specific perspectives on CMC data. It could be a first step towards an integration of features for the representation of CMC genres into a future new version of the TEI Guidelines.
On the study of generation-related variation in the Palatine language-island dialect on the Lower Rhine
(2012)
This paper discusses information-structural aspects of multiple prefield occupation (mehrfache Vorfeldbesetzung) in German. On the basis of a collection of attested examples extracted largely from the IDS corpora, the discourse givenness, focus status and topic status of (primarily) the prefield material are described and related to corresponding claims in the literature. In addition to information-structural factors, the final section addresses possible further factors that might favour multiple prefield occupation. Furthermore, figures are presented for the first time, for a limited subset of German, that illustrate the ratio of multiple prefield occupation to the similar, but supposedly more "canonical", filling of the prefield with a (possibly partial) verb phrase.
This article deals with three interrelated phenomena in the information structure of German sentences: the focusing of negative markers, of finite verb forms and of the particles ja, doch, wohl and schon. Focusing of the finite verb is the most important marker of verum focus, as described by Höhle (1988). Focusing of particles can be an alternative means for similar purposes, while focusing of negation seems to be the contradictory opposite of verum focus. It is shown that negation, independently of its information-structural status, can be interpreted on three distinct levels of sentence meaning: as an indicator of the non-facticity of a state of affairs, the non-truth of a proposition, or the non-desirability of a speech act. Focusing of the negative marker puts contrastive emphasis on the negative value assigned to sentence meaning on one of these levels. Verum focus can be interpreted on the same three levels: as a marker of contrastive emphasis on a positive value of facticity, truth or desirability. The particles ja, doch, wohl and schon refer to sufficient epistemic or interactional conditions for the assignment of a positive or negative value. By focusing such a particle, the speaker indicates that (s)he believes the assigned value to be well justified and insists on establishing it as common ground for further interaction.
The article outlines a synopsis of the linking properties of German sentence connectors and a terminology for describing them. A selection of 24 causal and consecutive connectors serves as illustration. The first half deals with semantic and syntactic properties as well as properties of the syntax-semantics interface. The second half focuses on discourse-structural and information-structural properties. It turns out that the linking properties described do not combine freely with one another but form characteristic property profiles, with the help of which five major connector classes can be defined and presented as an ordered subsystem of the grammar.
This contribution examines the grammatical realisation of clausal and clause-equivalent verb-group adverbials and sentence adverbials in German in comparison with the Romance languages Italian and Portuguese (with an emphasis on the Brazilian variety). Such adverbials can be realised in formally quite different ways. Finite adverbial subordinate clauses introduced by a subordinator are typical of German. Less common are finite subordinate clauses without an introducing element, participial groups, and infinitival groups introduced by a preposition. In the Romance languages, gerundial, participial and infinitival groups are used as adverbials considerably more often. Unlike in German, they can also have their own subjects, which makes them more similar to finite subordinate clauses.
Negation is a challenge for the grammatical description of German. This already applies to the inventory of negation expressions such as nicht, kein or niemand. How do they relate to one another, and when is which negation expression chosen? In most sentences the negation particle nicht can occupy different positions, which is accompanied by subtle differences in meaning. The exact syntactic status of nicht is still disputed today. Negation also interacts closely with information structure, which is expressed, among other things, by intonation and accentuation. The intonation of negated utterances and its effects on meaning are treated especially thoroughly in this book. Finally, important questions remain to be clarified about the meaning of negation itself, including which semantic objects can be negated at all and what exactly their negation brings about.
The book attempts a comprehensive overview of the grammar of negation in German that is meant to be accessible to specialists, to students and to readers with a general interest in language, such as teachers of German as a native and foreign language. The conceptual and methodological prerequisites of all parts are introduced in a reader-friendly way. As a result, the book can also be used as a textbook for the areas of syntax, information structure and sentence semantics of German in linguistics degree programmes.
Introduction
(2012)
Proceeding from the central ideas of the papers contained in this volume, the closing article sets out to achieve a unified theory of the syntax and semantics of verum focus, to be illustrated for the sentence and clause types of present day German. In German, verum focus is realized phonologically by means of pitch accents on morphosyntactic exponents of various classes: finite verb forms, complementizers and subordinators, interrogative and relative phrases, and modal particles. In the first half of the article, these constituents, most of which reside in the left periphery of the sentence or clause, are shown to share the grammatical function of distinguishing between sentence moods and other categories of clauses. This observation gives rise to the assumption that verum focus should be explicable as contrastive focus on semantically distinctive features or components of sentence mood and clause type. In the second half of the article this assumption is spelt out for the sentence and clause types of German. We propose a universal semantic structure of sentence meaning which makes it possible to reduce the most typical cases of verum focus and their diverse contextual interpretations to highlighting the connection between the sentence/clause and its textual or discourse environment. This connection is syntactically implemented by an element occupying the head position of CP: either a finite verb form or a complementizer/subordinator. Realizations of verum focus on prefield constituents in wh- and relative clauses are explained as phonetic remedies deployed when a connecting element in C° is missing. Focusing of modal particles in the middle field and of verb forms in the right periphery of the clause are shown to differ semantically from verum focus stricto sensu, although they have similar pragmatic effects.
The theory is built exclusively on assumptions needed for independent reasons and dispenses with the problematic verum operator assumed in most traditional accounts.
The article takes up the topic of the syntax and semantics of German and Italian subordinators, using während and mentre as examples. It arose from a cooperation project between the Institut für Deutsche Sprache in Mannheim and the Dipartimento di Studi Umanistici of the Università del Piemonte Orientale in Vercelli. The aim of the project is the comparative description of syntactic, semantic and text-structural or information-structural properties of sentence connectors.
A formal narrative representation is a procedure assigning a formal description to a natural language narrative. One of the goals of the computational models of narrative community is to understand this procedure better in order to automatize it. A formal framework fit for automatization should allow for objective and reproducible representations. In this paper, we present empirical work focussing on objectivity and reproducibility of the formal framework by Vladimir Propp (1928). The experiments consider Propp’s formalization of Russian fairy tales and formalizations done by test subjects in the same formal framework; the data show that some features of Propp’s system such as the assignment of the characters to the dramatis personae and some of the functions are not easy to reproduce.
This paper presents the application of the <tiger2/> format to various linguistic scenarios with the aim of making it the standard serialisation for the ISO 24615 [1] (SynAF) standard. After outlining the main characteristics of both the SynAF metamodel and the <tiger2/> format, as extended from the initial Tiger XML format [2], we show through a range of different language families how <tiger2/> covers a variety of constituency and dependency based analyses.
This paper describes the status of the standardization efforts of a Component Metadata approach for describing Language Resources with metadata. Different linguistic and Language & Technology (L&T) communities such as CLARIN, META-SHARE and NaLiDa use this component approach and see its standardization as a matter for cooperation that has the potential to create a large interoperable domain of joint metadata. Starting with an overview of the component metadata approach, together with the related semantic interoperability tools and services such as the ISOcat data category registry and the relation registry, we explain the standardization plan and efforts for component metadata within ISO TC37/SC4. Finally, we present information about uptake and plans for the use of component metadata within the three mentioned linguistic and L&T communities.
The paper’s purpose is to give an overview of the work on the Component Metadata Infrastructure (CMDI) that was implemented in the CLARIN research infrastructure. It explains the underlying schema and the accompanying tools and services. It also describes the status and impact of the CMDI developments carried out within the CLARIN project, as well as past and future collaborations with other projects.
Although most of the relevant dictionary productions of the recent past have relied on digital data and methods, there is little consensus on formats and standards. The Institute for Corpus Linguistics and Text Technology (ICLTT) of the Austrian Academy of Sciences has been conducting a number of varied lexicographic projects, both digitising print dictionaries and working on the creation of genuinely digital lexicographic data. This data was designed to serve varying purposes: machine-readability was only one. A second goal was interoperability with digital NLP tools. To achieve this end, a uniform encoding system applicable across all the projects was developed. The paper describes the constraints imposed on the content models of the various elements of the TEI dictionary module and provides arguments in favour of TEI P5 as an encoding system to be used not only for representing digitised print dictionaries but also for NLP purposes.
Precisely characterising the possible interactions between syntax and morphology is one of the central research questions of linguistics. The contracted forms considered here lend themselves to a case study of the syntax-morphology interface, since contracted forms of preposition and article such as "du"/"au" in French or "am"/"zum" in German stand in paradigmatic contrast to sequences in which a non-reduced preposition is combined with a full article ("de la"/"à la"; "an dem"/"zu dem"). Analysing these forms therefore requires examining to what extent the use of contracted forms, as opposed to unreduced sequences, entails changes in the syntax. In this contribution I will show that the interrelations between contracted form and syntax differ in nature in French and German. French and German contracted forms differ in their morphological, semantic and syntactic properties. Two properties are examined more closely here: (i) the combinability of contracted forms with noun phrases containing restrictive relative clauses in German ("?im/in dem Haus, das gerade renoviert wird"), and (ii) the coordination possibilities of prepositional phrases with contracted forms in French and German. It is well known that contracted forms in French restrict the coordination possibilities of the noun phrases involved. Comparable interactions between contracted form and coordination cannot be observed in German, however, as coordination data from the COSMAS II corpus show.
In my dissertation "Migration, Sprache und Rassismus" (Migration, Language and Racism), published in 2010, I used ethnographic, conversation-analytic and conversation-rhetorical methods to investigate the communicative style of two academic migrant milieus ("emancipatory migrants" and "academic Euro-Turks") in Germany. The study was part of the project "Deutschtürkische Sprachvariation und die Herausbildung kommunikativer Stile in dominant türkischen Migrantengruppen" (German-Turkish language variation and the emergence of communicative styles in predominantly Turkish migrant groups), which was carried out at the Institut für Deutsche Sprache.
In this paper, we describe MLSA, a publicly available multi-layered reference corpus for German-language sentiment analysis. The construction of the corpus is based on the manual annotation of 270 German-language sentences considering three different layers of granularity. The sentence-layer annotation, as the most coarse-grained annotation, focuses on aspects of objectivity, subjectivity and the overall polarity of the respective sentences. Layer 2 is concerned with polarity on the word- and phrase-level, annotating both subjective and factual language. The annotations on Layer 3 focus on the expression-level, denoting frames of private states such as objective and direct speech events. These three layers and their respective annotations are intended to be fully independent of each other. At the same time, exploring and discovering interactions that may exist between different layers should also be possible. The reliability of the respective annotations was assessed using the average pairwise agreement and Fleiss’ multi-rater measures. We believe that MLSA is a beneficial resource for sentiment analysis research, algorithms and applications that focus on the German language.
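The two reliability measures named in the abstract above can be illustrated with a minimal sketch. The abstract does not give the exact formulas used for MLSA, so this assumes the standard definitions of average pairwise agreement and Fleiss' kappa with the same number of annotators per item; the function names and the toy labels are hypothetical and are not MLSA data.

```python
from itertools import combinations


def average_pairwise_agreement(ratings):
    """Share of annotator pairs that chose the same label, over all items.

    ratings: list of items, each a list with one label per annotator.
    """
    agree = 0
    total = 0
    for item in ratings:
        for a, b in combinations(item, 2):
            agree += int(a == b)
            total += 1
    return agree / total


def fleiss_kappa(ratings):
    """Fleiss' kappa, assuming every item has the same number of annotators."""
    n_items = len(ratings)
    n_raters = len(ratings[0])
    categories = sorted({label for item in ratings for label in item})
    # counts[i][j] = number of annotators who gave item i the category j
    counts = [[item.count(c) for c in categories] for item in ratings]
    # Observed agreement per item, averaged over items
    p_items = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_bar = sum(p_items) / n_items
    # Chance agreement from the overall category proportions
    p_cat = [
        sum(row[j] for row in counts) / (n_items * n_raters)
        for j in range(len(categories))
    ]
    p_e = sum(p * p for p in p_cat)
    return (p_bar - p_e) / (1 - p_e)


# Hypothetical toy data: 4 sentences, 3 annotators, polarity labels
toy_ratings = [
    ["pos", "pos", "pos"],
    ["neg", "neg", "pos"],
    ["neu", "neu", "neu"],
    ["neg", "neg", "neg"],
]
print(average_pairwise_agreement(toy_ratings))
print(fleiss_kappa(toy_ratings))
```

Raw pairwise agreement does not correct for agreement expected by chance, whereas kappa does, which is presumably why the two are reported together.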
In Fachsprache 1–2/2011 Czicza and Hennig proposed a model that explains correlations between grammatical features and pragmatic conditions in scientific communication. This model now serves as a basis for the practical analysis of the degree of scientificity of any written text. The authors present a method of analyzing written texts with respect to the four parameters 'economy', 'precision', 'impersonalization' and 'discussion'. The method is developed through the analysis of a prototypical scientific article on the one hand and a non-scientific text on the other. The two texts serve as the two poles of the scale of scientificity. Finally, the applicability of the model and its operationalization is illustrated by the analysis of two example texts located between the two poles (one popular scientific text and one juridical teaching article).
We report an ethnographic and field-experiment-based study of time intervals in Amondawa, a Tupi language and culture of Amazonia. We analyse two Amondawa time interval systems based on natural environmental events (seasons and days), as well as the Amondawa system for categorising lifespan time (“age”). Amondawa time intervals are exclusively event-based, as opposed to time-based (i.e. they are based on event-duration, rather than measured abstract time units). Amondawa has no lexicalised abstract concept of time and no practices of time reckoning, as conventionally understood in the anthropological literature. Our findings indicate that not only are time interval systems and categories linguistically and culturally specific, but that they do not depend upon a universal “concept of time”. We conclude that the abstract conceptual domain of time is not a human cognitive universal, but a cultural historical construction, semiotically mediated by symbolic and cultural-cognitive artefacts for time reckoning.
Conversation Analysis (CA) and Discursive Psychology (DP) reject the view that assumptions about cognitive processes should be used to account for discursive phenomena. Instead, cognitive issues are respecified as discursive phenomena. Discursive psychologists do this by studying discursive practices of talking about mental phenomena and using mental predicates. This approach is exemplified by a study of the use of constructions with German verstehen (‘to understand’) in conversation. Some conversation analysts take another approach, namely, inquiring into how participants display mental states in talk-in-interaction. This is exemplified by a study of how grammatical constructions are used to display different types of inferences drawn from a partner’s prior turn. It will be argued that the constructivist, anti-essentialist stance which CA and DP take with regard to cognition is a fruitful line of research, which has much in its favor from a methodological point of view. However, it can be shown that tacit assumptions about cognitive processes are still inevitable when doing CA and DP. As a conclusion, the paper pleads for an enhanced awareness of how cognitive processes come into play when analysing talk-in-interaction and it advocates the integration of a more explicit cognitive perspective into research on talk-in-interaction.
The present contribution is not intended to continue this discussion about the sense, nonsense and definition of the category "sentence" as the basic unit of spoken language. Rather, I want to set out briefly in what way a traditional concept of the sentence is, in my view, relevant for the analysis of spoken language, and how it relates to conversation-analytic categories such as "turn" and "turn-constructional unit". This serves only as a prerequisite, however, for then reversing the traditional direction of inquiry: instead of asking why non-sentential structures often occur in conversation, I start from the finding that a large proportion of turns consist of non-sentential structures and ask, conversely, why sentences (in the sense of the classical definition given at the outset) are used in conversation at all. I look for the key to the answer in the temporal structure of utterance production and in the position that sentences occupy in relation to it.
This paper deals with a case study of a first visit of a person with hearing loss to her family doctor. In the first part of the paper, basic properties of doctor-patient interaction, which are also relevant for treatment of hearing loss, are outlined: the relevance of institutional conditions for interaction, asymmetries between the participants, goal-orientation, specific conditions of trust, and the relevance of the specific genre of doctor-patient interaction. The second part of the paper presents a case study, which focuses on three interactional phenomena: a) the negotiation of the hearing loss as an existential threat to the patient and her identity; b) the discrepancy of illness theories between doctor and patient; c) the collaborative work of negotiating an intersubjectively viable description of the experience of hearing loss.
In developing an interdisciplinary approach integrating Conversation Analysis (“CA”), audiology and User Centered Design, the applied goal of this international collaboration is to analyze real-world social interaction from the perspective of the participants in order to build an empirical basis for innovation in the field of communication with hearing impairment and hearing aid use. In reviewing theory, methodology and analysis of eight cases analyzed in this volume, the editors assess the potential of application for the various stakeholders in communication with hearing loss and hearing aids, including the estimated impact factor. The chapter closes with a consideration of desiderata for future research.
This paper presents the system architecture as well as the underlying workflow of the Extensible Repository System of Digital Objects (ERDO), which has been developed for the sustainable archiving of language resources within the Tübingen CLARIN-D project. In contrast to other approaches that are aimed at archiving experts, the described workflow can be used by researchers without specialist knowledge of long-term storage to transfer data from their local file systems into a persistent repository.
This paper describes the ongoing work to integrate WebLicht into the CLARIN infrastructure. It introduces the CLARIN infrastructure for scholars in the humanities and social sciences as well as WebLicht - an orchestration and execution environment that is built upon Service Oriented Architecture principles. The integration of WebLicht into the CLARIN infrastructure involves adapting it to the standards and practices used within CLARIN, including distributed repositories, CMDI metadata, and persistent identifiers.
Creating and maintaining metadata for various kinds of resources requires appropriate tools to assist the user. The paper presents the metadata editor ProFormA for the creation and editing of CMDI (Component Metadata Infrastructure) metadata in web forms. This editor supports a number of CMDI profiles currently being provided for different types of resources. Since the editor is based on XForms and server-side processing, users can create and modify CMDI files in their standard browser without the need for further processing. Large parts of ProFormA are implemented as web services in order to reuse them in other contexts and programs.
Recipes have existed for as long as people have been cooking. They have been written down for as long as people have been interested in writing them down: as a reminder for themselves, to inform others, or to fix a shared tradition. How recipes are written down has changed constantly over the centuries. For example, German recipe writers have tried out a wide variety of verb forms. The adhortative subjunctive is famous: Man nehme ... ("One take ..."). This contribution deals with this and many other verb forms in recipes.