Refine
Year of publication
Document Type
- Part of a Book (99)
- Article (85)
- Conference Proceeding (30)
- Book (2)
- Other (2)
- Review (2)
Is part of the Bibliography
- no (220) (remove)
Keywords
- Deutsch (68)
- Konversationsanalyse (22)
- Computerlinguistik (20)
- Englisch (15)
- Korpus <Linguistik> (15)
- Mehrsprachigkeit (14)
- Interaktion (13)
- Semantik (13)
- Automatische Sprachanalyse (11)
- Sprachpolitik (11)
Publicationstate
- Postprint (220) (remove)
Reviewstate
- (Verlags)-Lektorat (94)
- Peer-Review (86)
- Peer-review (6)
- Verlags-Lektorat (3)
- Review-Status-unbekannt (1)
- Zweitveröffentlichung (1)
Publisher
- Benjamins (35)
- Springer (29)
- Elsevier (9)
- Oxford University Press (9)
- Lang (5)
- Sage (4)
- Wiley (4)
- Winter (4)
- Association for Computing Machinery (3)
- Edinburgh University Press (3)
"Standard language" is a contested concept, ideologically, empirically and theoretically. This is particularly true for a language such as German, where the standardization of the spoken language was based on the written standard and was established with respect to a communicative situation, i.e. public speech on stage (Bühnenaussprache), which most speakers never come across. As a consequence, the norms of the oral standard exhibit many features which are infrequent in the everyday speech even of educated speakers. This paper discusses ways to arrive at a more realistic conception of (spoken) standard German, which will be termed "standard usage". It must be founded on empirical observations of speakers linguistic choices in everyday situations. Arguments in favor of a corpus-based notion of standard have to consider sociolinguistic, political, and didactic concerns. We report on the design of a large study of linguistic variation conducted at the Institute for the German Language (project "Variation in Spoken German", Variation des gesprochenen Deutsch) with the aim of arriving at a representative picture of "standard usage" in contemporary German. It systematically takes into account both diatopic variation covering the multi-national space in which German an official language, and diastratic variation in terms of varying degrees of formality. Results of the study of phonetic and morphosyntactic variation are discussed. At least for German, a corpus-based notion of "standard usage" inevitably includes some degree of pluralism concerning areal variation, and it needs to do justice to register-based variation as well.
Authors like Fillmore 1986 and Goldberg 2006 have made a strong case for regarding argument omission in English as a lexical and construction-based affordance rather than one based on general semantico-pragmatic constraints. They do not, however, address the question of how grammatical restrictions on null complementation might interact with broader narrative conventions, in particular those of genre. In this paper, we attempt to remedy this oversight by presenting a comprehensive overview of genre-based argument omissions and offering a construction-based analysis of genre-based omission conventions. We consider five genre-based omission types: instructional imperatives (Culy 1996, Bender 1999), labelese, diary style (Haegeman 1990), match reports (Ruppenhofer 2004) and quotative clauses. We show that these omission types share important traits; all, for example, have anaphoric rather than indefinite construals. We also show, however, that the omission types differ from each other in idiosyncratic ways. We then address several interrelated representational problems posed by the grammatical treatment of genre-based omissions. For example, the constructions that represent genre-based omission conventions must interact with the lexical entries of verbs, many of which do not generally permit omitted arguments. Accordingly, we offer constructional analyses of genre-based omissions that allow constructions to override lexical valence constraints.
Our paper deals with the use of ICH WEIß NICHT (‘I don’t know’) in German talk-in-interaction. Pursuing an Interactional Linguistics approach, we identify different interactional uses of ICH WEIß NICHT and discuss their relationship to variation in argument structure (SV (O), (O)VS, V-only). After ICH WEIß NICHT with full complementation, speakers emphasize their lack of knowledge or display reluctance to answer. In contrast, after variants without an object complement, in contrast, speakers display uncertainty about the truth of the following proposition or about its sufficiency as an answer. Thus, while uses with both subject and object tend to close a sequence or display lack of knowledge, responses without an object, in contrast, function as a prepositioned epistemic hedge or a pragmatic marker framing the following TCU. When ICH WEIß NICHT is used in response to a statement, it indexes disagreement (independently from all complementation patterns).
In this paper we present an evaluation of rule-based morphological components for German for use in an interactive editing environment. The criteria for the evaluation are deduced from the intended use of these components, namely availability, performance, programming interfaces, and analysis quality. We evaluated systems developed and maintained since decades as well as new systems. However, we note serious general shortcomings when looking closer at recent implementations and come to the conclusion that the oldest system is the only one that satisfies our requirements.
This paper is concerned with a novel methodology for generating phonetic questions used in tree-based state tying for speech recognition. In order to implement a speech recognition system, language-dependent knowledge which goes beyond annotated material is usually required. The approach presented here generates phonetic questions for decision trees are based on a feature table that summarizes the articulatory characteristics of each sound. On the one hand, this method allows better language-specific triphone models to be defined given only a feature-table as linguistic input. On the other hand, the feature-table approach facilitates efficient definition of triphone models for other languages since again only a feature table for this language is required. The approach is exemplified with speech recognition systems for English and Thai.
The English language has taken advantage of the Digital Revolution to establish itself as the global language; however, only 28.6 %of Internet users speak English as their native language. Machine Trans-lation (MT) is a powerful technology that can bridge this gap. In devel-opment since the mid-20th century, MT has become available to every Internet user in the last decade, due to free online MT services. This paper aims to discuss the implications that these tools may have for the privacy of their users and how they are addressed by EU data protec-tion law. It examines the data-flows in respect of the initial processing (both from the perspective of the user and the MT service provider) and potential further processing that may be undertaken by the MT service provider.
The paper presents the results of a joint effort of a group of multimodality researchers and tool developers to improve the interoperability between several tools used for the annotation and analysis of multimodality. Each of the tools has specific strengths so that a variety of different tools, working on the same data, can be desirable for project work. However this usually requires tedious conversion between formats. We propose a common exchange format for multimodal annotation, based on the annotation graph (AG) formalism, which is supported by import and export routines in the respective tools. In the current version of this format the common denominator information can be reliably exchanged between the tools, and additional information can be stored in a standardized way.
This article describes an English Zulu learners’ dictionary that is part of a larger set of information tools, namely an online Zulu course, an e-dictionary of possessives (which was implemented earlier) accompanied by training software offering translation tasks on several levels, and an ontology of morphemic items categorizing and describing all parts of speech of Zulu. The underlying lexicographic database contains the usual type of lexicographic data, such as translation equivalents and their respective morphosyntactic data, but its entries have been extended with data related to the lessons of the online course in order to enable the learner to link both tools autonomously. The ‘outer matter’ is integrated into the website in the form of several texts on additional web pages (how-to-use, typical outputs, grammar tables, information on morphosyntactic rules, etc.). The dictionary comprises a modular system, where each module fulfils one of the necessary functions.
Just like most varieties of West Germanic, virtually all varieties of German use a construction in which a cognate of the English verb 'do' (standard German 'tun') functions as an auxiliary and selects another verb in the bare infinitive, a construction known as 'do'-periphrasis or 'do'-support. The present paper provides an Optimality Theoretic (OT) analysis of this phenomenon. It builds on a previous analysis by Bader and Schmid (An OT-analysis of 'do'-support in Modern German, 2006) but (i) extends it from root clauses to subordinate clauses and (ii) aims to capture all of the major distributional patterns found across (mostly non-standard) varieties of German. In so doing, the data are used as a testing ground for different models of German clause structure. At first sight, the occurrence of 'do' in subordinate clauses, as found in many varieties, appears to support the standard CP-IP-VP analysis of German. In actual fact, however, the full range of data turn out to challenge, rather than support, this model. Instead, I propose an analysis within the IP-less model by Haider (Deutsche Syntax - generativ. Vorstudien zur Theorie einer projektiven Grammatik, Narr, Tübingen, 1993 et seq.). In sum, the 'do'-support data will be shown to have implications not only for the analysis of clause structure but also for the OT constraints commonly assumed to govern the distribution of 'do', for the theory of non-projecting words (Toivonen in Non-projecting words, Kluwer, Dordrecht, 2003) as well as research on grammaticalization.
Although there is a growing interest of policy makers in higher education issues (especially on an international scale), there is still a lack of theoretically well-grounded comparative analyses of higher education policy. Even broadly discussed topics in higher education research like the potential convergence of European higher education systems in the course of the Bologna Process suffer from a thin empirical and comparative basis. This paper aims to deal with these problems by addressing theoretical questions concerning the domestic impact of the Bologna Process and the role national factors play in determining its effects on cross-national policy convergence. It develops a distinct theoretical approach for the systematic and comparative analysis of cross-national policy convergence. In doing so, it relies upon insights from related research areas — namely literature on Europeanization as well as studies dealing with cross-national policy convergence.
MRI data of German vowels and consonants was acquired for 9 speakers. In this paper tongue contours for the vowels were analyzed using the three-mode factor analysis technique PARAFAC. After some difficulties, probably related to what constitutes an adequate speaker sample for this three-mode technique to work, a stable two-factor solution was extracted that explained about 90% of the variance. Factor 1 roughly captured the dimension low back to high front; Factor 2 that from mid front to high back. These factors are compared with earlier models based on PARAFAC. These analyses were based on midsagittal contours; the paper concludes by illustrating from coronal and axial sections how non-midline information could be incorporated into this approach.
Dieser Artikel analysiert am Beispiel eines Racletteessens unter Freunden, wie innerhalb einer langen Sequenz das Warten auf den Beginn des Essens strukturiert wird. Während der fast 50 Minuten, die zwischen der Ankunft der ersten Gäste sowie dem Beginn des Essens vergehen, orientieren sich die Teilnehmer auf unterschiedliche Weise zum Warten als Aktivität. Das sukzessive Eintreffen der Gäste führt jeweils zu Eröffnungssequenzen innerhalb dieser Wartezeit. Anhand von Auszügen dieser Zeitspanne verfolgt die Analyse, wie sich die Teilnehmer zu dieser Zeitlichkeit des Wartens und (Noch-nicht-)Beginnens orientieren und wie sie den Anfang des Essens gemeinsam konstruieren.
We continue the study of the reproducibility of Propp’s annotations from Bod et al. (2012). We present four experiments in which test subjects were taught Propp’s annotation system; we conclude that Propp’s system needs a significant amount of training, but that with sufficient time investment, it can be reliably trained for simple tales.
Antonymy is a relation of lexical opposition which is generally considered to involve (i) the presence of a scale along which a particular property may be graded, and hence both (ii) gradability of the corresponding lexical items and (iii) typical entailment relations. Like other types of lexical opposites, antonyms typically differ only minimally: while denoting opposing poles on the relevant dimension of difference, they are similar with respect to other components of meaning. This paper presents examples of antonymy from the domain of speech act verbs which either lack some of these typical attributes or show problems in the application of these. It discusses several different proposals for the classification of these atypical examples.
HMMs are the dominating technique used in speech recognition today since they perform well in overall phone recognition. In this paper, we show the comparison of HMM methods and machine learning techniques, such as neural networks, decision trees and ensemble classifiers with boosting and bagging in the task of articulatory-acoustic feature classification. The experimental results show that HMM methods work well for the classification of such features as vocalic. However, decision tree and bagging outperform HMMs for the fricative classification task since the data skewness is much higher than for the feature vocalic classification task. This demonstrates that HMMs do not perform as well as decision trees and bagging in highly skewed data settings.
Precise multimodal studies require precise synchronisation between audio and video signals. However, raw audio and audio from video recordings can be out of sync for several reasons. In order to re-synchronise them, a dynamic programming (DP) approach is presented here. Traditionally, DP is performed on the rectangular distance matrix comparing each value in signal A with each value in signal B. Previous work limited the search space using for example the Sakoe Chiba Band (Sakoe and Chiba, 1978). However, the overall space of the distance matrix remains identical. Here, a tunnel matrix and its according DP-algorithm are presented. The matrix contains merely the computed distance of two signals to a pre-specified bandwidth and the computational cost is equally reduced. An example implementation demonstrates the functionality on artificial data and on data from real audio and video recordings.
Automatic recognition of speech, thought, and writing representation in German narrative texts
(2013)
This article presents the main results of a project, which explored ways to recognize and classify a narrative feature—speech, thought, and writing representation (ST&WR)—automatically, using surface information and methods of computational linguistics. The task was to detect and distinguish four types—direct, free indirect, indirect, and reported ST&WR—in a corpus of manually annotated German narrative texts. Rule-based as well as machine-learning methods were tested and compared. The results were best for recognizing direct ST&WR (best F1 score: 0.87), followed by indirect (0.71), reported (0.58), and finally free indirect ST&WR (0.40). The rule-based approach worked best for ST&WR types with clear patterns, like indirect and marked direct ST&WR, and often gave the most accurate results. Machine learning was most successful for types without clear indicators, like free indirect ST&WR, and proved more stable. When looking at the percentage of ST&WR in a text, the results of machine-learning methods always correlated best with the results of manual annotation. Creating a union or intersection of the results of the two approaches did not lead to striking improvements. A stricter definition of ST&WR, which excluded borderline cases, made the task harder and led to worse results for both approaches.
Feminine forms of job titles raise great interest in many countries. However, it is still unknown how they shape stereotypical impressions on warmth and competence dimensions among female and male listeners. In an experiment with fictitious job titles men perceived women described with feminine job titles as significantly less warm and marginally less competent than women with masculine job titles, which led to lower willingness to employ them. No such effects were observed among women.
This paper provides a unified semantic and discourse pragmatic analysis of the German particle nämlich, traditionally described as having a specificational and an explanative reading. Our claim is that nämlich is a discourse marker which signals that the expression it is attached to is a short (elliptic) answer to a salient implicit question about the previous utterance. We show how both the explanative and the specificational reading can be derived from this more general semantic contribution. In addition we discuss some cross linguistic consequences of our analysis.
Beyond the stars: exploiting free-text user reviews to improve the accuracy of movie recommendations
(2009)
In this paper we show that the extraction of opinions from free-text reviews can improve the accuracy of movie recommendations. We present three approaches to extract movie aspects as opinion targets and use them as features for the collaborative filtering. Each of these approaches requires different amounts of manual interaction. We collected a data set of reviews with corresponding ordinal (star) ratings of several thousand movies to evaluate the different features for the collaborative filtering. We employ a state-of-the-art collaborative filtering engine for the recommendations during our evaluation and compare the performance with and without using the features representing user preferences mined from the free-text reviews provided by the users. The opinion mining based features perform significantly better than the baseline, which is based on star ratings and genre information only.
Blogg Dir deinen Urlaub nach Tunesien! Zur Erläuterung des Musters [VImp PROPReflexivDat NPAkk]
(2020)
In diesem Beitrag soll das Muster [VImp PROPReflexivDat NPAkk] semantisch und syntaktisch erläutert werden. Dieses Muster, das semantisch mit Verben des Erwerbens wie anschaffen korreliert, wird auch im Zusammenhang mit Kommunikationsverben wie bloggen und facebooken sowie mit dem Kontaktverb rubbeln belegt. Mithilfe des Konzeptes der Koerzion bzw. der semantischen Anpassung soll das Kovorkommen des erwänhten Musters mit diesen Verben beschrieben und erklärt werden. Als empirische Quelle dient das Korpus für das Deutsche 2012 und 2014 aus den Corpora from the Web. Die vorliegende Untersuchung ist im Rahmen meiner Dissertationsarbeit zum Thema Argumentstruktur und Bedeutung medialer Kommunikationsverben des Deutschen und des Spanischen im Sprachvergleich durchgeführt worden.
In this article, we examine the effectiveness of bootstrapping supervised machine-learning polarity classifiers with the help of a domain-independent rule-based classifier that relies on a lexical resource, i.e., a polarity lexicon and a set of linguistic rules. The benefit of this method is that though no labeled training data are required, it allows a classifier to capture in-domain knowledge by training a supervised classifier with in-domain features, such as bag of words, on instances labeled by a rule-based classifier. Thus, this approach can be considered as a simple and effective method for domain adaptation. Among the list of components of this approach, we investigate how important the quality of the rule-based classifier is and what features are useful for the supervised classifier. In particular, the former addresses the issue in how far linguistic modeling is relevant for this task. We not only examine how this method performs under more difficult settings in which classes are not balanced and mixed reviews are included in the data set but also compare how this linguistically-driven method relates to state-of-the-art statistical domain adaptation.
Complex common names such as Indian elephant or green tea denote a certain type of entity, viz. kinds. Moreover, those kinds are always subkinds of the kind denoted by their head noun. Establishing such subkinds is essentially the task of classifying modifiers that are a defining trait of endocentrically structured complex common names. Examining complex common names of different lexico-syntactic types(NN compounds, N+N syntagmas, NP/PP syntagmas, A+N syntagmas) and from different languages (particularly English, German and French) it can be shown that complex common names are subject to language- independent formal and semantic constraints. In particular, complex common names qualify as name-like expressions in that they tend to be deficient in terms of formal complexity and semantic compositionality.
Within cognitive linguistics, there is an increasing awareness that the study of linguistic phenomena needs to be grounded in usage. Ideally, research in cognitive linguistics should be based on authentic language use, its results should be replicable, and its claims falsifiable. Consequently, more and more studies now turn to corpora as a source of data. While corpus-based methodologies have increased in sophistication, the use of corpus data is also associated with a number of unresolved problems. The study of cognition through off-line linguistic data is, arguably, indirect, even if such data fulfils desirable qualities such as being natural, representative and plentiful. Several topics in this context stand out as particularly pressing issues. This discussion note addresses (1) converging evidence from corpora and experimentation, (2) whether corpora mirror psychological reality, (3) the theoretical value of corpus linguistic studies of ‘alternations’, (4) the relation of corpus linguistics and grammaticality judgments, and, lastly, (5) the nature of explanations in cognitive corpus linguistics. We do not claim to resolve these issues nor to cover all possible angles; instead, we strongly encourage reactions and further discussion.
Communication of stereotypes in the classroom: biased language use of German and Turkish adolescents
(2014)
Little is known about the linguistic transmission and maintenance of mutual stereotypes in interethnic contexts. This field study, therefore, investigated the linguistic expectancy bias (LEB) and the linguistic intergroup bias (LIB) among German and Turkish adolescents (13 to 20 years) in the school context. The LEB refers to the general phenomenon of describing stereotypes more abstractly. The LIB is the tendency to use language abstraction for in-group protective reasons. Results revealed an unmoderated LEB, whereas the LIB only occurred when foreigners were in the numerical majority, the classroom composition was perceived as a learning disadvantage, or the interethnic conflict frequency was high. These findings provide first evidence for the use of both LEB and LIB in an interethnic classroom setting.
Most research on ethnicity has focused on visual cues. However, accents are strong social cues that can match or contradict visual cues. We examined understudied reactions to people whose one cue suggests one ethnicity, whereas the other cue contradicts it. In an experiment conducted in Germany, job candidates spoke with an accent either congruent or incongruent with their (German or Turkish) appearance. Based on ethnolinguistic identity theory, we predicted that accents would be strong cues for categorization and evaluation. Based on expectancy violations theory we expected that incongruent targets would be evaluated more extremely than congruent targets. Both predictions were confirmed: accents strongly influenced perceptions and Turkish-looking German-accented targets were perceived as most competent of all targets (and additionally most warm). The findings show that bringing together visual and auditory information yields a more complete picture of the processes underlying impression formation.
Several studies have examined effects of explicit task demands on eye movements in reading. However, there is relatively little prior research investigating the influence of implicit processing demands. In this study, processing demands were manipulated by means of a between-subject manipulation of comprehension question difficulty. Consistent with previous results from Wotschack and Kliegl, the question difficulty manipulation influenced the probability of regressing from late in sentences and re-reading earlier regions; readers who expected difficult comprehension questions were more likely to re-read. However, this manipulation had no reliable influence on eye movements during first-pass reading of earlier sentence regions. Moreover, for the subset of sentences that contained a plausibility manipulation, the disruption induced by implausibility was not modulated by the question manipulation. We interpret these results as suggesting that comprehension demands influence reading behavior primarily by modulating a criterion for comprehension that readers apply after completing first-pass processing.
In this paper the authors briefly outline editing functions which use methods from computational linguistics and take the structures of natural languages into consideration. Such functions could reduce errors and better support writers in realizing their communicative goals. However, linguistic methods have limits, and there are various aspects software developers have to take into account to avoid creating a solution looking for a problem: Language-aware functions could be powerful tools for writers, but writers must not be forced to adapt to their tools.
Content analysis provides a useful and multifaceted, methodological framework for Twitter analysis. CAQDAS tools support the structuring of textual data by enabling categorising and coding. Depending on the research objective, it may be appropriate to choose a mixed-methods approach that combines quantitative and qualitative elements of analysis and plays out their respective advantages to the greatest possible extent while minimising their shortcomings. In this chapter, we will discuss CAQDAS speech act analysis of tweets as an example of software-assisted content analysis. We start with some elementary thoughts on the challenges of the collection and evaluation of Twitter data before we give a brief description of the potentials and limitations of using the software QDA Miner (as one typical example for possible analysis programmes). Our focus will lie on analytical features that can be particularly helpful in speech act analysis of tweets.
This paper deals with different views of lexical semantics. The focus is on the relationship between lexical expressions and conceptual components. First the assumptions about lexicalization and decompositionality of concepts shared by the most semanticists are presented, followed by a discussion of the differences between two-level-semantics and one-level-semantics. The final part is concentrated on the interpretation of conceptual components in situations of communication.
Are borrowed neologisms accepted more slowly into the German language than German words resulting from the application of wrd formation rules? This study addresses this question by focusing on two possible indicators for the acceptance of neologisms: a) frequency development of 239 German neologisms from the 1990s (loanwords as well as new words resulting from the application of word formation rules) in the German reference corpus DEREKO and b) frequency development in the use of pragmatic markers (‘flags’, namely quotation marks and phrases such as sogenannt ‘so-called’) with these words. In the second part of the article, a psycholinguistic approach to evaluating the (psychological) status of different neologisms and non-words in an experimentally controlled study and plans to carry out interviews in a field test to collect speakers’ opinions on the acceptance of the analysed neologisms are outlined. Finally, implications for the lexicographic treatment of both types of neologisms are discussed.
Ancient Chinese poetry is constituted by structured language that deviates from ordinary language usage; its poetic genres impose unique combinatory constraints on linguistic elements. How does the constrained poetic structure facilitate speech segmentation when common linguistic and statistical cues are unreliable to listeners in poems? We generated artificial Jueju, which arguably has the most constrained structure in ancient Chinese poetry, and presented each poem twice as an isochronous sequence of syllables to native Mandarin speakers while conducting magnetoencephalography (MEG) recording. We found that listeners deployed their prior knowledge of Jueju to build the line structure and to establish the conceptual flow of Jueju. Unprecedentedly, we found a phase precession phenomenon indicating predictive processes of speech segmentation—the neural phase advanced faster after listeners acquired knowledge of incoming speech. The statistical co-occurrence of monosyllabic words in Jueju negatively correlated with speech segmentation, which provides an alternative perspective on how statistical cues facilitate speech segmentation. Our findings suggest that constrained poetic structures serve as a temporal map for listeners to group speech contents and to predict incoming speech signals. Listeners can parse speech streams by using not only grammatical and statistical cues but also their prior knowledge of the form of language.
One major issue in the accomplishment of contrasts in conversation is lexical choice of items which carry the semantic Ioad of the two states of affair which are represented as being opposed to one another. These items or expressions are co-selected to be understood as being contrastively related to each other. In this paper, it is argued that the activity of contrasting itself provides them with a specific local opposite meaning which they would not obtain in other contexts. Practices of contrastingare thus seen as an example of conversational activities which creatively and systematically affect situated meanings. Basedon data from various genres, such as meetings, mediation sessions and conversations, the paper discusses two practices of contrasting, their sequential construction and their interpretative effects. It is concluded that the interpretative effects of conversational contrasting rest on the sequential deployment oflinguistic resources and on the cognitive procedures of frame-based interpretation and constructing a maximally contrastive interpretation for the co-selected expressions.
In the context of the HyTex project, our goal is to convert a corpus into a hypertext, basing conversion strategies on annotations which explicitly mark up the text-grammatical structures and relations between text segments. Domain-specific knowledge is represented in the form of a knowledge net, using topic maps. We use XML as an interchange format. In this paper, we focus on a declarative rule language designed to express conversion strategies in terms of text-grammatical structures and hypertext results. The strategies can be formulated in a concise formal syntax which is independend of the markup, and which can be transformed automatically into executable program code.
Coronaparty, Jo-jo-Lockdown und Mask-have – Wortschatzerweiterung während des Corona-Stillstands
(2021)
Das Bild von der 'Sprache der DDR' in der alten Bundesrepublik oder: Haben sie so gesprochen?
(2004)
In diesem Beitrag geht es vor allem um die Frage, wie das Smartphone in der Alltagskommunikation als gemeinsamer Bezugspunkt relevant gemacht wird und wie sich die Reaktionen der Interagierenden zum auf dem Display Gezeigten gestalten. Es zeigt sich, dass diese in mehrere responsive Schritte unterteilt werden, in denen die Aufmerksamkeit gebündelt und das Display fokussiert wird sowie eine Abstimmung darüber erfolgt, wie das Gezeigte zu kontextualisieren ist.
Contemporary studies on the characteristics of natural language benefit enormously from the increasing amount of linguistic corpora. Aside from text and speech corpora, corpora of computer-mediated communication (CMC) Position themselves between orality and literacy, and beyond that provide in- sight into the impact of "new", mainly intemet-based media on language beha- viour. In this paper, we present an empirical attempt to work with annotated CMC corpora for the explanation of linguistic phenomena. In concrete terms, we implement machine leaming algorithms to produce decision trees that reveal rules and tendencies about the use of genitive markers in German.
This paper deals with multiword lexemes (MWLs), focussing on two types of verbal MWLs: verbal idioms and support verb constructions. We discuss the characteristic properties of MWLs, namely nonstandard compositionality, restricted substitutability of components, and restricted morpho-syntactic flexibility, and we show how these properties may cause serious problems during the analysis, generation, and transfer steps of machine translation systems. In order to cope with these problems, MT lexicons need to provide detailed descriptions of MWL properties. We list the types of information which we consider the necessary minimum for a successful processing of MWLs, and report on some feasibility studies aimed at the automatic extraction of German verbal multiword lexemes from text corpora and machine-readable dictionaries.
In this article, we explore the feasibility of extracting suitable and unsuitable food items for particular health conditions from natural language text. We refer to this task as conditional healthiness classification. For that purpose, we annotate a corpus extracted from forum entries of a food-related website. We identify different relation types that hold between food items and health conditions going beyond a binary distinction of suitability and unsuitability and devise various supervised classifiers using different types of features. We examine the impact of different task-specific resources, such as a healthiness lexicon that lists the healthiness status of a food item and a sentiment lexicon. Moreover, we also consider task-specific linguistic features that disambiguate a context in which mentions of a food item and a health condition co-occur and compare them with standard features using bag of words, part-of-speech information and syntactic parses. We also investigate in how far individual food items and health conditions correlate with specific relation types and try to harness this information for classification.
Dieser Beitrag vergleicht die Ansätze ,Linguistic Landscapes' (LL) und ,Spot German' (SG) in Hinblick auf ihr Potenzial für die Untersuchung des Vorkommens und der Funktionen der deutschen Sprache in Regionen außerhalb des deutschsprachigen Kerngebietes. Als Beispiele wurden eine LL-Studie im Baltikum sowie eine SG-Untersuchung auf Zypern gewählt. Der Vergleich zeigt, dass beide Methoden - trotz ihrer unterschiedlichen Präzision - ähnliche Aussagen zur Rolle des Deutschen erlauben: In beiden Ländern erscheint Deutsch als „Ergänzungssprache“ zu den gesellschaftlichen Hauptsprachen in bestimmten Nischen, z.B. im Tourismus und in Verbindung mit bestimmten Firmen und Produkten.
When collecting linguistic data using translation tasks, stimuli can be presented in written or in oral form. In doing so, there is a possibility that a systematic source of error can occur that can be traced back to the selected survey method and which can influence the results of the translation tasks. This contribution investigates whether and to what extent both of the aforementioned survey methods result in divergent results when using translation tasks. For this investigation, 128 informants provided linguistic data; each informant had to translate 25 Wenker sentences from Standard German into either East Swabian, Lechrain or West Central Bavarian dialect, as the case may be. The results show two tendencies. First, written stimuli lead to a slightly higher number of dialectal translation in segmental variables. Second, when oral stimuli are used, syntactic and lexical variables are translated significantly more often in such a manner that they diverge from the template. The results can be explained in terms of varying cognitive processing operations and the constraints of human working memory. When collecting data in the future, these tendencies should be taken into account.
Linguistic Landscapes (LL) sind in der internationalen Soziolinguistik und verwandten Disziplinen in aller Munde. Seit Mitte der 2000er Jahre sind Studien, die sich als Teil dieses Ansatzes verstehen, wie Pilze aus dem Boden geschossen. Seit 2008 hat es in fast jährlichem Rhythmus gut besuchte Tagungen gegeben, die sich ausschließlich mit Linguistic Landscapes beschäftigen - sowohl mit Fallstudien aus aller Welt als auch mit theoretischen und methodologischen Fragen. Folgerichtig sind nicht nur eine Vielzahl von Einzelaufsätzen erschienen, es hat auch mehrere Sammelveröffentlichungen gegeben, und seit 2015 erscheint ein eigenes Journal unter dem Titel „Linguistic Landscapes“ (vgl. Gorter 2013 für einen Überblick über die Entwicklung des Ansatzes).
Obwohl auch Wissenschaftler, die im deutschsprachigen Raum tätig sind, sich in den letzten Jahren den Linguistic Landscapes gewidmet haben, hat die Methode in deutschsprachigen Publikationen jedoch bisher nur einen vergleichsweise geringen Stellenwert eingenommen. Dieser Beitrag möchte somit zum einen Grundlagenarbeit leisten, indem er die Idee der Linguistic Landscapes noch einmal vorstellt und seine Entwicklung der vergangenen Jahre nachzeichnet. Zum anderen soll im Kontext dieses Bandes der Nutzen des Ansatzes für die Analyse von Sprachen von Migrantengruppen diskutiert werden. Schließlich wird der Beitrag durch einige Bemerkungen dazu abgerundet, in welchem Maße die Untersuchung von LL einen Nutzwert haben kann, der über wissenschaftliche Kreise hinausgeht. Grundlage für diesen Beitrag sind internationale Veröffentlichungen der letzten Jahre, vor allem aber gehen Erfahrungen aus eigenen Studien mit ein, die wir seit 2007 mit unterschiedlichen Zielsetzungen im Baltikum und in Deutschland durchgeführt haben.
Dieser Beitrag widmet sich der Analyse des Zusammenspiels sprachlich-hörbarer und sichtbar-kinesischer Praktiken, die beim alltäglichen Erzählen eingesetzt werden. Im Rahmen einer konversationsanalytisch basierten Untersuchung von Videoaufnahmen deutscher Alltagsgespräche wird die Bandbreite alltäglicher narrativer Praktiken in der face-to-face-Kommunikation aufgezeigt. Dies erfolgt exemplarisch anhand zweier Beispiele, in denen Einstieg, Ausgestaltung sowie Beendigung der Erzählung unter unterschiedlichen sequentiellen und multimodalen Bedingungen vollzogen werden. Die Untersuchung unterstreicht einerseits die Indexikalität alltäglicher narrativer Praktiken, andererseits die Notwendigkeit einer interaktionalen Narratologie, die diese Praktiken als Produkt sprachlicher, verkörperter und räumlicher Ressourcen sowie der Zusammenarbeit mehrerer Teilnehmer analysiert und konzeptualisiert.
Different Views on Markup
(2010)
In this chapter, two different ways of grouping information represented in document markup are examined: annotation levels, referring to conceptual levels of description, and annotation layers, referring to the technical realisation of markup using e.g. document grammars. In many current XML annotation projects, multiple levels are integrated into one layer, often leading to the problem of having to deal with overlapping hierarchies. As a solution, we propose a framework for XML-based multiple, independent XML annotation layers for one text, based on an abstract representation of XML documents with logical predicates. Two realisations of the abstract representation are presented, a Prolog fact base format together with an application architecture, and a specification for XML native databases. We conclude with a discussion of projects that have currently adopted this framework.
Digressions
(2015)
Der Beitrag von Bruno Strecker Digressions ist auf Französisch geschrieben (der Muttersprache von Jacqueline Kubczak) und handelt von unterschiedlichen Exkursen. Er macht die Verbindung zwischen Kommunikationssituation und Arten der Exkurse sichtbar und bietet eine darauf basierende Typologie der Exkurse an. In einem zweiten Schritt werden die formalen Möglichkeiten, einen Exkurs einzuleiten und zu formulieren, dargestellt (z. B. durch Appositionen, Parenthesen, festgelegte Ausdrucksformen wie A propos xxx, Ça me rappelle oder nicht eingebettete Phrasen). Schließlich zeigt er, wie man aus dem Exkurs wieder „in die Spur“ kommt.
This chapter addresses the requirements and linguistic foundations of automatic relational discourse analysis of complex text types such as scientific journal articles. It is argued that besides lexical and grammatical discourse markers, which have traditionally been employed in discourse parsing, cues derived from the logical and generical document structure and the thematic structure of a text must be taken into account. An approach to modelling such types of linguistic information in terms of XML-based multi-layer annotations and to a text-technological representation of additional knowledge sources is presented. By means of quantitative and qualitative corpus analyses, cues and constraints for automatic discourse analysis can be derived. Furthermore, the proposed representations are used as the input sources for discourse parsing. A short overview of the projected parsing architecture is given.
Discourse segmentation is the division of a text into minimal discourse segments, which form the leaves in the trees that are used to represent discourse structures. A definition of elementary discourse segments in German is provided by adapting widely used segmentation principles for English minimal units, while considering punctuation, morphology, sytax, and aspects of the logical document structure of a complex text type, namely scientific articles. The algorithm and implementation of a discourse segmenter based on these principles is presented, as well an evaluation of test runs.
Two very reliable influences on eye fixation durations in reading are word frequency, as measured by corpus counts, and word predictability, as measured by cloze norming. Several studies have reported strictly additive effects of these 2 variables. Predictability also reliably influences the amplitude of the N400 component in event-related potential studies. However, previous research suggests that while frequency affects the N400 in single-word tasks, it may have little or no effect on the N400 when a word is presented with a preceding sentence context. The present study assessed this apparent dissociation between the results from the 2 methods using a coregistration paradigm in which the frequency and predictability of a target word were manipulated while readers’ eye movements and electroencephalograms were simultaneously recorded. We replicated the pattern of significant, and additive, effects of the 2 manipulations on eye fixation durations. We also replicated the predictability effect on the N400, time-locked to the onset of the reader’s first fixation on the target word. However, there was no indication of a frequency effect in the electroencephalogram record. We suggest that this pattern has implications both for the interpretation of the N400 and for the interpretation of frequency and predictability effects in language comprehension.
Einigendes Band zerfasert?
(1991)
Dropping out of overlap is a frequent practice for overlap resolution (Schegloff, 2000, Jefferson, 2004) in interaction, as it re-establishes the “one-at-a-time” principle of the turn-taking system (Sacks et al., 1974). While it is appropriate to analyze the practice of dropping out of overlap as a verbal and thus audible phenomenon, a close look at video data reveals that withdrawing from an action trajectory is also an embodied practice. Based on a fine-grained multimodal analysis (C. Goodwin, 1981, Mondada, 2007a, Mondada, 2007b) of videotaped interactions in French, this paper illustrates how overlapped speakers organize the momentary suspension of their action trajectory in visible ways. Indeed, participants do not instantly withdraw from their action trajectory when they stop talking. By using bodily resources, they are able to display continuous monitoring of the availability of their co-participants and of the next possible slot for resuming their suspended action. I therefore suggest analyzing the drop out of overlap as the first step of withdrawal, as definitive, embodied withdrawal can occur later, or, in case of resumption, not at all. Consequently, my paper analyzes withdrawal as a good example of strengthening the analytic concept of embodiment with regard to turn-taking practices in interaction.
This paper deals with the constructional variation of emotion predicates in Estonian. It gives an overview on the constructional types, including information of their quantitative distribution. It is shown that one characteristic of Estonian is the formation of pairs of converses, i.e. pairs of emotion verbs, which have the same emotion semantics but different argument realisation patterns. These converses are based on derivational morphology such as the causative morphem –ta ‘CAUS’. Causative derivation has been adduced in the theoretical literature as support for the assumption that the cross-linguistically wide-spread constructional variation in emotion predicates has its origin in a difference of the causal structure in the verbal semantics. This paper shows that the data of Estonian contradicts this assumption.
Cette contribution discute différents enjeux dégagés lors d’une étude des pratiques professionnelles plurilingues : ces enjeux ont émergé d’une analyse menée collaborativement par deux équipes de chercheurs, à Lyon et à Paris, participant au projet européen DYLAN (6e programme cadre) et élaborant ensemble l’analyse empirique d’un extrait d’une réunion de travail, enregistrée dans le cadre d’une collaboration sur un même terrain. Cette analyse est l’occasion de thématiser de manière exemplaire un certain nombre de questions surgissant de l’étude des contacts des langues dans les contextes professionnels, concernant aussi bien les enjeux épistémologiques que l'engagement du chercheur sur le terrain.
We present a technique called event mapping that allows to project text representations into event lists, produce an event table, and derive quantitative conclusions to compare the text representations. The main application of the technique is the case where two classes of text representations have been collected in two different settings (e.g., as annotations in two different formal frameworks) and we can compare the two classes with respect to their systematic differences in the event table. We illustrate how the technique works by applying it to data collected in two experiments (one using annotations in Vladimir Propp’s framework, the other using natural language summaries).
Sexual harassment severely impacts the educational system in the West African country Benin and the progress of women in this society that is characterized by great gender inequality. Knowledge of the belief systems rooting in the sociocultural context is crucial to the understanding of sexual harassment. However, no study has yet investigated how sexual harassment is related to fundamental beliefs in Benin or West African countries. We conducted a field study on 265 female and male students from several high schools in Benin to investigate the link between sexual harassment and measures of ambivalent sexism, gender identity, and rape myth acceptance. Almost half of the sample reported having experienced sexual harassment personally or among peers. Levels of sexism and rape myth acceptance were very high compared to other studies. These attitudes appeared to converge in a sexist belief system that was linked to personal experiences, the perceived probability of experiencing and fear of sexual harassment. Results suggest that sexual harassment is a societal problem and that interventions need to address fundamental attitudes held in societies low in gender equality.
This article deals with three interrelated phenoma in the information structure of German sentences: the focusing of negative markers, of finite verb forms and of the particles ja, doch, wohl and schon. Focusing of the finite verb is the most important marker of verum focus, as described by Höhle (1988). Focusing of particles can be an alternative means for similar purposes, while focusing of negation seems to be the contradictory opposite of verum focus. It is shown that negation- independently of its information structural status - can be interpreted on three distinct levels of sentence meaning: as an indicator of the non-facticity of a state of affairs, the non-truth of a proposition, or the non-desirability of a speech act. Focusing of the negative marker puts contrastive emphasis on the negative value assigned to sentence meaning on one of these levels. Ve rum focus can be interpreted on the same three levels: as a marker of contrastive emphasis on a positive value of facticity, truth or desirability. The particles ja, doch, wohl and schon refer to sufficient epistemic or interactional conditions for the assignment of a positive or negative value. By focusing such a particle, the speaker indicates that (s)he believes the assigned value to be well justified and insists on establishing it as common ground for further interaction.
Psychological research has neglected people whose accent does not match their appearance. Most research on person perception has focused on appearance, overlooking accents that are equally important social cues. If accents were studied, it was often done in isolation (i.e., detached from appearance). We examine how varying accent and appearance information about people affects evaluations. We show that evaluations of expectancy-violating people shift in the direction of the added information. When a job candidate looked foreign, but later spoke with a native accent, his evaluations rose and he was evaluated best of all candidates (Experiment 1a). However, the sequence in which information was presented mattered: When heard first and then seen, his evaluations dropped (Experiment 1b). Findings demonstrate the importance of studying the combination and sequence of different types of information in impression formation. They also allow predicting reactions to ethnically mixed people, who are increasingly present in modern societies.
The multiple gradations of German strong verbs are but manifestations of a rather uncomplicated system. There is a small number of ways to make up ablaut forms; these types of formation are identifiable in formal terms and, what is more, they have definite functions as morphological markers. Using classifications of stem forms according to quality, complexity and quantity of vowels, three types of operations involved in ablaut formation are identified. Ablaut always includes a change of quality type or a change of complexity type, and in addition it may include a change of quantity type. Ablaut forms are clearly distinguished as against bases (and against each other): their vocalism meets a defined standard of dissimilarity. On this basis, gradations are collected into inflectional classes that are defined in strictly synchronic terms. These classes continue the historical seven classes known from reference grammars. For the majority of strong verbs, membership in these classes (and thus ablaut) is predictable.
The vowel quality in some diphthongs of Swabian (an upper german dialect) was determined by measurement of first and second formant values. A minimal contrast could be shown between two different diphthong qualities […], where for Standard German only one is assumed, viz. /ai/. The two diphthong qualities differ only slightly in onset and offset vowel quality, so a better understanding of their relationship was expected from an examination of their dynamic aspects. Our preliminary results suggest that there is indeed a difference in the temporal structure of the two diphthongs.
A polarity-sensitive item (PSI), as traditionally defined, is an expression that is restricted to either an affirmative or negative context. PSIs like ‘lift a finger’ and ‘all the time in the world’ sub-serve discourse routines like understatement and emphasis. Lexical–semantic classes are increasingly invoked in descriptions of the properties of PSIs. Here, we use English corpus data and the tools of Frame Semantics (Fillmore, 1982, 1985) to explore Israel’s (2011) observation that the semantic role of a PSI determines how the expression fits into a contextually constructed scalar model. We focus on a class of exceptions implied by Israel’s model: cases in which a given PSI displays two countervailing patterns of polarity sensitivity, with attendant differences in scalar entailments. We offer a set of case studies of polaritysensitive expressions – including verbs of attraction and aversion like ‘can live without’, monetary units like ‘a red cent’, comparative adjectives and time-span adverbials – that demonstrate that the interpretation of a given PSI in a given polar context is based on multiple factors. These factors include the speaker’s perspective on and affective stance towards the described event, available inferences about causality and, perhaps most critically, particulars of the predication, including the verb or adjective’s frame membership, the presence or absence of an ability modal like can, the grammatical construction used and the range of contingencies evoked by the utterance.
In her overview, Margret Selting makes the case for the claim that dealing with authentic conversation necessarily lies at the heart of an interactionallinguistic approach to prosody (see Selting this volume, Section 3.3). However, collecting and transcribing corpora of authentic interaction is a time-consuming enterprise. This fact often severely restricts what the individual researcher is able to do in terms of analysis within the scope of his or her resources. Still, for dealing with many of the desiderata Margret Selting points out in Section 5 of her extensive overview, the use of larger corpora seems to be required. In this commenting paper, I want to argue that future progress in research on prosody in interaction will essentially rest on the availability and use of large public corpora. After reviewing arguments for and against the use of public corpora, I will discuss some upshots regarding corpus design and issues of transcription of public corpora.
Generierung von Linkangeboten zur Rekonstruktion terminologiebedingter Wissensvoraussetzungen
(2002)
Dieser Beitrag skizziert Strategien zur (semi-)automatischen Annotation von definitorischen Textsegmenten und Termverwendungsinstanzen auf der Grundlage grammatisch annotierter Korpora. Ziel unserer Überlegungen ist es, bei der selektiven Rezeption von Fachtexten in einer Hypertextumgebung die je spezifischen Wissensvoraussetzungen, die der Verwendung von Fachtermini unterliegen und die für das Textverständnis eine entscheidende Rolle spielen, über automatisch generierte Linkangebote rekonstruierbar zu machen.
This paper investigates synchronic variation in the lexical and grammatical environments of the German lexical verb verdienen ‘earn’, ‘deserve’. In its lexical uses, verdienen co-occurs with an object noun phrase whose head is either concrete (e.g. Geld ‘money’) or, more commonly, abstract (e.g. Beachtung ‘attention’). When it is used more grammatically with deontic modal meaning, verdienen is followed by a passive or active infinitive. This paper uses collostructional analyses to contrast lexical and grammatical uses in terms of the most strongly attracted lexical items, which are grouped into semantic classes. The results reflect different degrees of host-class expansion (cf. Himmelmann 2004), whereby the collexemes of verdienen expand from concrete to abstract and their morpho-syntactic contexts from nominal to infinitival complement and subsequently from passive to active. Synchronic distribution can thus serve as a window on diachronic development (Kuteva 2001), in this case the rise of a deontic modality marker.
The project elexiko compiles an extensive, monolingual dictionary of Contemporary German. This contribution deals with the grammatical data in this dictionary; it is not only described how these are arranged content-wise depending on corpus data, but also how they were modelled.
Das Projekt elexiko erarbeitet ein umfangreiches, einsprachiges Wörterbuch des Gegenwartsdeutschen. In diesem Beitrag geht es um die grammatischen Angaben in diesem Wörterbuch; es wird nicht nur erläutert, wie diese inhaltlich in Abhängigkeit vom Prinzip der Korpusbasiertheit gestaltet sind, sondern auch, wie sie modelliert wurden.
Durch den Dezentralisierungsprozess in Großbritannien gibt es seit etwa einem Jahr neue Hoffnung für ein dauerhaftes Überleben der gälischen Sprache in Schottland. Mit der Einrichtung eines schottischen Parlaments, das seit Mai 1999 für innere Belange Schottlands verantwortlich ist, ist die gälischsprachige Bevölkerung viel näher an das Machtzentrum heran gerückt.
In this chapter, emotions are not regarded primarily as internal-psychological phenomena, but as socially proscribed and formed entities, which are constituted in accordance with social rules of emotionality and which are manifested, interpreted, and processed together communicatively in the interaction for definite purposes by the persons involved. In the elaboration of such an interactive conception of emotionality, the following aspects are treated: the value of emotionality in linguistic theories; emotions as a specific form of experiencing; the rules of emotionality; communication of emotions as transmission of evaluations; practices of manifestation, interpretation and processing of emotions in the communication process; fundamental interrelations between emotions and communication behavior; and methodology of the analysis of emotions and emotionality in specific conversation types. Finally, the developed theoretical apparatus in the analysis of two short conversation sections is elucidated.
This article advocates an understanding of ‘positioning’ as a key to the analysis of identities in interaction within the methodological framework of conversation analysis. Building on research by Bamberg, Georgakopoulou and others, a performative, interaction-based approach to positioning is outlined and compared to membership categorization analysis. An interactional episode involving mock stories to reveal and reproach an inadequate identity-claim of a co-participant is analysed both in terms of practices of membership categorization and positioning. It is concluded that membership categorization is a core element of positioning. Still, positioning goes beyond membership categorization in a) revealing biographical dimensions accomplished by narration and b) by uncovering implicit performative claims of identity, which are not established by categorization or description.
How to propose an action as an objective necessity. The case of Polish trzeba x (‘one needs to x’)
(2011)
The present study demonstrates that language-specific grammatical resources can afford speakers language-specific ways of organizing cooperative practical action. On the basis of video recordings of Polish families in their homes, we describe action affordances of the Polish impersonal modal declarative construction trzeba x (“one needs to x”) in the accomplishment of everyday domestic activities, such as cutting bread, bringing recalcitrant children back to the dinner table, or making phone calls. Trzeba-x turns in first position are regularly chosen by speakers to point to a possible action as an evident necessity for the furthering of some broader ongoing activity. Such turns in first position provide an environment in which recipients can enact shared responsibility by actively involving themselves in the relevant action. Treating the necessity as not restricted to any particular subject, aligning responsive actions are oriented to when the relevant action will be done, not whether it will be done. We show that such sequences are absent from English interactions by analyzing (a) grammatically similar turn formats in English interaction (“we need to x,” “the x needs to y”), and (b) similar interactive environments in English interactions. We discuss the potential of this research to point to a new avenue for researchers interested in the relationship between language diversity and diversity in human action and cognition.
Der Kurzbeitrag berichtet über ein Projekt ”Hypertextualisierung auf textgrammatischer Grundlage“ (HyTex), in dem erforscht wird, wie sich linear organisierte Dokumente mit semiautomatischen Methoden auf der Grundlage von textgrammatischem Markup und der linguistisch motivierten Modellierung terminologischen Wissens in delinearisierte Hyperdokumente überführen lassen. Ziel ist es, eine Sammlung von Fachtexten so in einen Hypertext zu überführen, dass terminologiebedingte Verständnisschwierigkeiten beim Lesen durch entsprechende Linkangebote aufgelöst werden, so dass die Fachtexte auch von Semi-Experten der Domäne selektiv gelesen werden können. Der Schwerpunkt des Beitrags liegt auf der Modellierung terminologischen Wissens mit XML Topic Maps und dessen Stellenwert für die automatische Erzeugung von Hyperlinks.
Dieses Kapitel befasst sich mit dem Zusammenspiel von Raum und Interaktion und konzentriert sich auf die dynamischen Organisationsformen sozialer Handlungen unter Berücksichtigung verbaler und sichtbarer Ressourcen. Durch die Untersuchung eines spezifischen Settings – professionelle Interaktionen in einem Radiostudio – werden wir empirisch beschreiben und konzeptualisieren, wie ein gebauter bzw. stark architekturierter Raum im Rahmen institutioneller Praktiken genutzt und relevant gesetzt wird. So soll zu aktuellen Überlegungen zu Interaktionsraum und -architektur, zu Raum als Ressource sowie als materiellem Umfeld beigetragen werden. Unsere ethnomethodologische und konversationsanalytische Perspektive wird von aktuellen Debatten über den sogenannten spatial turn in der interaktionalen Forschung beeinflusst (Kap. 1.1). Auf Grundlage eines in einem Radiostudio erstellten Videokorpus (Kap. 1.2) wird zunächst die Verbindung zwischen einem architektonisch und technologisch komplexen Umfeld und dem interaktionalen Handeln der Teilnehmer skizziert (Kap. 2.1, Kap. 2.2). Es folgt die detaillierte Analyse eines Einzelfalls (Kap. 3), in dem die Radiomoderatoren einen Text für den nächsten Sendeabschnitt vorbereiten. Hier werden die räumlichen Charakteristika sichtbar, die bei der Arbeit nach und nach relevant gesetzt werden (Kap. 4).
In this paper we present work in developing a computerized grammar for the Latin language. It demonstrates the principles and challenges in developing a grammar for a natural language in a modern grammar formalism. The grammar presented here provides a useful resource for natural language processing applications in different fields. It can be easily adopted for language learning and use in language technology for Cultural Heritage like translation applications or to support post-correction of document digitization.
Automatic summarization systems usually are trained and evaluated in a particular domain with fixed data sets. When such a system is to be applied to slightly different input, labor- and cost-intensive annotations have to be created to retrain the system. We deal with this problem by providing users with a GUI which allows them to correct automatically produced imperfect summaries. The corrected summary in turn is added to the pool of training data. The performance of the system is expected to improve as it adapts to the new domain.
Integrated Linguistic Annotation Models and Their Application in the Domain of Antecedent Detection
(2011)
Seamless integration of various, often heterogeneous linguistic resources in terms of their output formats and a combined analysis of the respective annotation layers are crucial tasks for linguistic research. After a decade of concentration on the development of formats to structure single annotations for specific linguistic issues, in the last years a variety of specifications to store multiple annotations over the same primary data has been developed. The paper focuses on the integration of the knowledge resource logical document structure information into a text document to enhance the task of automatic anaphora resolution both for the task of candidate detection and antecedent selection. The paper investigates data structures necessary for knowledge integration and retrieval.
In this contribution we present some work of the R&D European project “LIRICS” and of the ISO/TC 37/SC 4 committee related to the topic of interoperability and re-use of language resources. We introduce some basic mechanisms of the standardization work in ISO and describe in more details the general approach on how to cope with the annotation of language data within ISO.
Introduction
(2010)
Researchers in many disciplines, sometimes working in close cooperation, have been concerned with modeling textual data in order to account for texts as the prime information unit of written communication. The list of disciplines includes computer science and linguistics as well as more specialized disciplines like computational linguistics and text technology. What many of these efforts have in common is the aim to model textual data by means of abstract data types or data structures that support at least the semi-automatic processing of texts in any area of written communication.
In der akademischen Diskussion zum Global English hat sich seit den 1980er Jahren ein Modell etabliert, das die Staaten, in denen Englisch gesprochen wird, idealtypisch in drei Kreise einteilt: Den Inneren Kreis, in dem Englisch wichtigste Sprache der Gesellschaft sowie L1 eines Großteils der Bevölkerung ist, den Äußeren Kreis, wo Englisch L2 und eine wichtige Sprache unter mehreren ist, sowie den Erweiterten oder Expandierenden Kreis, in dem Englisch als Fremdsprache und als Lingua Franca dominiert (Kachru, 1985). Dieser Beitrag zeigt anhand einer Bestandsaufnahme gesellschaftlicher Funktionen des Deutschen weltweit, dass dieses Modell auch auf das Deutsche übertragen werden kann. Allerdings unterscheidet sich das Deutsche in einigen erheblichen Aspekten vom Englischen: Zum Inneren Kreis gehören die Länder des deutschsprachigen Kerngebietes, zum Äußeren Kreis Länder, in denen Deutsch anerkannte Minderheitensprache ist, und zum Erweiterten (oder im Falle des Deutschen eher Bröckelnden) Kreis Länder, in denen es einzelne deutsche Sprachinseln oder eine deutschsprachige Diaspora gibt, wobei letztere auch erst in jüngster Zeit entstanden sein kann. Schließlich diskutiert der Aufsatz die Position des Baltikums in diesem Modell.
Social perception studies have revealed that smiling individuals are perceived more favourably on many communion dimensions in comparison to nonsmiling individuals. Research on gender differences in smiling habits showed that women smile more than men. In our study, we investigated this phenomena further and hypothesised that women perceive smiling individuals as more honest than men. An experiment conducted in seven countries (China, Germany, Mexico, Norway, Poland, Republic of South Africa and USA) revealed that gender may influence the perception of honesty in smiling individuals. We compared ratings of honesty made by male and female participants who viewed photos of smiling and nonsmiling people. While men and women did not differ on ratings of honesty in nonsmiling individuals, women assessed smiling individuals as more honest than men did. We discuss these results from a social norms perspective.