"Standard usage" : towards a realistic conception of spoken standard German (2013)

Deppermann, Arnulf ; Kleiner, Stefan ; Knöbl, Ralf

"Standard language" is a contested concept, ideologically, empirically and theoretically. This is particularly true for a language such as German, where the standardization of the spoken language was based on the written standard and was established with respect to a communicative situation, i.e. public speech on stage (Bühnenaussprache), which most speakers never come across. As a consequence, the norms of the oral standard exhibit many features which are infrequent in the everyday speech even of educated speakers. This paper discusses ways to arrive at a more realistic conception of (spoken) standard German, which will be termed "standard usage". It must be founded on empirical observations of speakers' linguistic choices in everyday situations. Arguments in favor of a corpus-based notion of standard have to consider sociolinguistic, political, and didactic concerns. We report on the design of a large study of linguistic variation conducted at the Institute for the German Language (project "Variation in Spoken German", Variation des gesprochenen Deutsch) with the aim of arriving at a representative picture of "standard usage" in contemporary German. It systematically takes into account both diatopic variation covering the multi-national space in which German is an official language, and diastratic variation in terms of varying degrees of formality. Results of the study of phonetic and morphosyntactic variation are discussed. At least for German, a corpus-based notion of "standard usage" inevitably includes some degree of pluralism concerning areal variation, and it needs to do justice to register-based variation as well.

"Watch out, organize, inform yourself!": Tracing the Dynamics of Twitter Discourse on Anti-Nazi Street Protests (2013)

Dang-Anh, Mark ; Eble, Michael

With the advent of mobile devices, mediatized political discourse became more dynamic. I assume that the microblog Twitter can be considered as a medium for spatial coordination during protests. Therefore, the case of neo-Nazi demonstrations and counter-protests in the city of Dresden that occurred in February 2012 is analysed. Data consists of microposts that occurred during the event. Quantitative analysis of hashtag and retweet frequencies was performed as well as qualitative speech act pattern analysis and a tempo-spatial discourse analysis on selected subsets of microposts. Results show that a common linguistic practice is verbal georeferencing and by that constructing space. Empirical analysis indicates a strong relation between communicational online space and physical offline place: Protest participants permanently reconfigure spatial context discursively and thus the contested protest area becomes a temporarily meaningful place.

A Paradigm for Eliciting Story Variation (2013)

Fisseni, Bernhard ; Lawrence, Faith

The understanding of story variation, whether motivated by cultural currents or other factors, is important for applications of formal models of narrative such as story generation or story retrieval. We present the first stage of an experiment to elicit natural narrative variation data suitable for evaluation with respect to story similarity, to qualitative and quantitative analysis of story variation, and also for data processing. We also present a few preliminary results from the first stage of the experiment, using Red Riding Hood and Romeo and Juliet as base texts.

A tale of many stories: explaining policy diffusion between European higher education systems (2013)

Heinze, Torben

The thesis "A Tale of Many Stories - Explaining Policy Diffusion between European Higher Education Systems" systematically examines diffusion processes and their effects with regard to a rather neglected policy area – the case of European higher education policy. The thesis contributes to the slowly growing number of comparative and mechanism-based studies on policy diffusion and represents the first study on the diffusion of policies between European Higher Education Systems. The main aim is to contrast and compare testable and coherent explanatory models on the functioning of different diffusion mechanisms. Three sets of explanatory models on the relationship between variables triggering and conditioning diffusion mechanisms and their impact on policy adoption are drawn from mechanism-based thinking on policy diffusion: on learning, socialization, and externalities. These approaches conceptualize the policy process in terms of interdependencies between international and national actors. Explanatory models based on assumptions about domestic policies and the common responses of countries to similar policy problems extend this theoretical framework. The thesis is based on event history modelling of policy change and adoption in higher education systems of 16 West European countries between the years 1980 and 1998. Overall, 14 policy items describing performance-orientated reforms for public universities ranging from the adoption of external quality assurance systems to tuition fees are examined. Empirically, the main research question is what international, national and policy-specific factors cause and condition diffusion processes and the adoption of public policies. Evidence can be found for and against all of the four theoretical approaches tested. In comparison, many of the assumptions related to interdependencies lack robustness, whereas the common response model is the most stable one.
This does not mean that explanatory models based on interdependent decision-making are not suitable for analysing policy diffusion in higher education. Rather, interdependency is a multi-dimensional concept that requires a comparative assessment of diffusion mechanisms. Some of the explanatory factors based on interdependent decision-making are still supported by the empirical analysis, though. From this point of view, the recommendation for analysing diffusion is to start with a model based on domestic politics that is successively extended by explanatory factors dealing with interdependencies between international and national actors. Diffusion variables matter, but they are only one side of the tale on policy diffusion.

Advanced graph-based searches in an Internet dictionary portal (2013)

Meyer, Peter

The web portal Lehnwortportal Deutsch (lwp.ids-mannheim.de), developed at the Institute for the German Language (IDS), aims to provide unified access to existing and possibly new dictionaries of German loanwords in other languages. Internally, the lexicographical information is represented as a directed acyclic graph of relations between words. The graph abstracts from the idiosyncrasies of the individual component dictionaries. This paper explores two different strategies to make complex graph-based cross-dictionary queries in such a portal more accessible to users. The first strategy effectively hides the underlying graph structure, but allows users to assign scopes (internally defined in terms of the graph structure) to search criteria. A second type of search strategy directly formulates queries in terms of the relational graph structure. In this case, search results are not entries but n-tuples of words (metalemmata, loanwords, etyma); a query consists of specifying properties of these words and relations between them. A working prototype of an easy-to-use human-readable declarative query language is presented and ways to interactively construct queries are discussed.

Automatic recognition of speech, thought, and writing representation in German narrative texts (2013)

Brunner, Annelen

This article presents the main results of a project, which explored ways to recognize and classify a narrative feature—speech, thought, and writing representation (ST&WR)—automatically, using surface information and methods of computational linguistics. The task was to detect and distinguish four types—direct, free indirect, indirect, and reported ST&WR—in a corpus of manually annotated German narrative texts. Rule-based as well as machine-learning methods were tested and compared. The results were best for recognizing direct ST&WR (best F1 score: 0.87), followed by indirect (0.71), reported (0.58), and finally free indirect ST&WR (0.40). The rule-based approach worked best for ST&WR types with clear patterns, like indirect and marked direct ST&WR, and often gave the most accurate results. Machine learning was most successful for types without clear indicators, like free indirect ST&WR, and proved more stable. When looking at the percentage of ST&WR in a text, the results of machine-learning methods always correlated best with the results of manual annotation. Creating a union or intersection of the results of the two approaches did not lead to striking improvements. A stricter definition of ST&WR, which excluded borderline cases, made the task harder and led to worse results for both approaches.
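
The F1 scores and the union/intersection comparison mentioned above can be illustrated with a minimal sketch. It is not the project's evaluation code: the function name `f1_score`, the sentence ids, and the toy system outputs are all hypothetical, and scoring is assumed to be done over sets of item ids.

```python
def f1_score(predicted: set, gold: set) -> float:
    """Harmonic mean of precision and recall over sets of item ids."""
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data: ids of sentences annotated as direct ST&WR.
gold = {1, 2, 3, 4, 5}
rule_based = {1, 2, 3, 9}
machine_learning = {2, 3, 4, 5, 8, 9}

# Score each system alone, then the union and intersection of both outputs.
scores = {
    "rule_based": f1_score(rule_based, gold),
    "machine_learning": f1_score(machine_learning, gold),
    "union": f1_score(rule_based | machine_learning, gold),
    "intersection": f1_score(rule_based & machine_learning, gold),
}
```

In this toy setting the union raises recall and the intersection raises precision; whether either helps overall depends on how the two systems' errors overlap.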

Benefactive construction (2013)

Proost, Kristel

Bootstrapping polarity classifiers with rule-based classification (2013)

Wiegand, Michael ; Klenner, Manfred ; Klakow, Dietrich

In this article, we examine the effectiveness of bootstrapping supervised machine-learning polarity classifiers with the help of a domain-independent rule-based classifier that relies on a lexical resource, i.e., a polarity lexicon and a set of linguistic rules. The benefit of this method is that though no labeled training data are required, it allows a classifier to capture in-domain knowledge by training a supervised classifier with in-domain features, such as bag of words, on instances labeled by a rule-based classifier. Thus, this approach can be considered as a simple and effective method for domain adaptation. Among the list of components of this approach, we investigate how important the quality of the rule-based classifier is and what features are useful for the supervised classifier. In particular, the former addresses the question of how far linguistic modeling is relevant for this task. We not only examine how this method performs under more difficult settings in which classes are not balanced and mixed reviews are included in the data set but also compare how this linguistically-driven method relates to state-of-the-art statistical domain adaptation.

Constructional ‘scene encoding’ and acquisition: Mothers’ use of argument structure constructions in English child-directed speech (2013)

Zeschel, Arne

Construction-based language models assume that grammar is meaningful and learnable from experience. Focusing on five of the most elementary argument structure constructions of English, a large-scale corpus study of child-directed speech (CDS) investigates exactly which meanings/functions are associated with these patterns in CDS, and whether they are indeed specially indicated to children by their caretakers (as suggested by previous research, cf. Goldberg, Casenhiser and Sethuraman 2004). Collostructional analysis (Stefanowitsch and Gries 2003) is employed to uncover significantly attracted verb-construction combinations, and attracted pairs are classified semantically in order to systematise the attested usage patterns of the target constructions. The results indicate that the structure of the input may aid learners in making the right generalisations about constructional usage patterns, but such scaffolding is not strictly necessary for construction learning: not all argument structure constructions are coherently semanticised to the same extent (in the sense that they designate a single schematic event type of the kind envisioned in Goldberg’s [1995] ‘scene encoding hypothesis’), and they also differ in the extent to which individual semantic subtypes predominate in learners’ input.

Contexts of dictionary use (2013)

Müller-Spitzer, Carolin

To design effective electronic dictionaries, reliable empirical information on how dictionaries are actually being used is of great value for lexicographers. To my knowledge, no existing empirical research addresses the context of dictionary use, or the extra-lexicographic situations in which a dictionary consultation is embedded. This is mainly due to the fact that data about these contexts is difficult to obtain. To take a first step in closing this research gap, I incorporated an open-ended question (“In which contexts or situations would you use a dictionary?”) into the online survey (N = 684) and asked the participants to answer this question by providing as much information as possible. Instead of presenting well-known facts about standardized types of usage situation, this paper will focus on the more offbeat circumstances of dictionary use and aims of users, as they are reflected in the responses. Overall, the results indicate that there is a community whose work is closely linked with dictionaries and, accordingly, they deal very routinely with this type of text. Dictionaries are also seen as a linguistic treasure trove for games or crossword puzzles, and as a standard which can be referred to as an authority. While it is important to emphasize that the results are only preliminary, they do indicate the potential of empirical research in this area.

Data Formats for Phonological Corpora (2013)

Romary, Laurent ; Witt, Andreas

The goal of the present chapter is to explore the possibility of providing the research (but also the industrial) community that commonly uses spoken corpora with a stable portfolio of well-documented standardized formats that allow a high reuse rate of annotated spoken resources and, as a consequence, better interoperability across tools used to produce or exploit such resources.

Decision Tree-Based Evaluation of Genitive Classification – An Empirical Study on CMC and Text Corpora. Language Processing and Knowledge in the Web (2013)

Hansen, Sandra ; Schneider, Roman

Contemporary studies on the characteristics of natural language benefit enormously from the increasing amount of linguistic corpora. Aside from text and speech corpora, corpora of computer-mediated communication (CMC) position themselves between orality and literacy, and beyond that provide insight into the impact of "new", mainly internet-based media on language behaviour. In this paper, we present an empirical attempt to work with annotated CMC corpora for the explanation of linguistic phenomena. In concrete terms, we implement machine learning algorithms to produce decision trees that reveal rules and tendencies about the use of genitive markers in German.

Decision tree-based evaluation of genitive classification. An empirical study on CMC and text corpora (2013)

Hansen, Sandra ; Schneider, Roman

Contemporary studies on the characteristics of natural language benefit enormously from the increasing amount of linguistic corpora. Aside from text and speech corpora, corpora of computer-mediated communication (CMC) position themselves between orality and literacy, and beyond that provide insight into the impact of “new”, mainly internet-based media on language behaviour. In this paper, we present an empirical attempt to work with annotated CMC corpora for the explanation of linguistic phenomena. In concrete terms, we implement machine learning algorithms to produce decision trees that reveal rules and tendencies about the use of genitive markers in German.

Defining a gold standard for polar intensity ordering (2013)

Brandes, Jasper ; Ruppenhofer, Josef

In this paper, we report on an effort to develop a gold standard for the intensity ordering of subjective adjectives. Rather than pursue a complete order as produced by paying attention to the mean scores of human ratings only, we take into account to what extent assessors consistently rate pairs of adjectives relative to each other. We show that different available automatic methods for producing polar intensity scores produce results that correlate well with our gold standard, and discuss some conceptual questions surrounding the notion of polar intensity.

Designing a bilingual speech corpus for French and German language learners (2013)

Trouvain, Jürgen ; Laprie, Yves ; Möbius, Bernd ; Andreeva, Bistra ; Bonneau, Anne ; Colotte, Vincent ; Fauth, Camille ; Fohr, Dominique ; Jouvet, Denis ; Mella, Odile ; Jügler, Jeanin ; Zimmerer, Frank

Dictionary portals (2013)

Engelberg, Stefan ; Müller-Spitzer, Carolin

Discussing best practices for the annotation of Twitter microtext (2013)

Rehbein, Ines ; Visser, Emiel ; Lestmann, Nadine

This paper contributes to the discussion on best practices for the syntactic analysis of non-canonical language, focusing on Twitter microtext. We present an annotation experiment where we test an existing POS tagset, the Stuttgart-Tübingen Tagset (STTS), with respect to its applicability for annotating new text from the social media, in particular from Twitter microblogs. We discuss different tagset extensions proposed in the literature and test our extended tagset on a set of 506 tweets (7,418 tokens) where we achieve an inter-annotator agreement for two human annotators in the range of 92.7 to 94.4 (κ). Our error analysis shows that especially the annotation of Twitter-specific phenomena such as hashtags and at-mentions causes disagreements between the human annotators. Following up on this, we provide a discussion of the different uses of the @- and #-marker in Twitter and argue against analysing both on the POS level by means of an at-mention or hashtag label. Instead, we sketch a syntactic analysis which describes these phenomena by means of syntactic categories and grammatical functions.
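
Agreement between two annotators on POS tags is commonly measured with a chance-corrected coefficient. The sketch below assumes the reported figure is Cohen's kappa, which the abstract does not state explicitly; the function name `cohens_kappa` and the toy tag sequences are hypothetical and are not taken from the annotated tweets.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same tokens."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both annotators must label the same non-empty token list")
    n = len(labels_a)
    # Observed agreement: share of tokens that received identical tags.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement by chance, from each annotator's tag distribution.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[tag] * counts_b[tag] for tag in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one and the same tag throughout
    return (observed - expected) / (1 - expected)


# Hypothetical toy annotation of six tokens with STTS-style tags.
annotator_1 = ["NN", "NN", "VVFIN", "ART", "NN", "VVFIN"]
annotator_2 = ["NN", "NN", "VVFIN", "ART", "VVFIN", "VVFIN"]
kappa = cohens_kappa(annotator_1, annotator_2)
```

Here the annotators agree on five of six tokens (raw agreement about 0.83), while kappa is lower because part of that agreement would be expected by chance.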

Editorial. Positioning in narrative interaction (2013)

Deppermann, Arnulf

Embodied withdrawal after overlap resolution (2013)

Oloff, Florence

Dropping out of overlap is a frequent practice for overlap resolution (Schegloff, 2000, Jefferson, 2004) in interaction, as it re-establishes the “one-at-a-time” principle of the turn-taking system (Sacks et al., 1974). While it is appropriate to analyze the practice of dropping out of overlap as a verbal and thus audible phenomenon, a close look at video data reveals that withdrawing from an action trajectory is also an embodied practice. Based on a fine-grained multimodal analysis (C. Goodwin, 1981, Mondada, 2007a, Mondada, 2007b) of videotaped interactions in French, this paper illustrates how overlapped speakers organize the momentary suspension of their action trajectory in visible ways. Indeed, participants do not instantly withdraw from their action trajectory when they stop talking. By using bodily resources, they are able to display continuous monitoring of the availability of their co-participants and of the next possible slot for resuming their suspended action. I therefore suggest analyzing the drop out of overlap as the first step of withdrawal, as definitive, embodied withdrawal can occur later, or, in case of resumption, not at all. Consequently, my paper analyzes withdrawal as a good example of strengthening the analytic concept of embodiment with regard to turn-taking practices in interaction.

Enriching FrameNet with Scalar Information (2013)

Ruppenhofer, Josef ; Brandes, Jasper

Exploring sexual harassment and related attitudes in Beninese high schools: a field study (2013)

Waubert de Puiseau, Berenike ; Roessel, Janin

Sexual harassment severely impacts the educational system in the West African country Benin and the progress of women in this society that is characterized by great gender inequality. Knowledge of the belief systems rooted in the sociocultural context is crucial to the understanding of sexual harassment. However, no study has yet investigated how sexual harassment is related to fundamental beliefs in Benin or West African countries. We conducted a field study on 265 female and male students from several high schools in Benin to investigate the link between sexual harassment and measures of ambivalent sexism, gender identity, and rape myth acceptance. Almost half of the sample reported having experienced sexual harassment personally or among peers. Levels of sexism and rape myth acceptance were very high compared to other studies. These attitudes appeared to converge in a sexist belief system that was linked to personal experiences, the perceived probability of experiencing and fear of sexual harassment. Results suggest that sexual harassment is a societal problem and that interventions need to address fundamental attitudes held in societies low in gender equality.

Extending the possibilities for collaborative work with TEI/XML through the usage of a wiki system (2013)

Entrup, Bastian ; Binder, Frank ; Lobin, Henning

This paper presents and discusses an integrated project-specific working environment for editing TEI/XML-files and linking entities of interest to a dedicated wiki system. This working environment has been specifically tailored to the workflow in our interdisciplinary digital humanities project GeoBib. It addresses some challenges that arose while working with person-related data and geographical references in a growing collection of TEI/XML-files. While our current solution provides some essential benefits, we also discuss several critical issues and challenges that remain.

Freezing in it-clefts (2013)

Hartmann, Jutta M.

Freezing is the cover term for the restriction on extraction from constituents in a derived position. The traditional Freezing cases are illustrated here with topicalization in (1a), heavy-NP shift in (1b), and extraposition in (1c).

German in Samoa: Historical traces of a colonial variety (2013)

Stolberg, Doris

During the brief era of German colonialism in the Pacific (1884-1914), German was in contact with a large number of languages, autochthonous as well as colonial ones. This setting led to language contact in which German influenced and was influenced by various languages. In 1900, Western Samoa came under German colonial rule. The German language held a certain prestige there which is mirrored by the numbers of voluntary Samoan learners of German. On the other hand, the preferred use of English, rather than German, by native speakers of German was frequently noted. This paper examines linguistic and metalinguistic data that suggest the historical existence of (the precursor of) a colonial variety of German as spoken in Samoa. This variety seems to have been marked mainly by lexical borrowing from English and Samoan and was, because of these borrowings, not fully comprehensible to Germans who had never encountered the variety or the colonial setting in Samoa. It is discussed whether this variety can be considered a separate variety of German on linguistic grounds.

How to get a grip on identities-in-interaction. (What) Does "Positioning" offer more than "Membership Categorization"? Evidence from a mock story (2013)

Deppermann, Arnulf

This article advocates an understanding of ‘positioning’ as a key to the analysis of identities in interaction within the methodological framework of conversation analysis. Building on research by Bamberg, Georgakopoulou and others, a performative, interaction-based approach to positioning is outlined and compared to membership categorization analysis. An interactional episode involving mock stories to reveal and reproach an inadequate identity-claim of a co-participant is analysed both in terms of practices of membership categorization and positioning. It is concluded that membership categorization is a core element of positioning. Still, positioning goes beyond membership categorization by a) revealing biographical dimensions accomplished by narration and b) uncovering implicit performative claims of identity, which are not established by categorization or description.

Igel: Comparing document grammars using XQuery (2013)

Sperberg-McQueen, Christopher M. ; Schonefeld, Oliver ; Kupietz, Marc ; Lüngen, Harald ; Witt, Andreas

Igel is a small XQuery-based web application for examining a collection of document grammars; in particular, for comparing related document grammars to get a better overview of their differences and similarities. In its initial form, Igel reads only DTDs and provides only simple lists of constructs in them (elements, attributes, notations, parameter entities). Our continuing work is aimed at making Igel provide more sophisticated and useful information about document grammars and building the application into a useful tool for the analysis (and the maintenance!) of families of related document grammars.

IGGSA-STEPS: Shared Task on Source and Target Extraction from Political Speeches (2013)

Ruppenhofer, Josef ; Struß, Julia Maria ; Sonntag, Jonathan ; Grindl, Stefan

In this paper, we report on the definition of a shared task considering source (whose opinion?) and target (about what?) extraction in protocols of the Swiss parliament that will be conducted by the Interest Group on German Sentiment Analysis (IGGSA).

Introduction (2013)

Stickel, Gerhard

Investigating the role of information structure triggers (2013)

Hartmann, Jutta M.

Joint Digital Storytelling on Twitter: Creative Appropriation in Political Deliberation (2013)

Thimm, Caja ; Einspänner, Jessica ; Dang-Anh, Mark

This paper explores, on the basis of empirical research, how patterns of interaction and argumentation in political discourse on Twitter evolve as translocal communities in the creative shape of “joint digital storytelling”. Joint storytelling embraces coordinated activities by multiple actors focusing on a shared topic. By adding personal information and evaluation, participants construct an open narrative format, which can be inviting and inspiring for others, who then join in with their own narratives. This model will be exemplified by analyzing a large number of tweets (107,000) collected during a political conflict between proponents and adversaries of a local traffic project in Germany. Analysis is based on (1) the textual level, (2) the operative level (hashtags, @- and RT-Symbol, hyperlinks etc.) and (3) the visual level of storytelling (embedded photos, videos). Results show a new way of creating translocal online communities and political deliberation.

KoGra-DB: Using MapReduce for language corpora (2013)

Schneider, Roman

Linguistic query systems are special purpose IR applications. We present a novel state-of-the-art approach for the efficient exploitation of very large linguistic corpora, combining the advantages of relational database management systems (RDBMS) with the functional MapReduce programming model. Our implementation uses the German DEREKO reference corpus with multi-layer linguistic annotations and several types of text-specific metadata, but the proposed strategy is language-independent and adaptable to large-scale multilingual corpora.
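
To make the MapReduce programming model concrete, here is a minimal single-process sketch that counts (token, POS) frequencies over a toy annotated corpus. It is not the KoGra-DB implementation and involves no database; the function names `map_phase`, `shuffle_phase`, `reduce_phase` and the two example sentences are hypothetical.

```python
from collections import defaultdict
from functools import reduce


def map_phase(document):
    """Emit a ((token, pos), 1) pair for every annotated token in a document."""
    return [((token.lower(), pos), 1) for token, pos in document]


def shuffle_phase(pairs):
    """Group all emitted values by key, as the framework would between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Sum the grouped values to obtain one frequency per key."""
    return {key: reduce(lambda x, y: x + y, values) for key, values in grouped.items()}


# Hypothetical toy corpus: each document is a list of (token, POS tag) pairs.
corpus = [
    [("Der", "ART"), ("Hund", "NN"), ("bellt", "VVFIN")],
    [("der", "ART"), ("Hund", "NN"), ("schläft", "VVFIN")],
]

mapped = [pair for document in corpus for pair in map_phase(document)]
frequencies = reduce_phase(shuffle_phase(mapped))
```

Because each document is mapped independently and each key is reduced independently, both phases can in principle be distributed across machines, which is what makes the model attractive for very large corpora.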

Kommunikationsverben in OWID : an online reference work of German communication verbs with advanced access structures (2013)

Müller-Spitzer, Carolin ; Proost, Kristel

Kommunikationsverben, an online reference work on German communication verbs and part of the dictionary portal OWID, describes the meaning of communication verbs on two levels: a lexical level, represented in the dictionary entries and by sets of lexical features, and a conceptual level, represented by different types of situations referred to by specific types of verbs. These two levels have each been implemented in special types of access structures. A first explorative access to the conceptual level provides the user with a list of the main classes of communication verbs, the subclasses of each of these, and the lexical fields pertaining to each subclass. Lexical fields are presented together with a characterisation of the situation type to which the verbs of that field are used to refer. Information about the conceptual level is additionally accessible by an advanced search option allowing the user to combine components of the characterisation of situation types to “create” any kind of situation and search for the verbs that correspond to it. Information about the lexical level of the meaning of communication verbs is accessible via the dictionary entries and by another advanced search option allowing the user to search for verbs with particular lexical features or combinations of these.

KorAP: the new corpus analysis platform at IDS Mannheim (2013)

Bański, Piotr ; Bingel, Joachim ; Diewald, Nils ; Frick, Elena ; Hanl, Michael ; Kupietz, Marc ; Pȩzik, Piotr ; Schnober, Carsten ; Witt, Andreas

The KorAP project (“Korpusanalyseplattform der nächste Generation”, “Corpus-analysis platform of the next generation”), carried out at the Institut fUr Deutsche Sprache (IDS) in Mannheim, Germany, has as its goal the development of a modem, state-of-the-art corpus-analysis platform, capable of handling very large corpora and opening the perspectives for innovative linguistic research. The platform will facilitate new linguistic findings by making it possible to manage and analyse extremely large amounts of primary data and annotations, while at the same time allowing an undistorted view of the primary un-annotated text, and thus fully satisfying expectations associated with a scientific tool. The project started in July 2011 and is funded till June 2014. The demo presentation in December will be the first version following a preliminary feature freeze, and will open the alpha testing phase of the project.

Lexical, corpus-methodological and lexicographic approaches to paronyms (2013)

Storjohann, Petra

Linking rule (2013)

Proost, Kristel

Multilingual practices in professional settings. Keeping the delicate balance between progressivity and intersubjectivity (2013)

Markaki, Vassiliki ; Merlino, Sara ; Mondada, Lorenza ; Oloff, Florence ; Traverso, Véronique

Drawing on naturalistic video and audio recordings of international meetings, and within the framework of conversation analysis, ethnomethodology and interactional linguistics, this chapter studies how multilingual resources are mobilized in social interactions among professionals, how available linguistic and embodied resources are identified and used by the participants, which solutions are locally elaborated by them when they are confronted with various languages spoken but not shared among them, and which definition of multilingualism they adopt for all practical purposes. Focusing on the multilingual solutions emically elaborated in international professional meetings, we show that the participants orient to a double principle: on the one hand, they orient to the progressivity of the interaction, adopting all the possible resources that enable them to go on within the current activity; on the other hand, they orient to the intersubjectivity of the interaction, treating, preventing and repairing possible troubles and problems of understanding. Specific multilingual solutions can be adopted to keep this difficult balance between progressivity and intersubjectivity; they vary according to the settings, the competences at hand, the linguistic and embodied resources locally defined by the participants as publicly available, the multilingual resources treated as totally or partially shared, as transparent or opaque, and as needing repair or not. The paper begins by sketching the analytical framework, including the methodology and the data collected; it then presents some general findings, before offering an analysis of various ways in which participants keep the balance between progressivity and intersubjectivity in different multilingual interactional contexts.

Multimodal interaction from a conversation analytic perspective (2013)

Deppermann, Arnulf

This special issue of the Journal of Pragmatics has its origins in the International Conference on Conversation Analysis 10 (ICCA10), which took place in Mannheim (Germany) in July 2010. More than 650 scholars attended the conference, whose theme was “multimodal interaction”. This volume includes papers based on the four plenary talks given at ICCA10 and four additional contributions related to the conference theme.

New methods in historical corpora (2013)

Investigating the history of a language depends on fragmentary sources, but electronic corpora offer the possibility of alleviating the problem of ‘bad data’. However, they cannot overcome it totally, and crucial questions thus arise of the optimal architecture for such a corpus, the problem of how representative even a large corpus can be of actual language use at a particular time, and how a historical corpus can best be annotated and provided with tools to maximize its usefulness as a resource for future researchers. Immense strides have been made in recent years in addressing these questions, with exciting new methods and technological advances. The papers in this volume, which were presented at a conference on New Methods in Historical Corpora (Manchester 2011), exemplify the range of these developments in investigating the diachrony of languages as distinct as English, German, Latin, Spanish, French and Slovene and developing appropriate tools for the analysis of historical corpora in these languages.

Noun phrase construction (2013)

Proost, Kristel

On Latin nominal inflection: the form-function relationship (2013)

Wiese, Bernd

The present paper provides a new approach to the form-function relation in Latin declension. First, inflections are discussed from a functional point of view with special consideration to questions of syncretism. A case hierarchy is justified for Latin that conforms to general observations on case systems. The analysis leads to a markedness scale that provides a ranking of case-number-combinations from unmarked to most marked. Systematic syncretism always applies to contiguous sections of the case-number-scale (‘syncretism fields’). Second, inflections are analysed from a formal point of view taking into account partial identities and differences among noun endings. Theme vowels being factored out, endings are classified on the basis of their make-up, e.g., as sigmatic endings; as containing desinential (non-thematic) vowels; as containing long vowels; and so on. The analysis leads to a view of endings as involving more basic elements or ‘markers’. Endings of the various declensions instantiate a small number of types, and these can be put into a ranked order (a formal scale) that applies transparadigmatically. Third, the relationship between the independently substantiated functional and formal hierarchies is examined. In any declension, the form-function-relationship is established by aligning the relevant formal and functional scales (or ‘sequences’). Some types of endings are in one-to-one correspondence with bundles of morphosyntactic properties as they should be according to a classical morphemic approach, but others are not. Nevertheless, endings can be assigned a uniform role if the form-function-relationship is understood to be based on an alignment of formal and functional sequences. A diagrammatical form-function relationship is revealed that could not be captured in classical or refined morphemic approaches.

On the similarity of tones of the organ stop vox humana to human vowels (2013)

Brackhane, Fabian ; Trouvain, Jürgen

In mechanical speech synthesis from the 18th up to the 20th century, reed pipes were mainly used for the generation of the voice and the organ stop vox humana was central in this process. This has been described in different historical documents which report that the vox humana in some organs sounded like human vowels. In this study, tones of four different voces humanae were recorded to investigate their similarity to human vowels. The acoustical and perceptual analysis revealed that some, though not all, tones show a high similarity to selected vowels.

Predicate Acquisition for Opinion Holder Extraction. A Data-Intensive Approach (2013)

Wiegand, Michael

Opinion holder extraction is one of the most important tasks in sentiment analysis. We will briefly outline the importance of predicates for this task and categorize them according to part of speech and according to which semantic role they select for the opinion holder. For many languages there do not exist semantic resources from which such predicates can be easily extracted. Therefore, we present alternative corpus-based methods to gain such predicates automatically, including the usage of prototypical opinion holders, i.e. common nouns, denoting for example experts or analysts, which describe particular groups of people whose profession or occupation is to form and express opinions towards specific items.

Predicative Adjectives: An Unsupervised Criterion to Extract Subjective Adjectives (2013)

Wiegand, Michael ; Ruppenhofer, Josef ; Klakow, Dietrich

We examine predicative adjectives as an unsupervised criterion to extract subjective adjectives. We do not only compare this criterion with a weakly supervised extraction method but also with gradable adjectives, i.e. another highly subjective subset of adjectives that can be extracted in an unsupervised fashion. In order to prove the robustness of this extraction method, we will evaluate the extraction with the help of two different state-of-the-art sentiment lexicons (as a gold standard).

Pseudoclefts in Hungarian (2013)

Hartmann, Jutta M. ; Hegedűs, Veronika ; Surányi, Balázs

Based on novel data from Hungarian, this paper makes the case that in at least some languages specificational pseudocleft sentences must receive a ‘what-you-see-is-what-you-get’ syntactic analysis. More specifically, it is argued that the clefted constituent is the subject of predication (underlyingly base-generated in Spec, Pr), whereas the cleft clause acts as a predicate in the structure. Alongside connectivity effects characteristic of specificational pseudoclefts, we also discuss a range of anti-connectivity effects, which we show to receive a straightforward explanation under the proposed analysis. It follows that attested connectivity effects, in turn, require a semantic, rather than a syntactic account, along the lines of Jacobson (1994) and Sharvit (1999).

Reanimating responsibility. The weź-V2 (take-V2) double imperative in Polish (2013)

Zinken, Jörg

This study analyses the use of the Polish weź-V2 (take-V2) double imperative to request here-and-now actions. The analysis is based on a collection of approximately 40 take-V2 double imperatives, which was built from a corpus of 10 hours of video recordings of everyday interactions (preparing and having meals, playing with children, etc.) taking place in the homes of Polish families. A sequential analysis of these data shows that the take-V2 construction is commonly selected in situations where the request recipient could be expected to already be attending to the relevant business (e.g., because they committed to this earlier in the interaction), but isn’t. By selecting the take-V2 format, the request speaker reanimates the recipient’s responsibility for the matter at hand.

Reformulating place (2013)

Kitzinger, Celia ; Lerner, Gene H. ; Zinken, Jörg ; Wilkinson, Sue ; Kevoe-Feldmann, Heidi ; Ellis, Sonja

This report examines what can be accomplished in conversation by reformulating a reference to a place using the practices of repair. It is based on an analysis of a collection of place references situated in second pair parts of adjacency pairs taken from a wide range of field recordings of talk-in-interaction. Not surprisingly, place references are sometimes reformulated so as to indicate a misspeaking or in pursuit of recipient recognition. At other times, however, we show that place references can be reformulated to more adequately implement the action of a turn in prosecuting the course of action of which it is a part. In these cases repairing a place reference can target a source of trouble associated with implementing the action of a turn at talk, and thus reformulating place can serve as a practical resource for accomplishing a range of interactional tasks. We conclude with a more complex case in which two reformulations are deployed in responding to a so-called ‘double-barrelled’ initiating action.

Repairs for Reasoning (2013)

Schmitz, Hans-Christian ; Fisseni, Bernhard

We describe and experimentally investigate phenomena of modal enrichment, that is, phenomena in which a recipient non-literally interprets an utterance by creating and applying a modal operator. We give competing explanations for these phenomena - namely an explanation according to which modal enrichment is a repair procedure for making the utterance match a script of information processing vs. an explanation according to which modal enrichment is triggered by rhetorical structure.

Repairs. The added value of being wrong (2013)

Brandt, Patrick ; Fuß, Eric

Grammatical structures link systems of thought with systems of speaking and pointing, whose respective conditions hardly seem to fit one another. The repair approach regards the regular handling of translation problems within the grammatical system and at its interfaces as constitutive of the expressivity and economy of language. Repairs are productive mechanisms of reparation and adaptation that explain linguistic phenomena as a reflex of compensation for derivational or interpretive damage.

Representing human and machine dictionaries in markup languages (SGML, XML) (2013)

Lemnitzer, Lothar ; Romary, Laurent ; Witt, Andreas

Responsibility and action. Invariants and diversity in object requests in Polish and British English interaction (2013)

Zinken, Jörg ; Ogiermann, Eva

The authors compare the use of two formats for requesting an object in informal everyday interaction: imperatives, common in our Polish data, and second-person polar questions, common in our English data. Imperatives and polar questions are selected in the same interactional “home environments” across the languages, in which they enact two social actions: drawing on shared responsibility and enlisting assistance, respectively. Speakers across the languages differ in their choice of request format in “mixed” interactional environments that support either. The findings shed light on the orderly ways in which cultural diversity is grounded in invariants of action formation.

Robust corpus architecture: a new look at virtual collections and data access (2013)

Bański, Piotr ; Frick, Elena ; Hanl, Michael ; Kupietz, Marc ; Schnober, Carsten ; Witt, Andreas

Semantic verb class (2013)

Proost, Kristel

Story Comparisons: Evidence from Film Reviews (2013)

Fisseni, Bernhard ; Kurji, Aadil ; Sarikaya, Deniz ; Viehstädt, Mira

Interested in formally modelling similarity between narratives, we investigate judgements of similarity between narratives in a small corpus of film reviews and book–film comparisons. A main finding is that judgements tend to concern multiple levels of story representation at once. As these texts are pragmatically related to reception contexts, we find many references to reception quality and optimality. We conclude that current formal models of narrative cannot capture the task of naturalistic narrative comparisons given in the analysed reviews, but that the development of models containing a more reception-oriented point of view will be necessary.

STTS goes Kiez – Experiments on Annotating and Tagging Urban Youth Language (2013)

Rehbein, Ines ; Schalowski, Sören

Subjective impressions do not mirror online reading effort: concurrent EEG-Eyetracking evidence from the reading of books and digital media (2013)

Kretzschmar, Franziska ; Pleimling, Dominique ; Hosemann, Jana ; Füssel, Stephan ; Bornkessel-Schlesewsky, Ina ; Schlesewsky, Matthias

In the rapidly changing circumstances of our increasingly digital world, reading is also becoming an increasingly digital experience: electronic books (e-books) are now outselling print books in the United States and the United Kingdom. Nevertheless, many readers still view e-books as less readable than print books. The present study thus used combined EEG and eyetracking measures in order to test whether reading from digital media requires higher cognitive effort than reading conventional books. Young and elderly adults read short texts on three different reading devices: a paper page, an e-reader and a tablet computer and answered comprehension questions about them while their eye movements and EEG were recorded. The results of a debriefing questionnaire replicated previous findings in that participants overwhelmingly chose the paper page over the two electronic devices as their preferred reading medium. Online measures, by contrast, showed shorter mean fixation durations and lower EEG theta band voltage density – known to covary with memory encoding and retrieval – for the older adults when reading from a tablet computer in comparison to the other two devices. Young adults showed comparable fixation durations and theta activity for all three devices. Comprehension accuracy did not differ across the three media for either group. We argue that these results can be explained in terms of the better text discriminability (higher contrast) produced by the backlit display of the tablet computer. Contrast sensitivity decreases with age and degraded contrast conditions lead to longer reading times, thus supporting the conclusion that older readers may benefit particularly from the enhanced contrast of the tablet. Our findings thus indicate that people’s subjective evaluation of digital reading media must be dissociated from the cognitive and neural effort expended in online information processing while reading from such devices.

Textual structures in electronic dictionaries compared with printed dictionaries : a short general survey (2013)

Müller-Spitzer, Carolin

Textual structures in printed dictionaries are well known, adequately researched, and rather exhaustively described (cf. articles 3&10). This article investigates whether or not the models of textual structures in printed dictionaries can be applied to electronic dictionaries (EDs); or, more precisely, which parts of the order and terminology of textual structures in printed dictionaries are applicable to electronic ones and of which differences one should be aware. The focus will be on online dictionaries because they represent the most important kind of digital dictionary, and will become even more important in future. Furthermore, the emphasis will be more on potential future forms of online dictionaries than on current ones which are still sometimes produced as copies of their printed counterparts. To approach this question, basic differences between textual structures in electronic versus printed dictionaries will firstly be discussed. Secondly, further terminological and formal preliminary remarks will be made. The main part of the article will then follow to adapt de Schryver’s idea of “Creating order in dreamland” expressed in his article “Lexicographer’s dreams in the electronic dictionary age”. The aim here is to begin to ‘create order in terminology land’ for textual structures in electronic dictionaries. A definitive order cannot be given here because electronic lexicography today involves constant change. In order to discuss the order of textual structures in EDs, not only theoretically, but also in concrete terms, their basic properties will be illustrated by means of a notional online dictionary. Following on from this fictitious scenario, a provisional survey of textual structures in EDs will be presented. Thereby, the focus is less on current online dictionaries than on the possibilities which the new medium provides. Finally, an explanation will be given as to how this view of structures in electronic dictionaries is useful for analyzing current EDs and for planning new ones. The overall aim here is not to introduce new kinds of textual structure in EDs and a corresponding terminology in detail, but to point out some constitutive differences between textual structures in printed dictionaries and those in electronic dictionaries.

The cognitive paradigm in linguistics and media discourse typology (2013)

Mendzheritskaya, Elena O.

Discourse analysis in general, and media discourse analysis in particular, are currently attracting increased attention from linguists. This interest can be seen in the tendency to apply the term ‘discourse’ to various sciences and academic disciplines. It is possible to trace its dispersion both horizontally, i.e. in different sciences, and vertically, i.e. on various linguistic levels. Furthermore, the majority of interpretations of the term ‘discourse’ appearing in the works of modern scholars have arisen as a result of the interdisciplinary nature of language study within the cognitive paradigm in linguistics.

The FrameNet approach to relating syntax and semantics (2013)

Ruppenhofer, Josef ; Boas, Hans Christian ; Baker, Collin F.

The lexicographical process (with special focus on online dictionaries) (2013)

Klosa, Annette

Towards Contextual Healthiness Classification of Food Items - A Linguistic Approach (2013)

Wiegand, Michael ; Klakow, Dietrich

We explore the feasibility of contextual healthiness classification of food items. We present a detailed analysis of the linguistic phenomena that need to be taken into consideration for this task based on a specially annotated corpus extracted from web forum entries. For automatic classification, we compare a supervised classifier and rule-based classification. Beyond linguistically motivated features that include sentiment information we also consider the prior healthiness of food items.

Towards the Detection of Reliable Food-Health Relationships (2013)

Wiegand, Michael ; Klakow, Dietrich

We investigate the task of detecting reliable statements about food-health relationships from natural language texts. For that purpose, we created a specially annotated web corpus from forum entries discussing the healthiness of certain food items. We examine a set of task-specific features (mostly) based on linguistic insights that are instrumental in finding utterances that are commonly perceived as reliable. These features are incorporated in a supervised classifier and compared against standard features that are widely used for various tasks in natural language processing, such as bag of words, part-of-speech and syntactic parse information.

Towards Weakly Supervised Resolution of Null Instantiations (2013)

Gorinski, Philip ; Ruppenhofer, Josef ; Sporleder, Caroline

This paper addresses the task of finding antecedents for locally uninstantiated arguments. To resolve such null instantiations, we develop a weakly supervised approach that investigates and combines a number of linguistically motivated strategies that are inspired by work on semantic role labeling and coreference resolution. The performance of the system is competitive with the current state-of-the-art supervised system.

Turn-design at turn-beginnings : multimodal resources to deal with tasks of turn-construction in German (2013)

Deppermann, Arnulf

Based on German speaking data from various activity types, the range of multimodal resources used to construct turn-beginnings is reviewed. It is claimed that participants in talk-in-interaction need to deal with four tasks in order to construct a turn which precisely fits the interactional moment of its production: 1. Achieve joint orientation: The accomplishment of the socio-spatial prerequisites necessary for producing a turn which is to become part of the participants’ common ground. 2. Display uptake: Next speaker needs to display his/her understanding of the interaction so far as the backdrop on which the production of the upcoming turn is based. 3. Deal with projections from prior talk: The speaker has to deal with projections which have been established by (the) previous turn(s) with respect to the upcoming turn. 4. Project properties of turn-in-progress: The speaker needs to orient the recipient to properties of the turn s/he is about to produce. Turn-design thus can be seen to be informed by tasks related to the multimodal, embodied, and interactive contingencies of online-construction of turns. The four tasks are ordered in terms of prior tasks providing the prerequisite for accomplishing a later task.

Using generalized additive models and random forests to model prosodic prominence in German (2013)

Arnold, Denis ; Wagner, Petra ; Baayen, R. Harald

The perception of prosodic prominence is influenced by different sources such as acoustic cues, linguistic expectations and context. We use a generalized additive model and a random forest to model the perceived prominence on a corpus of spoken German. Both models are able to explain over 80% of the variance. While the random forest gives us some insights into the relative importance of the cues, the generalized additive model gives us insights into the interaction between different cues to prominence.

Verb group (2013)

Proost, Kristel

Verb preposition combination (2013)

Proost, Kristel

Verbless construction. Construction which does not contain any verb. Verblose Konstruktion. Eine Konstruktion, die kein Verb enthält (2013)

Proost, Kristel

Word frequency, vowel length and vowel quality in speech production: An EMA study of the importance of experience (2013)

Tomaschek, Fabian ; Wieling, Martijn ; Arnold, Denis ; Baayen, R. Harald

A frequently replicated finding is that higher frequency words tend to be shorter and contain more strongly reduced vowels. However, little is known about potential differences in the articulatory gestures for high vs. low frequency words. The present study made use of electromagnetic articulography to investigate the production of two German vowels, [i] and [a], embedded in high and low frequency words. We found that word frequency differently affected the production of [i] and [a] at the temporal as well as the gestural level. Higher frequency of use predicted greater acoustic durations for long vowels; reduced durations for short vowels; articulatory trajectories with greater tongue height for [i] and more pronounced downward articulatory trajectories for [a]. These results show that the phonological contrast between short and long vowels is learned better with experience, and challenge both the Smooth Signal Redundancy Hypothesis and current theories of German phonology.
