The ISOcat registry reloaded
(2012)
The linguistics community is building a metadata-based infrastructure for the description of its research data and tools. At its core is the ISOcat registry, a collaborative platform to hold a (to be standardized) set of data categories (i.e., field descriptors). Descriptors have definitions in natural language and few explicit interrelations. With the registry growing to many hundreds of entries, authored by many contributors, it is becoming increasingly apparent that the rather informal definitions and their glossary-like design make it hard for users to grasp, exploit and manage the registry’s content. In this paper, we take a large subset of the ISOcat term set and reconstruct from it a tree structure, following in the footsteps of schema.org. Our ontological re-engineering yields a representation that gives users a hierarchical view of linguistic, metadata-related terminology. The new representation adds to the precision of all definitions by making explicit information which is only implicitly given in the ISOcat registry. It also helps uncover and address potential inconsistencies in term definitions as well as gaps and redundancies in the overall ISOcat term set. The new representation can serve as a complement to the existing ISOcat model, providing additional support for authors and users in browsing, (re-)using, maintaining, and further extending the community’s terminological metadata repertoire.
Neologisms, i.e., new words or meanings, are finding their way into everyday language use all the time. In the process, already existing elements of a language are recombined or linguistic material from other languages is borrowed. But are borrowed neologisms accepted as readily by the speech community as neologisms that were formed from “native” material? We investigate this question based on neologisms in German. Building on the corresponding results of a corpus study, we test the hypothesis that “native” neologisms are more readily accepted than those borrowed from English. To do so, we use a psycholinguistic experimental paradigm that allows us to estimate the degree of uncertainty of the participants based on the mouse trajectories of their responses. Unexpectedly, our results suggest that the neologisms borrowed from English are accepted more frequently, more quickly, and more easily than the “native” ones. These effects, however, are restricted to people born after 1980, the so-called millennials. We propose potential explanations for this mismatch between corpus results and experimental data and argue, among other things, for a reinterpretation of previous corpus studies.
We present an approach for modeling German negation in open-domain fine-grained sentiment analysis. Unlike most previous work in sentiment analysis, we assume that negation can be conveyed by many lexical units (and not only common negation words) and that different negation words have different scopes. Our approach is examined on a new dataset comprising sentences with mentions of polar expressions and various negation words. We identify different types of negation words that have the same scopes. We show that negation modeling based on these types alone already substantially outperforms traditional negation models, which assume the same scope for all negation words and which employ window-based scope detection rather than scope detection based on syntactic information.
In this article, we examine the effectiveness of bootstrapping supervised machine-learning polarity classifiers with the help of a domain-independent rule-based classifier that relies on a lexical resource, i.e., a polarity lexicon, and a set of linguistic rules. The benefit of this method is that, although no labeled training data are required, it allows a classifier to capture in-domain knowledge by training a supervised classifier with in-domain features, such as bag of words, on instances labeled by a rule-based classifier. Thus, this approach can be considered a simple and effective method for domain adaptation. Among the components of this approach, we investigate how important the quality of the rule-based classifier is and what features are useful for the supervised classifier. In particular, the former addresses the question of the extent to which linguistic modeling is relevant for this task. We not only examine how this method performs under more difficult settings, in which classes are not balanced and mixed reviews are included in the data set, but also compare this linguistically driven method with state-of-the-art statistical domain adaptation.
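The bootstrapping idea described in this abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the article: the toy lexicon, the single negation rule, the Naive Bayes learner, and all function names are hypothetical. It shows only the general pattern, in which a rule-based classifier labels unlabeled in-domain text and a bag-of-words classifier is then trained on those labels.

```python
import math
from collections import Counter

# Hypothetical toy polarity lexicon and negators (illustrative assumptions).
POLARITY_LEXICON = {"good": 1, "great": 1, "bad": -1, "awful": -1}
NEGATORS = {"not", "never"}


def rule_based_label(text):
    """Domain-independent labeler: lexicon lookup plus one rule
    (a negator flips the polarity of the next polar word)."""
    score, flip = 0, False
    for token in text.lower().split():
        if token in NEGATORS:
            flip = True
        elif token in POLARITY_LEXICON:
            value = POLARITY_LEXICON[token]
            score += -value if flip else value
            flip = False
    if score == 0:
        return None  # no decision: the instance is not used for training
    return "pos" if score > 0 else "neg"


def train_naive_bayes(labeled):
    """Collect bag-of-words counts per class from (text, label) pairs."""
    word_counts = {"pos": Counter(), "neg": Counter()}
    doc_counts = Counter()
    for text, label in labeled:
        doc_counts[label] += 1
        word_counts[label].update(text.lower().split())
    vocab = set(word_counts["pos"]) | set(word_counts["neg"])
    return word_counts, doc_counts, vocab


def predict(model, text):
    """Multinomial Naive Bayes with add-one smoothing; unknown words are skipped."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    best_label, best_score = None, -math.inf
    for label in ("pos", "neg"):
        total_words = sum(word_counts[label].values())
        score = math.log((doc_counts[label] + 1) / (total_docs + 2))
        for token in text.lower().split():
            if token in vocab:
                count = word_counts[label][token]
                score += math.log((count + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def bootstrap(unlabeled_texts):
    """Label in-domain text with the rules, then train the supervised model."""
    labeled = [(t, rule_based_label(t)) for t in unlabeled_texts]
    labeled = [(t, lab) for t, lab in labeled if lab is not None]
    return train_naive_bayes(labeled)


if __name__ == "__main__":
    in_domain = [
        "great crisp screen",
        "good crisp display",
        "awful laggy screen",
        "bad laggy display",
    ]
    model = bootstrap(in_domain)
    # "laggy" is absent from the lexicon, so the rules alone give no label,
    # but the bootstrapped classifier has seen it in negatively labeled text.
    print(rule_based_label("laggy phone"), predict(model, "laggy phone"))
```

In this toy setting the supervised model picks up in-domain cues ("laggy", "crisp") that the domain-independent lexicon does not contain, which is the kind of domain adaptation effect the abstract refers to.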
In this article, we explore the feasibility of extracting suitable and unsuitable food items for particular health conditions from natural language text. We refer to this task as conditional healthiness classification. For that purpose, we annotate a corpus extracted from forum entries of a food-related website. We identify different relation types that hold between food items and health conditions, going beyond a binary distinction of suitability and unsuitability, and devise various supervised classifiers using different types of features. We examine the impact of different task-specific resources, such as a healthiness lexicon that lists the healthiness status of a food item and a sentiment lexicon. Moreover, we also consider task-specific linguistic features that disambiguate a context in which mentions of a food item and a health condition co-occur and compare them with standard features using bag of words, part-of-speech information and syntactic parses. We also investigate to what extent individual food items and health conditions correlate with specific relation types and try to harness this information for classification.
Just like most varieties of West Germanic, virtually all varieties of German use a construction in which a cognate of the English verb 'do' (standard German 'tun') functions as an auxiliary and selects another verb in the bare infinitive, a construction known as 'do'-periphrasis or 'do'-support. The present paper provides an Optimality Theoretic (OT) analysis of this phenomenon. It builds on a previous analysis by Bader and Schmid (An OT-analysis of 'do'-support in Modern German, 2006) but (i) extends it from root clauses to subordinate clauses and (ii) aims to capture all of the major distributional patterns found across (mostly non-standard) varieties of German. In so doing, the data are used as a testing ground for different models of German clause structure. At first sight, the occurrence of 'do' in subordinate clauses, as found in many varieties, appears to support the standard CP-IP-VP analysis of German. In actual fact, however, the full range of data turns out to challenge, rather than support, this model. Instead, I propose an analysis within the IP-less model by Haider (Deutsche Syntax - generativ. Vorstudien zur Theorie einer projektiven Grammatik, Narr, Tübingen, 1993 et seq.). In sum, the 'do'-support data will be shown to have implications not only for the analysis of clause structure but also for the OT constraints commonly assumed to govern the distribution of 'do', for the theory of non-projecting words (Toivonen in Non-projecting words, Kluwer, Dordrecht, 2003) as well as for research on grammaticalization.
Large classes at universities (> 1600 students) create their own challenges for teaching and learning. Audience feedback is lacking, and fine-tuning lectures, courses and exam preparation to address individual needs is very difficult to achieve. At RWTH Aachen University, a course concept and a knowledge map learning tool were developed and evaluated to support individual students in preparing for exams in information science through theme-based exercises. The tool was grounded in the notion of self-regulated learning with the goal of enabling students to learn independently.
The transfer of research data management from one institution to another infrastructural partner is anything but trivial, but can be required, for instance, when an institution faces reorganization or closure. In a case study, we describe the migration of all research data, identify the challenges we encountered, and discuss how we addressed them. The case study shows that moving research data management to another institution is a feasible but potentially costly enterprise. Being able to demonstrate the feasibility of research data migration supports the stance of data archives that users can expect high levels of trust and reliability when it comes to data safety and sustainability.
German subjectively veridical sicher sein ‘be certain’ can embed ob-clauses in negative contexts, while subjectively veridical glauben ‘believe’ and nonveridical möglich sein ‘be possible’ cannot. The Logical Form of ‘F isn’t certain if M is in Rome’ is regarded as the negated disjunction of two sentences, ¬(cf σ ∨ cf ¬σ), or equivalently ¬cf σ ∧ ¬cf ¬σ. Be certain can have this LF because ¬cf σ and ¬cf ¬σ are compatible and nonveridical. Believe excludes this LF because ¬bf σ and ¬bf ¬σ are incompatible in a question-under-discussion context. It follows from this incompatibility and from the incompatibility of bf σ and bf ¬σ that bf ¬σ and ¬bf σ are equivalent. Therefore believe cannot be nonveridical. Be possible doesn’t allow the LF either. Similar to believe, ¬pf σ and ¬pf ¬σ are incompatible. But unlike believe, pf σ and pf ¬σ are compatible.
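The two logical steps in this abstract can be spelled out as a short worked derivation. It is a sketch under one stated assumption that the abstract does not make explicit: "X and Y are incompatible" is read as ¬(X ∧ Y), i.e., the two cannot both hold. The operator names cf and bf are taken from the abstract; the block assumes the amsmath package.

```latex
\begin{align*}
% Step 1 (De Morgan): the two notations for the LF are equivalent.
\neg(\mathrm{cf}\,\sigma \lor \mathrm{cf}\,\neg\sigma)
  &\equiv \neg\mathrm{cf}\,\sigma \land \neg\mathrm{cf}\,\neg\sigma \\[4pt]
% Step 2: incompatibility of \neg bf \sigma and \neg bf \neg\sigma (assumed reading).
\neg(\neg\mathrm{bf}\,\sigma \land \neg\mathrm{bf}\,\neg\sigma)
  &\;\Rightarrow\; (\neg\mathrm{bf}\,\sigma \rightarrow \mathrm{bf}\,\neg\sigma) \\
% Step 3: incompatibility of bf \sigma and bf \neg\sigma (assumed reading).
\neg(\mathrm{bf}\,\sigma \land \mathrm{bf}\,\neg\sigma)
  &\;\Rightarrow\; (\mathrm{bf}\,\neg\sigma \rightarrow \neg\mathrm{bf}\,\sigma) \\
% Step 4: both directions together give the stated equivalence.
\text{hence}\quad \mathrm{bf}\,\neg\sigma
  &\;\leftrightarrow\; \neg\mathrm{bf}\,\sigma
\end{align*}
```

On this reading, the conjunction ¬bf σ ∧ ¬bf ¬σ that the LF requires is exactly what the first incompatibility rules out, which is why believe cannot take this LF.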
In this contribution, we discuss implications that arise when one investigates the smallest forms of situational sociation, which we call communicative minimal forms. Communicative minimal forms are brief, jointly constituted interaction events lasting only a few seconds. Despite their brevity, they exhibit a complex interactional structure. They also carry a clear social implication and a value of their own. In the case examined here, in which passers-by look through an open window into a private room and are caught doing so, this social implicativeness shows itself as moral communication in the sense of the interactive handling of one's own misconduct.