Speakers’ dialogical orientation to the particular others they talk to is implemented by practices of recipient design. One such practice is the use of negation as a means to constrain the partner’s interpretations of the speaker’s actions. The paper situates this use of negation within the larger context of other recipient-designed uses of negation, which negate assumptions the speaker makes about what the addressee holds to be true (second-order assumptions) or what the addressee assumes the speaker holds to be true (third-order assumptions). The focus of the study is on the ways in which speakers use negation to disclaim interpretations of their turns which partners have displayed or may possibly arrive at. Special emphasis is given to the positionally sensitive uses of negation, which may occur before, after or inserted between the nucleus actions whose interpretation is constrained by the negation. Interactional motivations and rhetorical potentials of the practice are pointed out, partly depending on the position of the negation vis-à-vis the nucleus action. The analysis shows that the concept of ‘recipient design’ is in need of distinctions which have not been in focus in prior research.
So far, there have been few descriptions of how to create structures capable of storing lexicographic data, ISO 24613:2008 being one of the latest. Another one is by Spohr (2012), who designs a multifunctional lexical resource which is able to store data of different types of dictionaries in a user-oriented way. Technically, his design is based on the principle of a hierarchical XML/OWL (eXtensible Markup Language/Web Ontology Language) representation model. This article follows another route in describing a model based on entities and relations between them; MySQL, a relational database system queried with SQL (Structured Query Language), stores data in tables together with definitions of the relations between them. The model was developed in the context of the project "Scientific eLexicography for Africa", and the lexicographic database to be built from it will be implemented with MySQL. The principles of the ISO model and of Spohr's model are adhered to with one major difference in the implementation strategy: we do not place the lemma in the centre of attention, but the sense description. All other elements, including the lemma, depend on the sense description. This article also describes the lexicographic data sets contained in the database and how they have been collected from different sources. As our aim is to compile several prototypical internet dictionaries (a monolingual Northern Sotho dictionary, a bilingual learners' Xhosa–English dictionary and a bilingual Zulu–English dictionary), we describe the necessary microstructural elements for each of them and the principles we adhere to when designing different ways of accessing them. We plan to make the model and the (empty) database, together with all graphical user interfaces that have been developed, freely available by mid-2015.
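The sense-centred idea can be illustrated with a minimal sketch. It is not the project's actual schema: the table names, column names and sample rows are hypothetical, and SQLite (from Python's standard library) stands in for MySQL so that the example is self-contained. The only point shown is that the lemma table references the sense table, not the other way round.

```python
import sqlite3


def build_sense_centred_db():
    """Create an in-memory database in which lemmas depend on senses."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(
        """
        -- Hypothetical tables: the sense description is the central entity.
        CREATE TABLE sense (
            sense_id    INTEGER PRIMARY KEY,
            description TEXT NOT NULL
        );
        -- Every lemma must point at an existing sense.
        CREATE TABLE lemma (
            lemma_id INTEGER PRIMARY KEY,
            sense_id INTEGER NOT NULL REFERENCES sense(sense_id),
            language TEXT NOT NULL,
            form     TEXT NOT NULL
        );
        """
    )
    return conn


def add_sense(conn, description, lemmas):
    """Insert one sense plus its (language, form) lemmas; return the sense id."""
    cursor = conn.execute(
        "INSERT INTO sense (description) VALUES (?)", (description,)
    )
    sense_id = cursor.lastrowid
    conn.executemany(
        "INSERT INTO lemma (sense_id, language, form) VALUES (?, ?, ?)",
        [(sense_id, language, form) for language, form in lemmas],
    )
    return sense_id


def lemmas_for_sense(conn, sense_id):
    """Return all (language, form) pairs attached to one sense."""
    return conn.execute(
        "SELECT language, form FROM lemma "
        "WHERE sense_id = ? ORDER BY language, form",
        (sense_id,),
    ).fetchall()


if __name__ == "__main__":
    db = build_sense_centred_db()
    # Illustrative rows only, not taken from the project's data sets.
    sid = add_sense(db, "domesticated canine", [("zu", "inja"), ("en", "dog")])
    print(lemmas_for_sense(db, sid))
```

Under this layout a bilingual dictionary view is a query over one sense, and a lemma without a sense cannot be stored at all, which is one way to read "all other elements depend on the sense description".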
Large classes at universities (> 1600 students) create their own challenges for teaching and learning. Audience feedback is lacking, and fine-tuning lectures, courses and exam preparation to address individual needs is very difficult to achieve. At RWTH Aachen University, a course concept and a knowledge map learning tool were developed and evaluated; both aim to support individual students in preparing for exams in information science through theme-based exercises. The tool was grounded in the notion of self-regulated learning with the goal of enabling students to learn independently.
We present an approach to an aspect of managing complex access scenarios to large and heterogeneous corpora that involves handling user queries that, intentionally or due to the complexity of the queried resource, target texts or annotations outside of the given user’s permissions. We first outline the overall architecture of the corpus analysis platform KorAP, devoting some attention to the way in which it handles multiple query languages, by implementing ISO CQLF (Corpus Query Lingua Franca), which in turn constitutes a component crucial for the functionality discussed here. Next, we look at query rewriting as it is used by KorAP and zoom in on one kind of this procedure, namely the rewriting of queries that is forced by data access restrictions.
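To make the idea of access-driven query rewriting concrete, here is a minimal sketch. It does not reproduce KorAP's query format, API or rewrite rules; the `Query` class, its fields and the note strings are all hypothetical. It only shows the general pattern: a query that targets resources outside the user's permissions is narrowed to the permitted ones, and each change is recorded so that it can be reported back.

```python
from dataclasses import dataclass, field


@dataclass
class Query:
    """Hypothetical, simplified query: a pattern plus the resources it targets."""
    pattern: str
    corpora: frozenset
    annotation_layers: frozenset
    rewrites: list = field(default_factory=list)


def rewrite_for_access(query, allowed_corpora, allowed_layers):
    """Return a copy of the query restricted to what the user may access.

    The original query object is left unchanged; every removed resource is
    logged in the `rewrites` list of the returned copy.
    """
    kept_corpora = query.corpora & allowed_corpora
    kept_layers = query.annotation_layers & allowed_layers

    notes = list(query.rewrites)
    for name in sorted(query.corpora - kept_corpora):
        notes.append(f"removed corpus: {name}")
    for name in sorted(query.annotation_layers - kept_layers):
        notes.append(f"removed layer: {name}")

    return Query(query.pattern, kept_corpora, kept_layers, notes)


if __name__ == "__main__":
    requested = Query(
        pattern="Haus",
        corpora=frozenset({"open", "restricted"}),
        annotation_layers=frozenset({"pos", "syntax"}),
    )
    rewritten = rewrite_for_access(
        requested,
        allowed_corpora=frozenset({"open"}),
        allowed_layers=frozenset({"pos", "syntax"}),
    )
    print(sorted(rewritten.corpora), rewritten.rewrites)
```

Keeping a log of the rewrites, instead of silently dropping resources, is a design choice of this sketch: it lets a front end tell the user that the result set is narrower than the query they wrote.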
This paper introduces the Aix Map Task corpus, a corpus of audio and video recordings of task-oriented dialogues. It was modelled after the original HCRC Map Task corpus. Lexical material was designed for the analysis of speech and prosody, as described in Astésano et al. (2007). The design of the lexical material, the protocol and some basic quantitative features of the existing corpus are presented. The corpus was collected under two communicative conditions, one audio-only condition and one face-to-face condition. The recordings took place in a studio and a sound-attenuated booth respectively, with head-set microphones (and in the face-to-face condition with two video cameras). The recordings have been segmented into Inter-Pausal Units and transcribed using transcription conventions containing actual productions and canonical forms of what was said. The corpus is publicly available online.
This contribution presents an XML Schema for annotating a high-level narratological category: speech, thought and writing representation (ST&WR). It focusses on two aspects. Firstly, the original Schema is presented as an example of the challenge of encoding a narrative feature in a structured and flexible way. Secondly, ways of adapting this Schema to TEI are considered, in order to make it usable for other, TEI-based projects.
Annotating Spoken Language
(2014)
We continue the study of the reproducibility of Propp’s annotations from Bod et al. (2012). We present four experiments in which test subjects were taught Propp’s annotation system; we conclude that Propp’s system needs a significant amount of training, but that with sufficient time investment, it can be reliably trained for simple tales.
Automatic Food Categorization from Large Unlabeled Corpora and Its Impact on Relation Extraction
(2014)
We present a weakly-supervised induction method to assign semantic information to food items. We consider two categorization tasks: food-type classification and the distinction of whether a food item is composite or not. The categorizations are induced by a graph-based algorithm applied to a large unlabeled domain-specific corpus. We show that the usage of a domain-specific corpus is vital. We not only outperform a manually designed open-domain ontology but also prove the usefulness of these categorizations in relation extraction, outperforming state-of-the-art features that include syntactic information and Brown clustering.
Feminine forms of job titles raise great interest in many countries. However, it is still unknown how they shape stereotypical impressions on the warmth and competence dimensions among female and male listeners. In an experiment with fictitious job titles, men perceived women described with feminine job titles as significantly less warm and marginally less competent than women with masculine job titles, which led to lower willingness to employ them. No such effects were observed among women.
By way of migration, large numbers of German-speaking settlers arrived in Pennsylvania between roughly 1700 and 1750. Pennsylvania German, as a distinct variety, developed through levelling processes from the L1 varieties of these migrants, who came mainly from the southwestern regions of the German-speaking area. Pennsylvania German is still spoken today by specific religious groups (primarily Amish and Mennonite groups), for many of whom it is an identity marker. My paper focuses on those Pennsylvania Germans who are not part of these religious groups but have the same migration history. Because they were closer to the cultural values of American mainstream society, they were integrated into it, and during the 20th century their use of Pennsylvania German diminished continually. A revival of this heritage language has occurred over roughly the past three decades, including language courses offered at community colleges, public libraries, etc., where ethnic Pennsylvania Germans wish to (re-)learn the language of their grandparents. Written Pennsylvania German data from four points in time between the 1860s and the 1990s were analysed in this study. Based on these linguistic analyses, differences between the data sets are shown that point towards a diachronic change in the language contact situation of Pennsylvania German speakers. Sociolinguistic and extralinguistic factors are considered that influence the role of Pennsylvania German and make its speakers heritage speakers much in the sense of recent immigrant heritage speakers, although delayed by 200 years.
We discovered several recurring errors in the current version of the Europarl Corpus, originating both from the web site of the European Parliament and from the corpus compilation based on it. The most frequent error was incompletely extracted metadata, leaving non-textual fragments within the textual parts of the corpus files. This is, on average, the case for every second speaker change. We not only cleaned the Europarl Corpus by correcting several kinds of errors, but also aligned the speakers’ contributions in all available languages and compiled everything into a new XML-structured corpus. This facilitates a more sophisticated selection of data, e.g. querying the corpus for speeches by speakers of a particular political group or in particular language combinations.
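The kind of selection mentioned above can be sketched as follows. The XML layout is an assumption made for illustration only: the element names (`session`, `intervention`, `text`), the attribute names (`speaker`, `group`, `lang`) and the sample content are hypothetical and do not reproduce the structure of the cleaned corpus.

```python
import xml.etree.ElementTree as ET

# Hypothetical miniature corpus file; names and content are placeholders.
SAMPLE = """<session>
  <intervention speaker="Speaker A" group="GROUP-X">
    <text lang="en">First sample speech.</text>
    <text lang="de">Erste Beispielrede.</text>
  </intervention>
  <intervention speaker="Speaker B" group="GROUP-Y">
    <text lang="en">Second sample speech.</text>
  </intervention>
</session>"""


def speeches_by_group(xml_string, group, lang):
    """Return (speaker, text) pairs for one political group in one language."""
    root = ET.fromstring(xml_string)
    results = []
    for item in root.iter("intervention"):
        if item.get("group") != group:
            continue
        # Each aligned language version is assumed to be a <text> child.
        for text in item.findall("text"):
            if text.get("lang") == lang:
                results.append((item.get("speaker"), text.text))
    return results


if __name__ == "__main__":
    print(speeches_by_group(SAMPLE, "GROUP-X", "de"))
```

Because speaker metadata and aligned language versions sit in the same element under this assumed layout, a filter on group and language needs no join across separate files.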