Refine
Year of publication
- 2017 (67) (remove)
Document Type
- Conference Proceeding (34)
- Article (18)
- Part of a Book (10)
- Book (2)
- Working Paper (2)
- Other (1)
Language
- English (67) (remove)
Keywords
- Korpus <Linguistik> (29)
- Deutsch (12)
- Corpus linguistics (11)
- Computerlinguistik (6)
- Corpus technology (6)
- Texttechnologie (6)
- Annotation (5)
- Augenfolgebewegung (4)
- Blickbewegung (4)
- Datenmanagement (4)
Publicationstate
- Veröffentlichungsversion (67) (remove)
Reviewstate
- Peer-Review (57)
- Peer-review (5)
- (Verlags)-Lektorat (2)
Publisher
- Institut für Deutsche Sprache (12)
- de Gruyter (6)
- Lexical Computing CZ s.r.o. (5)
- The Association for Computational Linguistics (5)
- Leibniz-Institut für Deutsche Sprache (IDS) (2)
- Linguistic Society of Papua New Guinea (2)
- Linköping University Electronic Press (2)
- McGill University & Université de Montréal (2)
- Asian Federation of Natural Language Processing (1)
- Association for Computational Linguistics (1)
In this paper, we discuss to what extent the German-based contact language Unserdeutsch (Rabaul Creole German, cf. Volker 1982) matches the category‘creole language’ from both a socio-historical and structural perspective. As a point of reference, we will use typological criteria that are widely supposed to be typical for creole languages. It is shown that Unserdeutsch fits fairly well into the pattern of an ‘average creole’, as has been suggested by data in the Atlas of Pidgin and Creole Language Structures (Michaelis et al. 2013). This is despite a series of atypical conditions in its development that might lead us to expect a close structural proximity to the lexifier language, i.e. a relatively acrolectal creole. A possible explanation for this striking discrepancy can be found in the primary function of Unserdeutsch as a marker of identity as well as in the linguistic structure of its substrate language Tok Pisin.
This paper provides insights into the ongoing international research project Unserdeutsch (Rabaul Creole German): Documentation of a highly endangered creole language in Papua New Guinea, based at the University of Augsburg, Germany. It elaborates on the different stages of the project, ranging from fieldwork to corpus development, thereby outlining the methods and software background used for the intended purposes. In doing so, we also give some approaches to solving specific problems, which have arisen in the course of practical work until now.
This paper reports about current practice in a staged approach to the introduction of NLP principles and techniques for students of information science (IIM) and of international communication and translation (ICT) as part of their curricula. As most of these students are rather not familiar with computer science or, in the case of IIM students, linguistics, we see them as comparable with students of the humanities. We follow a blended learning strategy with lectures, online materials, tutorials, and screencasts. In the first two terms, we focus on linguistics and its formalisation, NLP tools and applications are then introduced from the third term on. The lectures are combined with tutorials and - since the summer term 2017 - with a set of screencasts.
Data sets of publication meta data with manually disambiguated author names play an important role in current author name disambiguation (AND) research. We review the most important data sets used so far, and compare their respective advantages and shortcomings. From the results of this review, we derive a set of general requirements to future AND data sets. These include both trivial requirements, like absence of errors and preservation of author order, and more substantial ones, like full disambiguation and adequate representation of publications with a small number of authors and highly variable author names. On the basis of these requirements, we create and make publicly available a new AND data set, SCAD-zbMATH. Both the quantitative analysis of this data set and the results of our initial AND experiments with a naive baseline algorithm show the SCAD-zbMATH data set to be considerably different from existing ones. We consider it a useful new resource that will challenge the state of the art in AND and benefit the AND research community.
In conversation, turn-taking is usually fluid, with next speakers taking their turn right after the end of the previous turn. Most, but not all, previous studies show that next speakers start to plan their turn early, if possible already during the incoming turn. The present study makes use of the list-completion paradigm (Barthel et al., 2016), analyzing speech onset latencies and eye-movements of participants in a task-oriented dialogue with a confederate. The measures are used to disentangle the contributions to the timing of turn-taking of early planning of content on the one hand and initiation of articulation as a reaction to the upcoming turn-end on the other hand. Participants named objects visible on their computer screen in response to utterances that did, or did not, contain lexical and prosodic cues to the end of the incoming turn. In the presence of an early lexical cue, participants showed earlier gaze shifts toward the target objects and responded faster than in its absence, whereas the presence of a late intonational cue only led to faster response times and did not affect the timing of participants' eye movements. The results show that with a combination of eye-movement and turn-transition time measures it is possible to tease apart the effects of early planning and response initiation on turn timing. They are consistent with models of turn-taking that assume that next speakers (a) start planning their response as soon as the incoming turn's message can be understood and (b) monitor the incoming turn for cues to turn-completion so as to initiate their response when turn-transition becomes relevant.
We present an event-related potentials (ERP) study that addresses the question of how pieces of information pertaining to semantic roles and event structure interact with each other and with the verb’s meaning. Specifically, our study investigates German verb-final clauses with verbs of motion such as fliegen ‘fly’ and schweben ‘float, hover,’ which are indeterminate with respect to agentivity and event structure. Agentivity was tested by manipulating the animacy of the subject noun phrase and event structure by selecting a goal adverbial, which makes the event telic, or a locative adverbial, which leads to an atelic reading. On the clause-initial subject, inanimates evoked an N400 effect vis-à-vis animates. On the adverbial phrase in the atelic (locative) condition, inanimates showed an N400 in comparison to animates. The telic (goal) condition exhibited a similar amplitude like the inanimate-atelic condition. Finally, at the verbal lexeme, the inanimate condition elicited an N400 effect against the animate condition in the telic (goal) contexts. In the atelic (locative) condition, items with animates evoked an N400 effect compared to inanimates. The combined set of findings suggest that clause-initial animacy is not sufficient for agent identification in German, which seems to be completed only at the verbal lexeme in our experiment. Here non-agents (inanimates) changing their location in a goal-directed way and agents (animates) lacking this property are dispreferred and this challenges the assumption that change of (locational) state is generally a defining characteristic of the patient role. Besides this main finding that sheds new light on role prototypicality, our data seem to indicate effects that, in our view, are related to complexity, i.e., minimality. Inanimate subjects or goal arguments increase processing costs since they have role or event structure restrictions that animate subjects or locative modifiers lack.