Refine
Year of publication
- 2018 (152) (remove)
Document Type
- Article (71)
- Part of a Book (46)
- Conference Proceeding (21)
- Review (7)
- Book (4)
- Part of Periodical (2)
- Periodical (1)
Keywords
- Deutsch (49)
- Korpus <Linguistik> (29)
- Konversationsanalyse (16)
- Gesprochene Sprache (13)
- Interaktion (13)
- Multimodalität (11)
- Grammatik (10)
- Computerlinguistik (9)
- Interaktionsanalyse (9)
- conversation analysis (8)
Publicationstate
- Veröffentlichungsversion (90)
- Zweitveröffentlichung (50)
- Postprint (27)
Reviewstate
- Peer-Review (152) (remove)
Publisher
- de Gruyter (16)
- European language resources association (ELRA) (13)
- Erich Schmidt (11)
- Verlag für Gesprächsforschung (8)
- Znanstvena založba Filozofske fakultete Univerze v Ljubljani / Ljubljana University Press, Faculty of Arts (7)
- Heidelberg University Publishing (5)
- Springer (5)
- Association for Computational Linguistics (4)
- Institut für Deutsche Sprache (4)
- Cambridge University Press (3)
Am Beispiel der polyfunktionalen Mehrworteinheit <was weiß ich> wird das Zusammenspiel von pragmatischer und phonetischer Ausdifferenzierung in Pragmatikalisierungsprozessen untersucht. Hierzu werden spontan-sprachliche Belege aus dem Korpus „Deutsch heute“ analysiert. Die beobachtete phonetische Variationsbreite deutet auf eine komplexe Beziehung zu den jeweiligen pragmatischen Funktionen hin.
This paper investigates the conditions that govern the choice between the German neuter singular relative pronouns das ‘that’ and was ‘what’. We show that das requires a lexical head noun, while in all other cases was is usually the preferred option; therefore, the distribution of das and was is most successfully captured by an approach that does not treat was as an exception but analyzes it as the elsewhere case that applies when the relativizer fails to pick up a lexical gender feature from the head noun. We furthermore show how the non-uniform behavior of different types of nominalized adjectives (positives allow both options, while superlatives trigger was) can be attributed to semantic differences rooted in syntactic structure. In particular, we argue that superlatives select was due to the presence of a silent counterpart of the quantifier alles ‘all’ that is part of the superlative structure.
This paper argues that there is a correlation between functional and purely grammatical patterning in language, yet the nature of this correlation has to be explored. This claim is based on the results of a corpus-driven study of the Slavic aspect, drawing on the socalled Distributional Hypothesis. According to the East-West Theory of the Slavic aspect, there is a broad east-west isogloss dividing the Slavic languages into an eastern group and a western group. There are also two transitional zones in the north and south, which share some properties with each group (Dickey 2000; Barentsen 1998, 2008). The East-West Theory uses concepts of cognitive grammar such as totality and temporal definiteness, and is based on various parameters of aspectual usage in discourse, including contexts such as habituals, general factuals, historical (narrative) present, performatives, sequenced events in the past etc. The purpose of the above-mentioned study is to challenge the semantic approach to the Slavic aspect by comparing the perfective and imperfective verbal aspect on the basis of purely grammatical co-occurrence patterns (see also Janda & Lyashevskaya 2011). The study focused on three Slavic languages: Russian, which, following the East-West Theory, belongs to the eastern group, Czech, which belongs to the western group, and Polish, which is considered as transitional in its aspectual patterning.
We present a testsuite for POS tagging German web data. Our testsuite provides the original raw text as well as the gold tokenisations and is annotated for parts-of-speech. The testsuite includes a new dataset for German tweets, with a current size of 3,940 tokens. To increase the size of the data, we harmonised the annotations in already existing web corpora, based on the Stuttgart-Tübingen Tag Set. The current version of the corpus has an overall size of 48,344 tokens of web data, around half of it from Twitter. We also present experiments, showing how different experimental setups (training set size, additional out-of-domain training data, self-training) influence the accuracy of the taggers. All resources and models will be made publicly available to the research community.
In a number of languages, agreement in specificational copular sentences can or must be with the second of the two nominals, even when it is the first that occupies the canonical subject position. Béjar & Kahnemuyipour (2017) show that Persian and Eastern Armenian are two such languages. They then argue that ‘NP2 agreement’ occurs because the nominal in subject position (NP1) is not accessible to an external probe. It follows that actual agreement with NP1 should never be possible: the alternative to NP2 agreement should be ‘default’ agreement. We show that this prediction is false. In addition to showing that English has NP1, not default, agreement, we present new data from Icelandic, a language with rich agreement morphology, including cases that involve ‘plurale tantum’ nominals as NP1. These allow us to control for any confound from the fact that typically in a specificational sentence with two nominals differing in number, it is NP2 that is plural. We show that even in this case, the alternative to agreement with NP2 is agreement with NP1, not a default. Hence, we conclude that whatever the correct analysis of specificational sentences turns out to be, it must not predict obligatory failure of NP1 agreement.
We present a study on gaps in spoken language interaction as a potential candidate for syntactic boundaries. On the basis of an online annotation experiment, we can show that there is an effect of gap duration and gap type on its likelihood of being a syntactic boundary. We discuss the potential of these findings for an automation of the segmentation process.
A syntax-based scheme for the annotation and segmentation of German spoken language interactions
(2018)
Unlike corpora of written language where segmentation can mainly be derived from orthographic punctuation marks, the basis for segmenting spoken language corpora is not predetermined by the primary data, but rather has to be established by the corpus compilers. This impedes consistent querying and visualization of such data. Several ways of segmenting have been proposed,
some of which are based on syntax. In this study, we developed and evaluated annotation and segmentation guidelines in reference to the topological field model for German. We can show that these guidelines are used consistently across annotators. We also investigated the influence of various interactional settings with a rather simple measure, the word-count per segment and unit-type. We observed that the word count and the distribution of each unit type differ in varying interactional settings and that our developed segmentation and annotation guidelines are used consistently across annotators. In conclusion, our syntax-based segmentations reflect interactional properties that are intrinsic to the social interactions that participants are involved in. This can be used for further analysis of social interaction and opens the possibility for automatic segmentation of transcripts.
This paper aims to describe different patterns of syntactic extensions of turns-at-talk in mundane conversations in Czech. Within interactional linguistics, same-speaker continuations of possibly complete syntactic structures have been described for typologically diverse languages, but have not yet been investigated for Slavic languages. Based on previously established descriptions of various types of extensions (Vorreiter 2003; Couper-Kuhlen & Ono 2007), our initial description shall therefore contribute to the cross-linguistic exploration of this phenomenon. While all previously described forms for continuing a turn-constructional unit seem to exist in Czech, some grammatical features of this language (especially free word order and strong case morphology) may lead to problems in distinguishing specific types of syntactic extensions. Consequently, this type of language allows for critically evaluating the cross-linguistic validity of the different categories and underlines the necessity of analysing syntactic phenomena within their specific action contexts.
Just like most varieties of West Germanic, virtually all varieties of German use a construction in which a cognate of the English verb 'do' (standard German 'tun') functions as an auxiliary and selects another verb in the bare infinitive, a construction known as 'do'-periphrasis or 'do'-support. The present paper provides an Optimality Theoretic (OT) analysis of this phenomenon. It builds on a previous analysis by Bader and Schmid (An OT-analysis of 'do'-support in Modern German, 2006) but (i) extends it from root clauses to subordinate clauses and (ii) aims to capture all of the major distributional patterns found across (mostly non-standard) varieties of German. In so doing, the data are used as a testing ground for different models of German clause structure. At first sight, the occurrence of 'do' in subordinate clauses, as found in many varieties, appears to support the standard CP-IP-VP analysis of German. In actual fact, however, the full range of data turn out to challenge, rather than support, this model. Instead, I propose an analysis within the IP-less model by Haider (Deutsche Syntax - generativ. Vorstudien zur Theorie einer projektiven Grammatik, Narr, Tübingen, 1993 et seq.). In sum, the 'do'-support data will be shown to have implications not only for the analysis of clause structure but also for the OT constraints commonly assumed to govern the distribution of 'do', for the theory of non-projecting words (Toivonen in Non-projecting words, Kluwer, Dordrecht, 2003) as well as research on grammaticalization.
The grammatical information system grammis combines descriptive texts on German grammar with dictionaries of specific word classes and grammatical terminology. In this paper, we describe the first attempts at analyzing user behavior for an online grammar of the German language and the implementation of an analysis and data extraction tool based on Matomo, a web analytics tool. We focus on the analysis of the keywords the users search for, either within grammis or via an external search platform like Google, and the analysis of the interaction between the text components within grammis and the integrated dictionaries. The overall results show that about 50% of the searches are for grammatical terms, and that the users shift from texts to dictionaries, mainly by using the integrated links to the dictionary of terminology within the texts. Based on these findings, we aim to improve grammis by extending its integrated dictionaries.