Language attitudes matter; they influence people’s behaviour and decisions. Therefore, it is crucial to learn more about patterns in the way that languages are evaluated. One means of doing so is using a quantitative approach with data representative of a whole population, so that results mirror dispositions at a societal level. This kind of approach is adopted here, with a focus on the situation in Germany. The article consists of two parts. First, I will present some results of a new representative survey on language attitudes in Germany (the Germany Survey 2017). Second, I will show how language attitudes penetrate even seemingly objective data collection processes by examining the German Microcensus. In 2017, for the first time in eighty years, the German Microcensus included a question on language use ‘at home’. Unfortunately, however, the question was clearly tainted by language attitudes instead of being objective. As a result, the Microcensus significantly misrepresents the linguistic reality of different migrant languages spoken in Germany.
Although the N400 was originally discovered in a paradigm designed to elicit a P300 (Kutas and Hillyard, 1980), its relationship with the P300 and how both overlapping event-related potentials (ERPs) determine behavioral profiles is still elusive. Here we conducted an ERP (N = 20) and a multiple-response speed-accuracy tradeoff (SAT) experiment (N = 16) on distinct participant samples using an antonym paradigm (The opposite of black is white/nice/yellow with acceptability judgment). We hypothesized that SAT profiles incorporate processes of task-related decision-making (P300) and stimulus-related expectation violation (N400). We replicated previous ERP results (Roehm et al., 2007): in the correct condition (white), the expected target elicits a P300, while both expectation violations engender an N400 [reduced for related (yellow) vs. unrelated targets (nice)]. Using multivariate Bayesian mixed-effects models, we modeled the P300 and N400 responses simultaneously and found that correlation between residuals and subject-level random effects of each response window was minimal, suggesting that the components are largely independent. For the SAT data, we found that antonyms and unrelated targets had a similar slope (rate of increase in accuracy over time) and an asymptote at ceiling, while related targets showed both a lower slope and a lower asymptote, reaching only approximately 80% accuracy. Using a GLMM-based approach (Davidson and Martin, 2013), we modeled these dynamics using response time and condition as predictors. Replacing the predictor for condition with the averaged P300 and N400 amplitudes from the ERP experiment, we achieved identical model performance. We then examined the piecewise contribution of the P300 and N400 amplitudes with partial effects (see Hohenstein and Kliegl, 2015). 
Unsurprisingly, the P300 amplitude was the strongest contributor to the SAT-curve in the antonym condition and the N400 was the strongest contributor in the unrelated condition. In brief, this is the first demonstration of how overlapping ERP responses in one sample of participants predict behavioral SAT profiles of another sample. The P300 and N400 reflect two independent but interacting processes and the competition between these processes is reflected differently in behavioral parameters of speed and accuracy.
Preface
(2019)
In this paper, we describe a data processing pipeline used for annotated spoken corpora of Uralic languages created in the INEL (Indigenous Northern Eurasian Languages) project. With this pipeline we convert the data into a lossless standard format (ISO/TEI) for long-term preservation while simultaneously enabling powerful search over this version of the data. For each corpus, the input is a set of files in EXMARaLDA XML format, which contain transcriptions, multimedia alignment, morpheme segmentation and other kinds of annotation. The first processing step converts the data, by means of an XSL transformation, into a subset of TEI that follows the ISO standard ‘Transcription of spoken language’. The primary purpose of this step is to obtain a representation of our data in a standard format, which will ensure its long-term accessibility. The second step converts the ISO/TEI files to the JSON format used by the “Tsakorpus” search platform. This step allows us to make the corpora available through a web-based search interface. In addition, the existence of such a converter allows other spoken corpora with ISO/TEI annotation to be made accessible online in the future.
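The second step of the pipeline described above (ISO/TEI to search-ready JSON) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the INEL converter: the function name `tei_to_search_json`, the heavily reduced sample fragment, and the output field names (`sentences`, `speaker`, `text`, `words`, `wf`, `wtype`) are all hypothetical, and the real Tsakorpus input schema is richer and may differ.

```python
import json
import xml.etree.ElementTree as ET

# TEI namespace used by ISO/TEI transcriptions.
NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Assumption: a heavily reduced ISO/TEI-like fragment, for illustration only.
SAMPLE_TEI = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text>
    <body>
      <annotationBlock who="#SPK0" start="#T0" end="#T1">
        <u>
          <seg type="utterance">
            <w>first</w>
            <w>word</w>
            <pc>.</pc>
          </seg>
        </u>
      </annotationBlock>
    </body>
  </text>
</TEI>"""


def tei_to_search_json(tei_xml: str) -> dict:
    """Flatten utterance segments into a simple, hypothetical JSON layout."""
    root = ET.fromstring(tei_xml)
    sentences = []
    for block in root.iterfind(".//tei:annotationBlock", NS):
        # The speaker reference is stored as "#ID"; drop the leading "#".
        speaker = block.get("who", "").lstrip("#")
        for seg in block.iterfind(".//tei:seg", NS):
            tokens = []
            for el in seg:
                kind = el.tag.split("}")[-1]  # strip the namespace prefix
                if kind in ("w", "pc"):
                    tokens.append({
                        "wf": (el.text or "").strip(),
                        "wtype": "word" if kind == "w" else "punct",
                    })
            sentences.append({
                "speaker": speaker,
                "text": " ".join(t["wf"] for t in tokens),
                "words": tokens,
            })
    return {"sentences": sentences}


if __name__ == "__main__":
    print(json.dumps(tei_to_search_json(SAMPLE_TEI), indent=2))
```

The sketch keeps only tokens and speaker information; time alignment, morpheme segmentation and the other annotation layers mentioned in the abstract would need additional handling.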
As the Web ought to be considered a series of sources rather than a single source, one problem facing corpus construction lies in meta-information and categorization. In addition, we need focused data to shed light on particular subfields of the digital public sphere. Blogs are relevant to that end, especially if the resulting web texts can be extracted along with metadata and made available in coherent and clearly describable collections.
Speech planning is a sophisticated process. In dialog, it regularly starts in overlap with an incoming turn by a conversation partner. We show that planning spoken responses in overlap with incoming turns is associated with higher processing load than planning in silence. In a dialogic experiment, participants took turns with a confederate describing lists of objects. The confederate’s utterances (to which participants responded) were pre-recorded and varied in whether they ended in a verb or an object noun and whether this ending was predictable or not. We found that response planning in overlap with sentence-final verbs evokes larger task-evoked pupillary responses, while end predictability had no effect. This finding indicates that planning in overlap leads to higher processing load for next speakers in dialog and that next speakers do not proactively modulate the time course of their response planning based on their predictions of turn endings. The turn-taking system exerts pressure on the language processing system by pushing speakers to plan in overlap despite the ensuing increase in processing load.
Since 2013, representatives of several French and German CMC corpus projects have developed three customizations of the TEI-P5 standard for text encoding in order to adapt the encoding schema and models provided by the TEI to the structural peculiarities of CMC discourse. Based on these three schema versions, a fourth version has been created which takes into account the experience gained from encoding our corpora and which is specifically designed for the submission of a feature request to the TEI Council. On our poster, we will present the structure of this schema and its relations (commonalities and differences) to the previous schemas.
In this paper, we investigate the temporal interpretation of propositional attitude complement clauses in four typologically unrelated languages: Washo (language isolate), Medumba (Niger-Congo), Hausa (Afro-Asiatic), and Samoan (Austronesian). Of these languages, Washo and Medumba are optional-tense languages, while Hausa and Samoan are tenseless. Just as in obligatory-tense languages, we observe variation among these languages when it comes to the availability of so-called simultaneous and backward-shifted readings of complement clauses. For our optional-tense languages, we argue that a Sequence of Tense parameter is active, just as in obligatory-tense languages. However, for completely tenseless clauses, we need something more. We argue that there is variation in the degree to which languages have recourse to res-movement, or a similar mechanism that manipulates LF structures to derive backward-shifted readings in tenseless complement clauses. We additionally appeal to cross-linguistic variation in the lexical semantics of perfective aspect to derive or block certain readings. The result is that the typological classification of a language as tensed, optionally tensed, or tenseless does not alone determine the temporal interpretation possibilities for complement clauses. Rather, structural parameters of variation cross-cut these broad classes of languages to deliver the observed cross-linguistic picture.