Refine
Year of publication
- 2022 (57) (remove)
Document Type
- Part of a Book (34)
- Article (7)
- Conference Proceeding (6)
- Book (5)
- Other (3)
- Doctoral Thesis (2)
Has Fulltext
- yes (57)
Keywords
- Korpus <Linguistik> (57) (remove)
Publicationstate
- Veröffentlichungsversion (57) (remove)
Reviewstate
Publisher
Fragen der Verdatung sind Bestandteil der digitalen Diskursanalyse und keine Vorarbeiten. Die Analyse digital(isiert)er Diskurse setzt im Unterschied zur Auswertung nicht-digital repräsentierter Sprache und Kommunikation notwendig technische Verfahren und Praktiken, Algorithmen und Software voraus, die den Untersuchungsgegenstand als digitales Datum konstituieren. Die nachfolgenden Abschnitte beschreiben kurz und knapp wiederkehrende Aspekte dieser Verdatungstechniken und -praktiken, insbesondere mit Blick auf Erhebung und Transformation (Abschnitt 2), Korpuskompilierung (Abschnitt 3), Annotation (Abschnitt 4) und Wege der analytischen Datenerschließung (Abschnitt 5). Im Fazit wird die Relevanz der Verdatungsarbeit für den Analyseprozess zusammengefasst (6).
Aus Platzgründen musste in der Druckfassung des Artikels „Beobachtungen zu Frequenz und Funktionen von ja in deutscher Spontansprache“ (in: Deutsche Sprache 50, S. 336–363; https://doi.org/10.37307/j.1868-775X.2022.04.04) auf den Abdruck der illustrierenden Abbildungen 2–18 im Abschnitt 5.2 verzichtet werden. Das entsprechende Kapitel inklusive aller Abbildungen ist hier abrufbar.
This paper arises within the current communication urgency experienced throughout the pandemic. From its onset, several new lexical units have permeated the overall media discourse, as well as social media and other channels. These units convey information to the public regarding the ‘severe acute respiratory syndrome’ namely COVID-19. In addition to its worldwide impact healthwise, the pandemic generates noteworthy influence in the linguistic landscape, and as a result, a significant number of neologisms have emerged. Within the scope of our ongoing research, we identify the neologisms in European Portuguese that are related to the term COVID-19 via form or meaning. However, not all the new lexical units identified in our corpus containing COVID-19 in its formation can unequivocally be regarded as neoterms (terminological neologisms). Accordingly, this article aims not only to reflect on the distinction between neologism and neoterm but also to explore the determinologisation process that several of these new lexical units experience.
This thesis is a corpus linguistic investigation of the language used by young German speakers online, examining lexical, morphological, orthographic, and syntactic features and changes in language use over time. The study analyses the language in the Nottinghamer Korpus deutscher YouTube‐Sprache ("Nottingham corpus of German YouTube language", or NottDeuYTSch corpus), one of the first large corpora of German‐language comments taken from the videosharing website YouTube, and built specifically for this project. The metadatarich corpus comprises c.33 million tokens from more than 3 million comments posted underneath videos uploaded by mainstream German‐language youthorientated YouTube channels from 2008‐2018.
The NottDeuYTSch corpus was created to enable corpus linguistic approaches to studying digital German youth language (Jugendsprache), having identified the need for more specialised web corpora (see Barbaresi 2019). The methodology for compiling the corpus is described in detail in the thesis to facilitate future construction of web corpora. The thesis is situated at the intersection of Computer‐Mediated Communication (CMC) and youth language, which have been important areas of sociolinguistic scholarship since the 1980s, and explores what we can learn from a corpus‐driven, longitudinal approach to (online) youth language. To do so, the thesis uses corpus linguistic methods to analyse three main areas:
1. Lexical trends and the morphology of polysemous lexical items. For this purpose, the analysis focuses on geil, one of the most iconic and productive words in youth language, and presents a longitudinal analysis, demonstrating that usage of geil has decreased, and identifies lexical items that have emerged as potential replacements. Additionally, geil is used to analyse innovative morphological productiveness, demonstrating how different senses of geil are used as a base lexeme or affixoid in compounding and derivation.
2. Syntactic developments. The novel grammaticalization of several subordinating conjunctions into both coordinating conjunctions and discourse markers is examined. The investigation is supported by statistical analyses that demonstrate an increase in the use of non‐standard syntax over the timeframe of the corpus and compares the results with other corpora of written language.
3. Orthography and the metacommunicative features of digital writing. This analysis identifies orthographic features and strategies in the corpus, e.g. the repetition of certain emoji, and develops a holistic framework to study metacommunicative functions, such as the communication of illocutionary force, information structure, or the expression of identities. The framework unifies previous research that had focused on individual features, integrating a wide range of metacommunicative strategies within a single, robust system of analysis.
By using qualitative and computational analytical frameworks within corpus linguistic methods, the thesis identifies emergent linguistic features in digital youth language in German and sheds further light on lexical and morphosyntactic changes and trends in the language of young people over the period 2008‐2018. The study has also further developed and augmented existing analytical frameworks to widen the scope of their application to orthographic features associated with digital writing.
Einleitung
(2022)
Einleitung
(2022)
Germany’s diverse history in the 20th century raises the question of how social upheavals were constituted in and through political discourse. By analysing basic concepts, the research network “The 20th century in basic concepts” (based at the Leibniz institutes IDS, ZfL, ZZF) aims to identify continuities and discontinuities in political and social discourse. In this way, historical sediments of the present are to be uncovered and those challenges identified that emerged in the course of the 20th century and continue to shape political discourse until the present.
Dieses Kapitel lotet Möglichkeiten und Methoden aus, digitale Diskursanalysen nationalsozialistischer Quellentexte durchzuführen. Digitale Technologie wird dabei als heuristisches Werkzeug betrachtet, mit dem der Sprachgebrauch während des Nationalsozialismus im Rahmen größerer Quellenkorpora untersucht werden kann. In einem theoretischen Abschnitt wird grundsätzlich dafür plädiert, während des Analyseprozesses hermeneutisches Sinnverstehen mit breitflächigen korpusbasierten Abfragen zu kombinieren. Verdeutlicht wird diese Herangehensweise an zwei empirischen Beispielen: Anhand eines Korpus von Hitler- und Goebbels-Reden wird dem Auftauchen und der diskursiven Ausgestaltung des nationalsozialistischen Konzepts „Lebensraum“ nachgespürt. Schritt für Schritt wird offengelegt, welche Analysewege durch das Abfragen von Schlüsseltexten, Keywords, Konkordanzen und Kollokationen verfolgt werden können. Das zweite Beispiel zeigt anhand von Eingaben, die aus der Bevölkerung an Staats- und Parteiinstanzen gerichtet wurden, wie solche Quellen mithilfe eines digitalen Tools manuell annotiert werden können, um sie danach auf Musterhaftigkeiten im Sprachgebrauch hin auswerten zu können.
This paper looks at whether, after two decades of corpus building for the Bantu languages, the time is ripe to begin using monitor corpora. As a proof-of-concept, the usefulness of a Lusoga monitor corpus for lexicographic purposes, in casu for the detection of neologisms, both in terms of new words and new meanings, is investigated and found useful.
This paper presents an algorithm and an implementation for efficient tokenization of texts of space-delimited languages based on a deterministic finite state automaton. Two representations of the underlying data structure are presented and a model implementation for German is compared with state-of-the-art approaches. The presented solution is faster than other tools while maintaining comparable quality.