Refine
Year of publication
Document Type
- Doctoral Thesis (16) (remove)
Language
- English (16) (remove)
Keywords
- Korpus <Linguistik> (5)
- Deutsch (4)
- Englisch (4)
- Syntax (3)
- Aspekt <Linguistik> (2)
- Computerlinguistik (2)
- Dialog (2)
- Formale Semantik (2)
- Grammatik (2)
- Konversationsanalyse (2)
Publicationstate
- Veröffentlichungsversion (16) (remove)
Reviewstate
Publisher
- Universität Potsdam (2)
- Dublin City University (1)
- Freie Universität Berlin (1)
- LOT (1)
- LOT Publications (1)
- Marburg/Lahn (1)
- Radboud Universiteit Nijmegen (1)
- University of Gothenburg (1)
- University of Illinois (1)
- University of Nottingham (1)
This thesis is a corpus linguistic investigation of the language used by young German speakers online, examining lexical, morphological, orthographic, and syntactic features and changes in language use over time. The study analyses the language in the Nottinghamer Korpus deutscher YouTube‐Sprache ("Nottingham corpus of German YouTube language", or NottDeuYTSch corpus), one of the first large corpora of German‐language comments taken from the videosharing website YouTube, and built specifically for this project. The metadatarich corpus comprises c.33 million tokens from more than 3 million comments posted underneath videos uploaded by mainstream German‐language youthorientated YouTube channels from 2008‐2018.
The NottDeuYTSch corpus was created to enable corpus linguistic approaches to studying digital German youth language (Jugendsprache), having identified the need for more specialised web corpora (see Barbaresi 2019). The methodology for compiling the corpus is described in detail in the thesis to facilitate future construction of web corpora. The thesis is situated at the intersection of Computer‐Mediated Communication (CMC) and youth language, which have been important areas of sociolinguistic scholarship since the 1980s, and explores what we can learn from a corpus‐driven, longitudinal approach to (online) youth language. To do so, the thesis uses corpus linguistic methods to analyse three main areas:
1. Lexical trends and the morphology of polysemous lexical items. For this purpose, the analysis focuses on geil, one of the most iconic and productive words in youth language, and presents a longitudinal analysis, demonstrating that usage of geil has decreased, and identifies lexical items that have emerged as potential replacements. Additionally, geil is used to analyse innovative morphological productiveness, demonstrating how different senses of geil are used as a base lexeme or affixoid in compounding and derivation.
2. Syntactic developments. The novel grammaticalization of several subordinating conjunctions into both coordinating conjunctions and discourse markers is examined. The investigation is supported by statistical analyses that demonstrate an increase in the use of non‐standard syntax over the timeframe of the corpus and compares the results with other corpora of written language.
3. Orthography and the metacommunicative features of digital writing. This analysis identifies orthographic features and strategies in the corpus, e.g. the repetition of certain emoji, and develops a holistic framework to study metacommunicative functions, such as the communication of illocutionary force, information structure, or the expression of identities. The framework unifies previous research that had focused on individual features, integrating a wide range of metacommunicative strategies within a single, robust system of analysis.
By using qualitative and computational analytical frameworks within corpus linguistic methods, the thesis identifies emergent linguistic features in digital youth language in German and sheds further light on lexical and morphosyntactic changes and trends in the language of young people over the period 2008‐2018. The study has also further developed and augmented existing analytical frameworks to widen the scope of their application to orthographic features associated with digital writing.
A tale of many stories: explaining policy diffusion between European higher education systems
(2013)
The thesis ”A Tale of Many Stories - Explaining Policy Diffusion between European Higher Education Systems" systematically examines diffusion processes and their effects with regard to a rather neglected policy area – the case of European higher education policy. The thesis contributes to the slowly growing number of comparative and mechanism-based studies on policy diffusion and represents the first study on the diffusion of policies between European Higher Education Systems. The main aim is to contrast and compare testable and coherent explanatory models on the functioning of different diffusion mechanisms. Three sets of explanatory models on the relationship between variables triggering and conditioning diffusion mechanisms and their impact on policy adoption are drawn from mechanism-based thinking on policy diffusion: on learning, socialization, and externalities. These approaches conceptualize the policy process in terms of interdependencies between international and national actors. Explanatory models based on assumptions about domestic policies and the common responses of countries to similar policy problems extend this theoretical framework. The thesis is based on event history modelling of policy change and adoption in higher education systems of 16 West European countries between the yeas 1980 and 1998. Overall 14 policy items describing performance-orientated reforms for public universities ranging from the adoption of external quality assurance systems to tuition fees are examined. Empirically, the main research question is what international, national and policy-specific factors cause and condition diffusion processes and the adoption of public policies? Evidence can be found for and against all of the four theoretical approaches tested. In comparison, many of the assumptions related to interdependencies lack robustness, whereas the common response model is the most stable one. This does not mean that explanatory models based on interdependent decision-making are not suitable for analysing policy diffusion in higher education. Rather interdependency is a multi- dimensional concept that requires a comparative assessment of diffusion mechanisms. Some of explanatory factors based on interdependent decision- making are still supported by the empirical analysis though. From this point of view, the recommendation for analysing diffusion is to start with a model based on domestic politics, that is successively extended by explanatory factors dealing with interdependencies between international and national actors. Diffusion variables matter – but it is only one side of the tale on policy diffusion.
This thesis consists of the following three papers that all have been published in international peer-reviewed journals:
Chapter 3: Koplenig, Alexander (2015c). The Impact of Lacking Metadata for the Measurement of Cultural and Linguistic Change Using the Google Ngram Data Sets—Reconstructing the Composition of the German Corpus in Times of WWII. Published in: Digital Scholarship in the Humanities. Oxford: Oxford University Press. [doi:10.1093/llc/fqv037]
Chapter 4: Koplenig, Alexander (2015b). Why the quantitative analysis of dia-chronic corpora that does not consider the temporal aspect of time-series can lead to wrong conclusions. Published in: Digital Scholarship in the Humanities. Oxford: Oxford University Press. [doi:10.1093/llc/fqv030]
Chapter 5: Koplenig, Alexander (2015a). Using the parameters of the Zipf–Mandelbrot law to measure diachronic lexical, syntactical and stylistic changes – a large-scale corpus analysis. Published in: Corpus Linguistics and Linguistic Theory. Berlin/Boston: de Gruyter. [doi:10.1515/cllt-2014-0049]
Chapter 1 introduces the topic by describing and discussing several basic concepts relevant to the statistical analysis of corpus linguistic data. Chapter 2 presents a method to analyze diachronic corpus data and a summary of the three publications. Chapters 3 to 5 each represent one of the three publications. All papers are printed in this thesis with the permission of the publishers.
This dissertation offers a qualitative analysis of verbal interactions in German television talk shows between 1989 and 1994. It investigates how Speakers of German formulate their own and others’ affiliation to national identities and social spaces. In particular, it examines classifications of place, person, and time that include group and place names as well as grammatically complex expressions, deictic pronouns and adverbs, and certain motion verbs. In addition, repair is discussed as a resource in re-formulating identities.
The principal claim of this dissertation is that there is a unique structural core shared by Double Object, Dative Experiencer and Existential/Presentational constructions. This core is argued to take the form of a Cipient Predication structure, `cipient covering traditional notions like (affected) source/goal, recipient, indirect object or dative experiencer. Central questions arising in defining Cipient Predication are: How are cipients thematically licensed, and what is the role of there in argument-structural terms? What is the structural locus of cipients/there? What is the role and nature of dative case? How can the possessive interpretation, the blocking and definiteness effects associated with the above-mentioned constructions be explained? Cipients are presented as external arguments and logical subjects (location individuals) of predicates derived from a propositional meaning embedded in the VP, the predicate formed by a lower tense head `little t that is overtly realized as there. Little t is argued to encode a distinction at the reference time level, structural dative hinging on a tense property like structural nominative. The cipient relates as a whole to a part to a VP-internal location argument that together with the theme furnishes the propositional meaning (`possession ). As logical subjects, cipients anchor the predicate to the utterance context, forcing its interpretation in extralinguistic terms (`blocking effects ). It is proposed that lacking structurally encoded subjects, Existential/Presentational constructions are not saturated expressions in syntax, precluding the interpretation of certain quantifiers (most/every, vide `definiteness effects ). Cipient Predication, couched in terms of the Minimalist Program (in particular, Chomsky 1999) and a semantics relying on tense and the ontological distinction of locations as well as scalar and part-whole structure, should be of interest to scholars working on datives, argument structure, and the syntax/semantics/pragmatics interface more generally.
The thesis describes a fully automatic system for the resolution of the pronouns 'it', 'this', and 'that' in English unrestricted multi-party dialog. Referential relations considered include both normal NP-antecedence as well as discourse-deictic pronouns. The thesis contains a theoretical part with a comprehensive empiricial study, and a practical part describing machine learning experiments.
This thesis describes work in three areas: grammar engineering, computer-assisted language learning and grammar learning. These three parts are connected by the concept of a grammar-based language learning application. Two types of grammars are of concern. The first we call resource grammars, extensive descriptions a natural languages. Part I focuses on this kind of grammars. The other are domain-specific or application-specific grammars. These grammars only describe a fragment of natural language that is determined by the domain of a certain application. Domain-specific grammars are relevant for Part II and Part III. Another important distinction is between humans learning a new natural language using computational grammars (Part II) and computers learning grammars from example sentences (Part III). Part I of this thesis focuses on grammar engineering and grammar testing. It describes the development and evaluation of a computational resource grammar for Latin. Latin is known for its rich morphology and free word order, both have to be handled in a computationally efficient way. A special focus is on methods how computational grammars can be evaluated using corpus data. Such an evaluation is presented for the Latin resource grammar. Part II, the central part, describes a computer-assisted language learning application based on domain-specific grammars. The language learning application demonstrates how computational grammars can be used to guide the user input and how language learning exercises can be modeled as grammars. This allows us to put computational grammars in the center of the design of language learning exercises used to help humans learn new languages. Part III, the final part, is dedicated to a method to learn domain- or application-specific grammars based on a wide-coverage grammar and small sets of example sentences. Here a computer is learning a grammar for a fragment of a natural language from example sentences, potentially without any additional human intervention. These learned grammars can be based e.g. on the Latin resource grammar described in Part II and used as domain-specific lesson grammars in the language learning application described Part II.
This is a study of how aspects of information structure can be captured within a formal grammar of Spanish, couched in the framework of Head-Driven Phrase Structure Grammar (HPSG, Pollard
and Sag 1994). While a large number of morphological, syntactic and semantic aspects in a variety of languages have been successfully analysed in this theory, information structure has not been paid the same attention in the HPSG literature. However, as a theory of signs, HPSG should include all
levels of description without which the structural descriptions offered by the grammar would ultimately remain incomplete. Languages often explicitly mark the information-structural partitioning of utterances. Depending on the particular language, linguistic resources used for this purpose include
prosody (stress/intonation), syntax (e. g. constituent order, special syntactic constructions) and morphology (e. g. special affixes). In HPSG, phonological, syntactic, semantic and pragmatic information is represented in parallel, which would seem to be a well-suited architecture for modelling
the sort of interfaces called for.