This dissertation centers on the concept of information modelling, or more precisely "textual information modelling", building on a previously proposed distinction between a primary and a secondary level of information structuring. The primary level concerns the textual data themselves and their structuring, whereas the secondary level describes how the rule sets used for the primary levels can be related to alternative rule sets. Alongside the division into primary and secondary information structuring, the dissertation introduces the concept of multiple information structuring. Under this concept, the primary level is multiplied as needed, but each of these levels refers to the same underlying data. This also has consequences for secondary information structuring. Information modelling is carried out with markup languages. The Standard Generalized Markup Language (SGML) provides a framework for this; however, since its standardization in 1986 this formalism has not only been developed further, but in 1998 a considerably simpler subset of it was defined as the Extensible Markup Language (XML), which is also the current focus of further developments in the field of markup languages. The approach developed for modelling linguistic information is based on XML, although the more extensive capabilities of SGML are also presented and discussed. Information that cannot be structured in particular hierarchies (by means of mathematical trees) cannot be represented naturally in XML. One solution to this problem is to distribute the structuring across different levels. This new solution is presented, discussed, and modelled.
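The idea of several structuring levels over the same underlying data can be illustrated with a minimal Python sketch. It is not the dissertation's own scheme: the element names (`lines`, `l`, `sentences`, `s`), the sample text, and the helper functions are all hypothetical. Two XML layers annotate the same character sequence, one by lines and one by sentences, and a sentence that crosses a line boundary shows why a single tree cannot hold both.

```python
import xml.etree.ElementTree as ET

# Hypothetical example: one primary text, two independent annotation layers.
# Element names and the sample text are illustrative assumptions only.
LAYER_VERSE = (
    "<lines><l>I wandered lonely. As a cloud</l>"
    "<l> that floats, I saw a crowd.</l></lines>"
)
LAYER_SYNTAX = (
    "<sentences><s>I wandered lonely.</s>"
    "<s> As a cloud that floats, I saw a crowd.</s></sentences>"
)


def primary_text(xml_layer):
    """Strip the markup and return the character data of a layer."""
    return "".join(ET.fromstring(xml_layer).itertext())


def spans(xml_layer):
    """Return (tag, start, end) character offsets for each child of the root."""
    root = ET.fromstring(xml_layer)
    result = []
    offset = len(root.text or "")
    for child in root:
        text = "".join(child.itertext())
        result.append((child.tag, offset, offset + len(text)))
        offset += len(text) + len(child.tail or "")
    return result


def overlaps(a, b):
    """True if two spans cross each other without one containing the other."""
    (_, s1, e1), (_, s2, e2) = a, b
    return s1 < s2 < e1 < e2 or s2 < s1 < e2 < e1


# Both layers must describe exactly the same primary data.
same_data = primary_text(LAYER_VERSE) == primary_text(LAYER_SYNTAX)

# Pairs of spans that cross: these could not be nested in a single XML tree.
crossing = [
    (a, b)
    for a in spans(LAYER_VERSE)
    for b in spans(LAYER_SYNTAX)
    if overlaps(a, b)
]
```

Here the second sentence starts inside the first line and ends inside the second, so `crossing` is non-empty while `same_data` is true. Keeping the layers in separate documents avoids the conflict at the cost of having to check that they stay aligned.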
This dissertation investigates discourse-pragmatic differences between variably linked arguments appearing in alternating argument structure constructions in the sense of Goldberg (1995) and Kay (manuscript). The properties that are studied include givenness, pragmatic relation (topic/focus), salience of referents, animacy, and others. They derive from the literature on sentence-type constructions such as topicalization and from research on the referential properties of NP form types.
The research carried out here has multiple uses. At the most basic level, it serves as an empirical check on existing characterizations of the pragmatic properties of the relevant arguments that are the result of syntactic and semantic analysis based on introspection alone. For instance, for the epistemic raising alternation involving verbs like seem, the predicted topicality difference between the subjects of the raised and unraised constructions (Langacker 1995) could not be confirmed.
This dissertation also addresses the question of what kinds of pragmatic factors, if any, are relevant to argument structure constructions. Based on the evidence of the dative alternation, it does not seem to be the case that the kinds of pragmatic influences on argument structure constructions are different or limited compared to the ones found to be relevant to sentence-type constructions.
The kind of research undertaken here can also inform the syntactic and semantic analysis of constructions. In the case of the dative alternation, the discourse-pragmatic characteristics of the variably linked arguments provide evidence that Basilico’s (1998) analysis of the difference between the alternates in terms of VP-shells and a difference between thetic and categorical ‘inner’ predication, on the one hand does not account for all the data and on the other can be re-stated in pragmatic terms other than the thetic-categorical distinction.
In addition to studies of valence alternations, this dissertation also discusses various null instantiation phenomena, which provide further evidence for the need to specify discourse-pragmatic properties as part of argument structure constructions and lexical entries.
Finally, it is suggested that the use of randomly sampled corpus data and statistical modelling throughout this dissertation improves both empirical and analytical coverage.
Can language use within a group be changed? And if so, how? Conclusive answers to these questions are of great interest in politics and business. Karolina Suchowolec arrives at them by analysing the current state of research on language planning, planned languages, controlled languages, and terminology work, examining how far the findings can be generalized, and deriving from them Sprachlenkung (language steering) as an overarching object of linguistic research. She has investigated its practical implementation empirically. As a result, she presents an overview of the challenges of language steering and of the solutions postulated in the literature: a solid foundation for further theoretical research and an aid for practical language steering.
Deutsch in Kamerun
(1998)
This dissertation deals with different methods for eliciting perceptual prominence judgements from naive listeners in German. Two experiments are presented: one addresses the use of different scales, the other the use of different levels of evaluation for rating perceptual prominence. The results show that findings from studies based on different elicitation techniques are not readily comparable. The thesis also examines the effects of normalizing the prominence judgements. The dissertation concludes with an outlook for future studies, which mainly addresses the manifold interactions between different sources and the context in the assessment of perceptual prominence.
Manual development of deep linguistic resources is time-consuming and costly and therefore often described as a bottleneck for traditional rule-based NLP. In my PhD thesis I present a treebank-based method for the automatic acquisition of LFG resources for German. The method automatically creates deep and rich linguistic representations from labelled data (treebanks) and can be applied to large data sets. My research is based on and substantially extends previous work on automatically acquiring wide-coverage, deep, constraint-based grammatical resources from the English Penn-II treebank (Cahill et al., 2002; Burke et al., 2004; Cahill, 2004). Best results for English show a dependency f-score of 82.73% (Cahill et al., 2008) against the PARC 700 dependency bank, outperforming the best hand-crafted grammar of Kaplan et al. (2004). Preliminary work has been carried out to test the approach on languages other than English, providing proof of concept for the applicability of the method (Cahill et al., 2003; Cahill, 2004; Cahill et al., 2005). While first results have been promising, a number of important research questions have been raised. The original approach presented first in Cahill et al. (2002) is strongly tailored to English and the data structures provided by the Penn-II treebank (Marcus et al., 1993). English is configurational and rather poor in inflectional forms. German, by contrast, features semi-free word order and a much richer morphology. Furthermore, treebanks for German differ considerably from the Penn-II treebank as regards data structures and encoding schemes underlying the grammar acquisition task. In my thesis I examine the impact of language-specific properties of German as well as linguistically motivated treebank design decisions on PCFG parsing and LFG grammar acquisition.
I present experiments investigating the influence of treebank design on PCFG parsing and show which types of representation are useful for the PCFG and LFG grammar acquisition tasks. Furthermore, I present a novel approach to cross-treebank comparison, measuring the effect of controlled error insertion on treebank trees and parser output from different treebanks. I complement the cross-treebank comparison by providing a human evaluation using TePaCoC, a new testsuite for testing parser performance on complex grammatical constructions. Manual evaluation on TePaCoC data provides new insights on the impact of flat vs. hierarchical annotation schemes on data-driven parsing. I present treebank-based LFG acquisition methodologies for two German treebanks. An extensive evaluation along different dimensions complements the investigation and provides valuable insights for the future development of treebanks.
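The dependency f-score mentioned above can be made concrete with a small Python sketch. It assumes the common definition, the harmonic mean of precision and recall over sets of dependency triples; the abstract does not spell out the evaluation setup, and the function name, triple format, and example sentence are hypothetical.

```python
def dependency_f_score(gold, predicted):
    """Harmonic mean of precision and recall over dependency triples.

    Each triple is (relation, head, dependent). This format is an
    assumption for illustration, not the format of any specific
    dependency bank.
    """
    gold, predicted = set(gold), set(predicted)
    correct = len(gold & predicted)
    if correct == 0:
        return 0.0
    precision = correct / len(predicted)
    recall = correct / len(gold)
    return 2 * precision * recall / (precision + recall)


# Hypothetical gold and parser triples for "Die Frau sieht den Mann".
gold_triples = {
    ("subj", "sieht", "Frau"),
    ("obj", "sieht", "Mann"),
    ("det", "Mann", "den"),
}
parser_triples = {
    ("subj", "sieht", "Frau"),
    ("obj", "sieht", "Mann"),
    ("det", "Frau", "den"),  # wrong attachment of the determiner
}

score = dependency_f_score(gold_triples, parser_triples)
```

With two of three triples correct on both sides, precision and recall are each 2/3, and so is the f-score.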
This thesis investigates temporal and aspectual reference in the typologically unrelated African languages Hausa (Chadic, Afro–Asiatic) and Medumba (Grassfields Bantu). It argues that Hausa is a genuinely tenseless language and compares the interpretation of temporally unmarked sentences in Hausa to that of morphologically tenseless sentences in Medumba, where tense marking is optional and graded. The empirical behavior of the optional temporal morphemes in Medumba motivates an analysis as existential quantifiers over times and thus provides new evidence suggesting that languages vary in whether their (past) tense is pronominal or quantificational (see also Sharvit 2014). The thesis proposes for both Hausa and Medumba that the alleged future tense marker is a modal element that obligatorily combines with a prospective future shifter (which is covert in Medumba). Cross-linguistic variation in whether or not a future marker is compatible with non-future interpretation is proposed to be predictable from the aspectual architecture of the given language.
This dissertation offers a qualitative analysis of verbal interactions in German television talk shows between 1989 and 1994. It investigates how speakers of German formulate their own and others’ affiliation to national identities and social spaces. In particular, it examines classifications of place, person, and time that include group and place names as well as grammatically complex expressions, deictic pronouns and adverbs, and certain motion verbs. In addition, repair is discussed as a resource in re-formulating identities.
This thesis consists of the following three papers, all of which have been published in international peer-reviewed journals:
Chapter 3: Koplenig, Alexander (2015c). The Impact of Lacking Metadata for the Measurement of Cultural and Linguistic Change Using the Google Ngram Data Sets—Reconstructing the Composition of the German Corpus in Times of WWII. Published in: Digital Scholarship in the Humanities. Oxford: Oxford University Press. [doi:10.1093/llc/fqv037]
Chapter 4: Koplenig, Alexander (2015b). Why the quantitative analysis of diachronic corpora that does not consider the temporal aspect of time-series can lead to wrong conclusions. Published in: Digital Scholarship in the Humanities. Oxford: Oxford University Press. [doi:10.1093/llc/fqv030]
Chapter 5: Koplenig, Alexander (2015a). Using the parameters of the Zipf–Mandelbrot law to measure diachronic lexical, syntactical and stylistic changes – a large-scale corpus analysis. Published in: Corpus Linguistics and Linguistic Theory. Berlin/Boston: de Gruyter. [doi:10.1515/cllt-2014-0049]
Chapter 1 introduces the topic by describing and discussing several basic concepts relevant to the statistical analysis of corpus linguistic data. Chapter 2 presents a method to analyze diachronic corpus data and a summary of the three publications. Chapters 3 to 5 each represent one of the three publications. All papers are printed in this thesis with the permission of the publishers.
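For readers unfamiliar with the Zipf–Mandelbrot law named in the title of Chapter 5, the following Python sketch shows its usual textbook form, in which the frequency of the item at rank r is proportional to 1 / (r + beta)^alpha. The function name and parameter names are assumptions for illustration; the sketch says nothing about how the thesis estimates or interprets the parameters.

```python
def zipf_mandelbrot(n_ranks, alpha, beta):
    """Return normalized probabilities for ranks 1..n_ranks.

    p(r) is proportional to 1 / (r + beta) ** alpha; dividing by the
    sum of all weights makes the values add up to 1.
    """
    weights = [1.0 / (rank + beta) ** alpha for rank in range(1, n_ranks + 1)]
    total = sum(weights)
    return [w / total for w in weights]


# With beta = 0 the law reduces to plain Zipf; here with only two ranks.
two_ranks = zipf_mandelbrot(2, alpha=1.0, beta=0.0)

# A larger beta flattens the head of the distribution.
flat_head = zipf_mandelbrot(1000, alpha=1.0, beta=5.0)
steep_head = zipf_mandelbrot(1000, alpha=1.0, beta=0.0)
```

Comparing `flat_head[0]` with `steep_head[0]` shows the effect of beta on the most frequent item, while alpha controls how quickly probabilities fall off across ranks.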
The principal claim of this dissertation is that there is a unique structural core shared by Double Object, Dative Experiencer and Existential/Presentational constructions. This core is argued to take the form of a Cipient Predication structure, ‘cipient’ covering traditional notions like (affected) source/goal, recipient, indirect object or dative experiencer. Central questions arising in defining Cipient Predication are: How are cipients thematically licensed, and what is the role of there in argument-structural terms? What is the structural locus of cipients/there? What is the role and nature of dative case? How can the possessive interpretation, the blocking and definiteness effects associated with the above-mentioned constructions be explained? Cipients are presented as external arguments and logical subjects (location individuals) of predicates derived from a propositional meaning embedded in the VP, the predicate formed by a lower tense head ‘little t’ that is overtly realized as there. Little t is argued to encode a distinction at the reference time level, structural dative hinging on a tense property like structural nominative. The cipient relates as a whole to a part to a VP-internal location argument that together with the theme furnishes the propositional meaning (‘possession’). As logical subjects, cipients anchor the predicate to the utterance context, forcing its interpretation in extralinguistic terms (‘blocking effects’). It is proposed that lacking structurally encoded subjects, Existential/Presentational constructions are not saturated expressions in syntax, precluding the interpretation of certain quantifiers (most/every, vide ‘definiteness effects’).
Cipient Predication, couched in terms of the Minimalist Program (in particular, Chomsky 1999) and a semantics relying on tense and the ontological distinction of locations as well as scalar and part-whole structure, should be of interest to scholars working on datives, argument structure, and the syntax/semantics/pragmatics interface more generally.
This is a study of how aspects of information structure can be captured within a formal grammar of Spanish, couched in the framework of Head-Driven Phrase Structure Grammar (HPSG, Pollard and Sag 1994). While a large number of morphological, syntactic and semantic aspects in a variety of languages have been successfully analysed in this theory, information structure has not been paid the same attention in the HPSG literature. However, as a theory of signs, HPSG should include all levels of description without which the structural descriptions offered by the grammar would ultimately remain incomplete. Languages often explicitly mark the information-structural partitioning of utterances. Depending on the particular language, linguistic resources used for this purpose include prosody (stress/intonation), syntax (e.g. constituent order, special syntactic constructions) and morphology (e.g. special affixes). In HPSG, phonological, syntactic, semantic and pragmatic information is represented in parallel, which would seem to be a well-suited architecture for modelling the sort of interfaces called for.
Understanding the design of talk-in-interaction is important in many domains, including speech technology. Although phonetic, linguistic and gestural correlates have been identified for some of the social actions that conversational participants accomplish, it is only recently that researchers have begun to take account of the immediately prior interactional context as an important factor influencing the design of a speaker’s turn. The present study explores the influence of context by focussing on characteristics of short turns produced by one speaker between turns from another speaker. The hypothesis is that the speaker designs her inserted turn as a match to the prior turn when wishing to align with the previous speaker’s agenda. By contrast, non-matching would display that the speaker is non-aligning, preferring instead to initiate a new action, for example. Data are taken from the AMI corpus, focussing on the spontaneous talk of first-language English participants. Using sequential analysis, such short turns are classified as either aligning or non-aligning in accordance with definitions in the Conversation Analysis literature. The degree of prosodic similarity between the inserted turn and the prior speaker’s turn is measured using novel acoustic techniques. The results show that aligning turns are significantly more similar to the immediately preceding turn, in terms of pitch contour, than non-aligning turns. In contrast to the prosodic-acoustic analysis, the results of the gestural analysis indicate that aligning and non-aligning are differentiated by the use of distinct gestures, rather than by the matching (or non-matching) of gestures across the adjacent turns. These results support the view that choice of pitch contour is managed locally, rather than by reference to an intonational lexicon. However, this is not the case for speakers’ use of gesture.
The implications of these findings for a model of talk-in-interaction are considered, along with potential applications.
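The abstract does not describe the acoustic techniques used to measure prosodic similarity. Purely to illustrate what comparing two pitch contours can involve, the following Python sketch uses a generic stand-in: both contours are resampled to the same number of points by linear interpolation and then compared with Pearson correlation. This is not the study's method, and the function names and F0 values are hypothetical.

```python
import math


def resample(contour, n_points):
    """Linearly interpolate a list of F0 values to n_points samples (n_points >= 2)."""
    if len(contour) == 1:
        return [float(contour[0])] * n_points
    out = []
    for i in range(n_points):
        pos = i * (len(contour) - 1) / (n_points - 1)
        lo = int(math.floor(pos))
        hi = min(lo + 1, len(contour) - 1)
        frac = pos - lo
        out.append(contour[lo] * (1 - frac) + contour[hi] * frac)
    return out


def contour_similarity(a, b, n_points=20):
    """Pearson correlation of two contours after length normalization.

    Returns a value in [-1, 1]; flat contours (zero variance) yield 0.0.
    """
    x, y = resample(a, n_points), resample(b, n_points)
    mx, my = sum(x) / n_points, sum(y) / n_points
    cov = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    var_x = sum((xi - mx) ** 2 for xi in x)
    var_y = sum((yi - my) ** 2 for yi in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


# Hypothetical F0 tracks in Hz: a prior turn and two candidate responses.
prior_turn = [100, 120, 140, 160]   # rising
matching_turn = [200, 220, 240]     # rising, different register and length
non_matching_turn = [160, 140, 120, 100]  # falling

match_score = contour_similarity(prior_turn, matching_turn)
mismatch_score = contour_similarity(prior_turn, non_matching_turn)
```

Correlation ignores differences in overall pitch level and range, so a rising contour from a higher-pitched speaker still counts as matching a rising prior turn. Whether that property is desirable depends on the research question.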