This paper aims to address these problems by dealing with theoretical and methodological questions concerning the national effects of the Bologna Process and the role national factors play in determining the impact of these effects. Altogether, the purpose of the paper is to serve as a starting point for future research, both as a guide for systematic and comparative empirical work on higher education and as a basis for further theoretical and methodological reasoning concerning research on (higher) education policy. As higher education research so far particularly lacks an approach allowing for a competitive and systematic falsification of theoretical arguments by clearly indicating testable and specific hypotheses as well as the variables behind the research design (Goedegebuure/Vught 1996), we propose to fall back on neighbouring disciplines, namely social science, to improve and enhance the analysis (Slaughter 2001: 398; Altbach 2002: 154; Teichler 1996a: 433, 2005: 448). Several strands of research have to be considered, namely literature on Europeanization as well as insights and approaches of studies dealing with cross-national policy convergence. Taking into account the non-obligatory and mainly intergovernmental character of the Bologna Process, the main focus of the paper is on factors related to the effects of transnational communication. The inherent goal is to extend the research agenda on higher education (McLendon 2003: 184ff) and to leave behind the restriction of analysing only a few cases by striving for a research design that allows for systematic testing and sufficient explanations of cross-national policy convergence at the interface between the Bologna Process and domestic factors.
We present a corpus-driven approach to the study of multi-word expressions, which constitute a significant part of. As a data basis, we use collocation profiles computed from DeReKo (Deutsches Referenzkorpus), the largest available collection of written German which has approximately two billion word tokens and is located at the Institute for the German Language (IDS). We employ a strongly usage-based approach to multi-word expressions, which we think of as conventionalised patterns in language use that manifest themselves in recurrent syntagmatic patterns of words. They are defined by their distinct function in language. To find multi-word expressions, we allow ourselves to be guided by corpus data and statistical evidence as much as possible, making interpretative steps carefully and in a monitored fashion. We develop a procedure of interpretation that leads us from the evidence of collocation profiles to a collection of recurrent word patterns and finally to multi-word expressions. When building up a collection of multi-word expressions in this fashion, it becomes clear that the expressions can be defined on different levels of generalisation and are interrelated in various ways. This will be reflected in the documentation and presentation of the findings. We are planning to add annotation in a way that allows grouping the multi-word expressions according to different features and to add links between them to reflect their relationships, thus constructing a network of multi-word expressions.
This paper presents a thorough examination of the validity of three evaluation measures on parser output. We assess parser performance of an unlexicalised probabilistic parser trained on two German treebanks with different annotation schemes and evaluate parsing results using the PARSEVAL metric, the Leaf-Ancestor metric and a dependency-based evaluation. We reject the claim that the TüBa-D/Z annotation scheme is more adequate than the TIGER scheme for PCFG parsing and show that PARSEVAL should not be used to compare parser performance for parsers trained on treebanks with different annotation schemes. An analysis of specific error types indicates that the dependency-based evaluation is most appropriate to reflect parse quality.
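For readers unfamiliar with PARSEVAL, the following is a minimal sketch of labelled bracket precision, recall and F1. It is not the evaluation code used in the paper; the function name `parseval` and the representation of constituents as `(label, start, end)` tuples are assumptions made for illustration only.

```python
from collections import Counter


def parseval(gold, predicted):
    """Labelled bracket precision, recall and F1 for one sentence.

    Both arguments are lists of (label, start, end) tuples; this span
    representation is an illustrative assumption, not the paper's format.
    """
    gold_counts = Counter(gold)
    pred_counts = Counter(predicted)
    # Multiset intersection: a bracket counts as matched only as often
    # as it occurs in both the gold tree and the parser output.
    matched = sum((gold_counts & pred_counts).values())
    precision = matched / len(predicted) if predicted else 0.0
    recall = matched / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical four-token sentence: the parser misplaces the NP boundary.
gold = [("S", 0, 4), ("NP", 0, 2), ("VP", 2, 4)]
predicted = [("S", 0, 4), ("NP", 0, 1), ("VP", 2, 4)]
p, r, f = parseval(gold, predicted)
```

Because the score depends on how many brackets a tree contains, two annotation schemes that differ in how flat or deep their trees are will yield different numbers of brackets for the same sentence, which is one way to see why scores across schemes are hard to compare.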
Evaluating phonological status: significance of paradigm uniformity vs. prosodic grouping effects
(2007)
A central concern of linguistic phonetics is to define criteria for determining the phonological status of sounds or sound properties observed in phonetic surface form. Based on acoustic measurements we show that the occurrence of syllabic sonorants vs. schwa-sonorant sequences in German is determined exclusively by segmental and prosodic structure, with no paradigm uniformity effects. We argue that these findings are consistent with a uniform representation of syllabic sonorants as schwa-sonorant sequences in the lexicon. The stability of schwa in CVC-suffixes (e.g. the German diminutive suffix -chen), as opposed to its phonetic absence in a segmentally comparable underived context, is argued to be conditioned by the prosodic organisation of such suffixes external to the phonological word of the stem.
Incompatibility (or co-hyponymy) is the most general type of semantic relation between lexical items, the meaning of which entails exclusion. Such items fall under a superordinate term or concept and denote sets which have no members in common (e.g. animal: dog-cat-mouse-lion-sheep; example from Cruse 2004). Traditionally, these have been of interest to lexical semanticists for the description of the structure of the lexicon. However, incompatibility is not just a relation that signifies a difference of meaning. This paper is a critical corpus-assisted re-evaluation of the phenomenon of incompatibility which argues that the relation in question sometimes also functions as a discourse marker. Incompatibles indicate recurrent intertextual patterns. This holds particularly true for socially or politically controversial lexical items such as Flexibilität (flexibility), Mobilität (mobility) or Globalisierung (globalisation). Corpus investigations of such words have revealed that among other semantically related terms, incompatibles have a crucial discourse focussing function. For the German lexical item Globalisierung, I will show how its lexical usage can be studied through a corpus-driven analysis of corresponding incompatibles. Incompatible terms are not contingent co-words but often occur in close contextual proximity and participate in regular syntagmatic structures (e.g. Globalisierung und Rationalisierung; Globalisierung und Modernisierung; Neoliberalismus, Globalisierung und Kapitalismus). Hence, these are easily extracted by conducting a computational collocation analysis. Such significant collocates provide a good insight into the discursive and thematic contexts of the search word. Following Teubert (2004), I will demonstrate how the meaning of such lexical items is constituted in discourse and how the examination of these particular collocates reveals their sense-constructing function and their pragmatic-discursive force. 
I will provide a brief discussion of the methodology used for such analyses, and I will explain why the complex semantic-pragmatic and thematic-communicative patterns implied in sets of incompatibles should be given a stronger emphasis in lexicography.
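The collocation analysis mentioned above can be sketched as follows. The abstract does not state which association measure or window size was used, so pointwise mutual information (PMI), the function name `collocates` and the toy token list are all assumptions chosen for illustration, not the author's method.

```python
import math
from collections import Counter


def collocates(tokens, node, window=2):
    """Rank words co-occurring with `node` inside a +/- `window` span.

    Scores are pointwise mutual information estimated from raw counts;
    PMI is an assumed measure, not one named in the abstract.
    """
    total = len(tokens)
    freq = Counter(tokens)
    co = Counter()
    for i, tok in enumerate(tokens):
        if tok != node:
            continue
        lo = max(0, i - window)
        hi = min(total, i + window + 1)
        for j in range(lo, hi):
            if j != i:
                co[tokens[j]] += 1
    scores = {}
    for word, count in co.items():
        # PMI = log2( P(node, word) / (P(node) * P(word)) ),
        # with probabilities approximated by relative frequencies.
        scores[word] = math.log2(count * total / (freq[node] * freq[word]))
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy example built from the coordination patterns quoted in the abstract.
tokens = ("Globalisierung und Rationalisierung sowie "
          "Globalisierung und Modernisierung").split()
ranked = collocates(tokens, "Globalisierung", window=2)
```

On a real corpus one would normally apply a frequency threshold as well, since PMI tends to overrate rare words.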
We present an XML-based metadata standard for the documentation of speech and multimedia corpora that was developed at the Institute for German Language (IDS) in Mannheim, Germany. The IDS is one of the major institutions providing German speech and language corpora to researchers. These corpora stem from many different sources and were previously documented in a rather heterogeneous fashion using a variety of data models and formats. In order to unify the documentation for existing and future corpora, the IDS-internal Archive for Spoken German collaborated with several projects and developed a set of standardised XML metadata schemas. These XML schemas build on existing internal and external documentation schemas (such as IMDI) and take into account the workflow of speech corpus production. In order to minimise redundancy, separate schemas were designed for projects, speakers, recording sessions, and entire corpora. The resulting schemas are tested in ongoing speech and multimedia projects at the IDS and are regularly revised. They are accompanied by element definitions, guidelines, and examples. In addition, a mapping to IMDI will be provided.
Trubetzkoy's recognition of a delimitative function of phonology, serving to signal boundaries between morphological units, is expressed in terms of alignment constraints in Optimality Theory, where the relevant constraints require specific morphological boundaries to coincide with phonological structure (Trubetzkoy 1936, 1939, McCarthy & Prince 1993). The approach pursued in the present article is to investigate the distribution of phonological boundary signals to gain insight into the criteria underlying morphological analysis. The evidence from English and Swedish suggests that necessary and sufficient conditions for word-internal morphological analysis concern the recognizability of head constituents, which include the rightmost members of compounds and head affixes. The claim is that the stability of word-internal boundary effects in historical perspective cannot in general be sufficiently explained in terms of memorization and imitation of phonological word form. Rather, these effects indicate a morphological parsing mechanism based on the recognition of word-internal head constituents. Head affixes can be shown to contrast systematically with modifying affixes with respect to syntactic function, semantic content, and prosodic properties. That is, head affixes, which cannot be omitted, often lack inherent meaning and have relatively unmarked boundaries, which can be obscured entirely under specific phonological conditions. By contrast, modifying affixes, which can be omitted, consistently have inherent meaning and have stronger boundaries, which resist prosodic fusion in all phonological contexts. While these correlations are hardly specific to English and Swedish, it remains to be investigated to what extent they hold cross-linguistically. The observation that some of the constituents identified on the basis of prosodic evidence lack inherent meaning raises the issue of compositionality.
I will argue that certain systematic aspects of word meaning cannot be captured with reference to the syntagmatic level, but require reference to the paradigmatic level instead. The assumption is then that there are two dimensions of morphological analysis: syntagmatic analysis, which centers on the criteria for decomposing words in terms of labelled constituents, and paradigmatic analysis, which centers on the criteria for establishing relations among (whole) words in the mental lexicon. While meaning is intrinsically connected with paradigmatic analysis (e.g. base relations, oppositeness) it is not essential to syntagmatic analysis.