Corpus Linguistics
Empirical synchronic language studies generally seek to investigate language phenomena at one point in time, even though this point in time is often not stated explicitly. To date, surprisingly little research has addressed what this time-dependency of synchronic research implies for the composition and analysis of data suitable for such studies. Existing solutions and practices tend to be too general to meet the needs of all kinds of research questions. In this theoretical paper, aimed at both corpus creators and corpus users, we propose taking a decidedly synchronic perspective on the relevant language data. Such a perspective may be realised either in terms of sampling criteria or in terms of analytical methods applied to the data. As a general approach for both realisations, we introduce and explore the FReD strategy (Frequency Relevance Decay), which models the relevance of language events from a synchronic perspective. This general strategy represents a whole family of synchronic perspectives that may be customised to meet the requirements imposed by the specific research question and language domain under investigation.
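The abstract does not give a formula for FReD. Purely to illustrate what a "frequency relevance decay" could look like, the sketch below weights each dated language event by an exponential decay of its distance from a chosen synchronic reference point. The exponential form, the symmetric treatment of earlier and later events, the half-life parameter and all function names are assumptions made for this example, not details taken from the paper.

```python
from datetime import date, timedelta


def relevance(event_date, reference, half_life_days=3650.0):
    """Hypothetical relevance weight of one language event.

    Assumption: the weight is 1.0 at the reference date and halves
    every `half_life_days` of distance from it, in either direction.
    """
    distance = abs((reference - event_date).days)
    return 0.5 ** (distance / half_life_days)


def decayed_frequency(event_dates, reference, half_life_days=3650.0):
    """Sum of relevance weights: a frequency count seen from `reference`."""
    return sum(relevance(d, reference, half_life_days) for d in event_dates)


# Three occurrences of some word, viewed from a 2010 reference point.
reference = date(2010, 1, 1)
occurrences = [
    reference,                         # counts fully
    reference - timedelta(days=3650),  # one half-life away, counts half
    reference - timedelta(days=7300),  # two half-lives away, counts a quarter
]
weighted = decayed_frequency(occurrences, reference)
```

Choosing a different decay function or half-life would give a different member of the family of synchronic perspectives the abstract refers to; a raw frequency count corresponds to the limiting case of no decay at all.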
This paper examines lexical-semantic relations from an emergentist perspective against the background of a corpus-driven, empirical linguistic approach. It outlines how systematically recording and evaluating the co-occurrence behaviour of lexemes (analysing the similarity of co-occurrence profiles with the help of self-organising lexical feature maps, and interpreting the results in a way that is anchored in discourse) yields important insights into the structure of the diverse usage aspects of these lexemes, including their semantic proximity. Beyond its exploratory and analytical aims, the methodology presented is understood as an abductive generalisation strategy, aimed at theory building, within the postulated lexicon-syntax continuum. Finally, the paper discusses possible applications of some components of this methodology in lexicography, lexicology and language teaching.
This paper describes DeReKo (Deutsches Referenzkorpus), the Archive of General Reference Corpora of Contemporary Written German at the Institut für Deutsche Sprache (IDS) in Mannheim, and the rationale behind its development. We discuss its design, its legal background, how to access it, available metadata, linguistic annotation layers, underlying standards, ongoing developments, and aspects of using the archive for empirical linguistic research. The focus of the paper is on the advantages of DeReKo's design as a primordial sample from which virtual corpora can be drawn for the specific purposes of individual studies. Both concepts, primordial sample and virtual corpus, are explained and illustrated in detail. Furthermore, we describe in more detail how DeReKo deals with the fact that all its texts are subject to third parties' intellectual property rights, and how it deals with the issue of replicability, which is particularly challenging given DeReKo's dynamic growth and the possibility of constructing an open number of virtual corpora from it.
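The relationship between a primordial sample and a virtual corpus can be pictured as a metadata filter over a growing archive. The sketch below is only an illustration of that idea: the `Text` record, its fields, the `virtual_corpus` function and the sample data are all hypothetical and do not reflect DeReKo's actual metadata schema or access tools. It also assumes, for the sake of the example, that a virtual corpus is stored as a fixed list of text identifiers, which is one conceivable way to keep a study replicable while the archive keeps growing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Text:
    """Hypothetical metadata record for one text in the archive."""
    text_id: str
    year: int
    genre: str
    tokens: int


def virtual_corpus(archive, predicate):
    """Return the sorted IDs of all texts that satisfy `predicate`.

    Assumption: persisting this ID list, rather than the filter itself,
    pins the corpus down even after new texts enter the archive.
    """
    return sorted(t.text_id for t in archive if predicate(t))


# A toy stand-in for the primordial sample.
archive = [
    Text("A01", 1998, "newspaper", 1200),
    Text("A02", 2005, "fiction", 54000),
    Text("A03", 2006, "newspaper", 900),
]


def recent_news(t):
    return t.genre == "newspaper" and t.year >= 2000


study_ids = virtual_corpus(archive, recent_news)

# The archive grows; the stored ID list of the earlier study does not change,
# while re-running the same filter now yields a larger corpus.
archive.append(Text("A04", 2012, "newspaper", 1500))
rerun_ids = virtual_corpus(archive, recent_news)
```

Here `study_ids` still identifies exactly the texts the original study used, whereas `rerun_ids` reflects the archive's later state, which is the tension between dynamic growth and replicability that the abstract mentions.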