Who understands Low German today and who can speak it? Who makes use of media and cultural events in Low German? What images do people in northern Germany associate with Low German and what is their view of their regional language?
These and further questions are answered in this brochure with the help of representative data collected in a telephone survey of a total of 1,632 people from eight federal states (Bremen, Hamburg, Lower Saxony, Mecklenburg-West Pomerania and Schleswig-Holstein as well as Brandenburg, North Rhine-Westphalia and Saxony-Anhalt).
Our paper describes an experiment that assesses lexical coverage in web corpora in comparison with traditional corpora for two closely related Slavic languages, from the lexicographers’ perspective. The preliminary results show that web corpora should not be considered “inferior”, but rather “different”.
With an increasing amount of text data available it is possible to automatically extract a variety of information about language. One way to obtain knowledge about subtle relations and analogies between words is to observe words which are used in the same context. Recently, Mikolov et al. proposed a method to efficiently compute Euclidean word representations which seem to capture subtle relations and analogies between words in the English language. We demonstrate that this method also captures analogies in the German language. Furthermore, we show that we can transfer information extracted from large non-annotated corpora into small annotated corpora, which are then, in turn, used for training NLP systems.
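The analogy test mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the vectors below are invented three-dimensional toy values rather than representations learned from a corpus, and the helper names `cosine` and `analogy` are hypothetical. The sketch only shows the usual vector-offset idea, in which "a is to b as c is to d" is solved by finding the word closest to b - a + c.

```python
import math

# Toy 3-dimensional vectors, invented for illustration only. Real word
# representations are learned from large corpora and have many more dimensions.
VECTORS = {
    "Mann":    [1.0, 0.0, 0.0],
    "Frau":    [0.0, 1.0, 0.0],
    "König":   [1.0, 0.0, 1.0],
    "Königin": [0.0, 1.0, 1.0],
    "Apfel":   [0.2, 0.2, -1.0],
}


def cosine(u, v):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def analogy(a, b, c, vectors=VECTORS):
    """Return the word d that best completes 'a is to b as c is to d'."""
    # Vector-offset method: the target point is b - a + c.
    target = [vb - va + vc
              for va, vb, vc in zip(vectors[a], vectors[b], vectors[c])]
    # The three query words themselves are excluded from the candidates.
    candidates = (w for w in vectors if w not in {a, b, c})
    return max(candidates, key=lambda w: cosine(target, vectors[w]))


print(analogy("Mann", "König", "Frau"))
```

With these toy values the query "Mann is to König as Frau is to ?" selects "Königin", because the offset vector coincides with that word's vector. With learned representations the match is only approximate.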
Many (modernist) works of literature can be understood through their associativeness, be it constructed or “free”. This network-like character of (modernist) literature has often been addressed by terms like “free association”, “connotation”, “context” or “intertext”. This paper proposes an experimental and exemplary approach to intraconnect a literary corpus of the Austrian writer Ilse Aichinger with Semantic Web technologies to enable interactive explorations of word associations.
The IMS Open Corpus Workbench (CWB) software currently uses a simple tabular data model with proven limitations. We outline and justify the need for a new data model to underlie the next major version of CWB. This data model, dubbed Ziggurat, defines a series of types of data layer to represent different structures and relations within an annotated corpus; each such layer may contain variables of different types. Ziggurat will allow us to gradually extend and enhance CWB’s existing CQP-syntax for corpus queries, and also make possible more radical departures relative not only to the current version of CWB but also to other contemporary corpus-analysis software.
One of the specific historical and cultural characteristics of Russian political discourse is its orientation towards precedents. It is considered correct to follow the behavioural models set by one of the “heroes” (Peter I, Lenin, Stalin, etc.), to reproduce standard texts, and to compare present situations with past ones (the Time of Troubles, the Weimar Republic, the NEP “New Economic Policy” (1921-1928), etc.). One of the peculiarities of the present time in Russia is the deep conflict between different social groups oriented towards different precedents. Each group has its own variant of the national myth and uses the same linguistic means to actualise it. Therefore, it is very important to analyse changes in the national cognitive foundation. Precedential phenomena are the central components of this foundation.