Refine
Document Type
- Article (4)
- Part of a Book (3)
Has Fulltext
- yes (7)
Keywords
- Diskurs (2)
- Diskursanalyse (2)
- Kommunikation (2)
- Korpus <Linguistik> (2)
- Adjektiv (1)
- Algorithmus (1)
- Analytischen Datenerschließung (1)
- Annotation (1)
- COVID-19 (1)
- Datenerhebung (1)
Publicationstate
Reviewstate
- (Verlags)-Lektorat (3)
- Peer-Review (1)
- Peer-review (1)
Gegenstand ist eine vergleichende empirische Korpusstudie zur Bedeutung des Ausdrucks geschäftsmäßig im (bundesdeutschen) Gemeinsprach- und juristischen Fachsprachgebrauch. Die Studie illustriert an einem aktuellen Fall strittiger Wortdeutung (hier zu § 217 StGB) die Möglichkeiten computergestützter Sprachgebrauchsanalyse für die Auslegung vor Gericht und die Normtextprognose in der Rechtsetzung.
The International Comparable Corpus (ICC) (Kirk/Čermáková 2017; Čermáková et al. 2021) is an open initiative which aims to improve the empirical basis for contrastive linguistics by compiling comparable corpora for many languages and making them as freely available as possible as well as providing tools with which they can easily be queried and analysed. In this contribution we present the first release of written language parts of the ICC which includes corpora for Chinese, Czech, English, German, Irish (partly), and Norwegian. Each of the released corpora contains 400k words distributed over 14 different text categories according to the ICC specifications. Our poster covers the design basics of the ICC, its TEI encoding, a demonstration of using the ICC via different query tools, and an outlook on future plans.
Similar to the European Reference Corpus EuReCo (Kupietz et al. 2020), ICC follows the approach of reusing existing linguistic resources wherever possible in order to cover as many languages as possible with realistic effort in as short a time as possible. In contrast to EuReCo, however, comparable corpus pairs are not defined dynamically in the usage phase, but the compositions of the corpora are fixed in the ICC design. The approaches are thus complementary in this respect. The design principles and composition of the ICC are based on those of the International Corpus of English (ICE) (Greenbaum (ed.) 1996), with the deviation that the ICC includes the additional text category blog post and excludes spoken legal texts (see Čermáková et al. 2021 for details). ICC’s fixed-design approach has the advantage that all single-language corpora in the ICC have the same composition with respect to the selected text types and that this guarantees that the selected broad spectrum of potential influencing variables for linguistic variation is always represented. The disadvantage, however, is that this can only be achieved for quite small corpora and that the generalisability of comparative findings based on the ICC corpora will often need to be checked on larger monolingual corpora or translation corpora (Čermáková/Ebeling/Oksefjell Ebeling forthcoming). Arguing that such issues with comparability and representativeness are inevitable, in one way or the other, and need to be dealt with, our poster will discuss and exemplify the text selections in more detail.
On the basis of a law text corpus which consists of judicial decisions and jurisprudential papers on so-called assisted suicide from 1977 to 2011, agonal centres are determined within the paradigm of corpus-based pragma-semiotic text analysis. Agonal centres are defined as action-guiding concepts that are in conflict with each other concerning the general acceptance of event interpretations, options for actions, claims of validity, contextual knowledge and values. These action-guiding concepts are derived with the help of quantitative and qualitative methods. Discourse linguistic interpretations are thus rendered more objective with the help of semi-automatic methods; furthermore, specific discourse features of the discourse and approaches to interpretation can be derived from (un)expected linguistic significances of occurrence, distribution, frequency etc. at the linguistic surface. Finally, these agonal centres specific to the language of law are compared to agonal centres which are determined on the basis of a media corpus on the same issue. This provides a comparative insight into the constitution of a seemingly identical fact in everyday and special language, which demonstrates the sociopolitical relevance of analysing the constitution of reality as instructed by language.
Fragen der Verdatung sind Bestandteil der digitalen Diskursanalyse und keine Vorarbeiten. Die Analyse digital(isiert)er Diskurse setzt im Unterschied zur Auswertung nicht-digital repräsentierter Sprache und Kommunikation notwendig technische Verfahren und Praktiken, Algorithmen und Software voraus, die den Untersuchungsgegenstand als digitales Datum konstituieren. Die nachfolgenden Abschnitte beschreiben kurz und knapp wiederkehrende Aspekte dieser Verdatungstechniken und -praktiken, insbesondere mit Blick auf Erhebung und Transformation (Abschnitt 2), Korpuskompilierung (Abschnitt 3), Annotation (Abschnitt 4) und Wege der analytischen Datenerschließung (Abschnitt 5). Im Fazit wird die Relevanz der Verdatungsarbeit für den Analyseprozess zusammengefasst (6).