410 Linguistics
The compilation of terminological vocabularies plays a central role in the organization and retrieval of scientific texts. Both simple keyword lists and sophisticated models of the relationships between terminological concepts can contribute substantially to the analysis, classification, and retrieval of relevant digital documents, whether on the Web or within local repositories. This seems especially true for long-established scientific fields with various theoretical and historical branches, such as linguistics, where the use of terminology across documents of different origins is often inconsistent. In this short paper, we report on the early stages of a project that aims to redesign grammis, an existing domain-specific knowledge organization system (KOS) for grammatical content. In particular, we deal with the terminological part of grammis and present the current state of this online resource as well as the key redesign principles. Further, we raise questions about the ramifications of the Linked Open Data and Semantic Web approaches for our redesign decisions.
In this paper, we describe preliminary results from an ongoing experiment in which we classify two large unstructured text corpora (a web corpus and a newspaper corpus) by topic domain (or subject area). Our primary goal is to develop a method that allows for the reliable annotation of large crawled web corpora with the metadata required by many corpus linguists. We are especially interested in designing an annotation scheme whose categories are both intuitively interpretable by linguists and firmly rooted in the distribution of lexical material in the documents. Since we use data from a web corpus and a more traditional corpus, we also contribute to the important field of corpus comparison and corpus evaluation. Technically, we use (unsupervised) topic modeling to automatically induce topic distributions over gold standard corpora that were manually annotated for 13 coarse-grained topic domains. In a second step, we apply supervised machine learning to learn the manually annotated topic domains, using the previously induced topics as features. We achieve around 70% accuracy in 10-fold cross-validation. An analysis of the errors clearly indicates, however, that a revised classification scheme and larger gold standard corpora will likely lead to a substantial increase in accuracy.
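The two-step setup described in this abstract (unsupervised topic induction, then supervised learning on the induced topics, evaluated by 10-fold cross-validation) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy documents, the two domain labels, the function names `lda_gibbs`, `nearest_centroid_predict`, and `cross_val_accuracy`, the hyperparameters, and the choice of a collapsed Gibbs sampler for LDA combined with a nearest-centroid classifier. The abstract does not say which topic model implementation or which classifier the authors used.

```python
import random
from collections import Counter


def lda_gibbs(docs, n_topics, n_iter=50, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA; returns per-document topic proportions."""
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * n_topics for _ in docs]        # tokens per (doc, topic)
    topic_word = [Counter() for _ in range(n_topics)]  # tokens per (topic, word)
    topic_total = [0] * n_topics                       # tokens per topic
    assign = []
    # Random initial topic assignment for every token.
    for d, doc in enumerate(docs):
        row = []
        for w in doc:
            k = rng.randrange(n_topics)
            row.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assign.append(row)
    # Resample each token's topic conditioned on all other assignments.
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k = assign[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assign[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1
    # Smoothed topic proportions per document; each row sums to 1.
    return [
        [(c + alpha) / (len(doc) + n_topics * alpha) for c in counts]
        for counts, doc in zip(doc_topic, docs)
    ]


def nearest_centroid_predict(train_x, train_y, test_x):
    """Assign each test vector the label of the closest class centroid."""
    counts = Counter(train_y)
    sums = {}
    for x, y in zip(train_x, train_y):
        acc = sums.setdefault(y, [0.0] * len(x))
        for j, v in enumerate(x):
            acc[j] += v
    centroids = {y: [v / counts[y] for v in s] for y, s in sums.items()}

    def sq_dist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))

    return [min(centroids, key=lambda y: sq_dist(x, centroids[y])) for x in test_x]


def cross_val_accuracy(features, labels, k=10, seed=0):
    """k-fold cross-validation accuracy of the nearest-centroid classifier."""
    idx = list(range(len(features)))
    random.Random(seed).shuffle(idx)
    correct = 0
    for fold in (idx[i::k] for i in range(k)):
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        preds = nearest_centroid_predict(
            [features[i] for i in train],
            [labels[i] for i in train],
            [features[i] for i in fold],
        )
        correct += sum(p == labels[i] for p, i in zip(preds, fold))
    return correct / len(features)


# Hypothetical toy corpus: 20 short "documents" from two made-up domains.
rng = random.Random(1)
sport = ["match", "team", "goal", "coach", "league"]
politics = ["vote", "party", "law", "minister", "election"]
docs, labels = [], []
for n in range(20):
    words, label = (sport, "sport") if n % 2 == 0 else (politics, "politics")
    docs.append([rng.choice(words) for _ in range(30)])
    labels.append(label)

theta = lda_gibbs(docs, n_topics=2)           # step 1: topics, labels unused
accuracy = cross_val_accuracy(theta, labels)  # step 2: supervised, 10-fold CV
print(f"10-fold accuracy on toy data: {accuracy:.2f}")
```

In this sketch the topics are induced once over the whole corpus without access to the labels, and only the classifier is cross-validated. A real experiment would use far larger corpora, more topics, and a stronger classifier.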
This paper introduces the recently started DRuKoLA project, which aims to provide mechanisms for flexibly drawing virtual comparable corpora from the German Reference Corpus DeReKo and the Reference Corpus of Contemporary Romanian Language CoRoLa, so that these virtual corpora can serve as an empirical basis for contrastive linguistic research.
Linguistische Zugänge zu Konflikten in europäischen Sprachräumen. Korpus - Pragmatik - kontrovers
(2016)
Conflicts accompany social life in our society: from the garden fence to the political arena, from everyday life to questions of transnational juridification in the European Union, we encounter disputes everywhere, every day. Conflict and language are closely intertwined. On the one hand, language itself is negotiated in language; on the other, language is the medium of quarrelling and reconciliation par excellence. Conflicts are conveyed above all through language; that is, conflicts over language(s) mirror sociocultural struggles over knowledge and power.
The volume offers a comprehensive insight into the controversial discussion and further development of current linguistic research on conflicts. Especially in times of social crisis, linguistic approaches can help to analyze conflicts as social-symbolic patterns of action and to describe their communicative contexts.
The article ties in with the discussion on the use of formal grammatical categories in language comparison (cf. in particular Haspelmath 2007, 2010a, b and Newmeyer 2007, 2010). It does not ask whether cross-linguistic grammatical categories (or, more precisely, category values) exist, or whether language-specific grammatical categories can be usefully employed in language comparison, but rather how similar or different language-specific categories and categorizations are. The aim is thus to present a method for measuring the degree of equivalence of grammatical categories in different languages; this is illustrated using the IMPERATIVE in German, English, Polish, and Czech.
Based on a corpus of legal texts comprising judicial decisions and jurisprudential papers on so-called assisted suicide from 1977 to 2011, agonal centres are determined within the paradigm of corpus-based pragma-semiotic text analysis. Agonal centres are defined as action-guiding concepts that conflict with each other concerning the general acceptance of event interpretations, options for action, claims of validity, contextual knowledge, and values. These action-guiding concepts are derived with the help of quantitative and qualitative methods. Discourse-linguistic interpretations are thus rendered more objective by semi-automatic methods; furthermore, specific features of the discourse and approaches to its interpretation can be derived from (un)expected significant patterns of occurrence, distribution, frequency, etc. at the linguistic surface. Finally, these agonal centres specific to the language of law are compared with agonal centres determined on the basis of a media corpus on the same issue. This provides a comparative insight into how a seemingly identical fact is constituted in everyday and in specialized language, which demonstrates the sociopolitical relevance of analysing how language shapes the constitution of reality.