OPUS 4 | Search

The International Comparable Corpus: Challenges in building multilingual spoken and written comparable corpora (2021)

Čermáková, Anna ; Jantunen, Jarmo ; Jauhiainen, Tommi ; Kirk, John ; Křen, Michal ; Kupietz, Marc ; Uí Dhonnchadha, Elaine

This paper reports on the efforts of twelve national teams in building the International Comparable Corpus (ICC; https://korpus.cz/icc), which will contain highly comparable datasets of spoken, written and electronic registers. The languages currently covered are Czech, Finnish, French, German, Irish, Italian, Norwegian, Polish, Slovak, Swedish and, more recently, Chinese, as well as English, which is considered to be the pivot language. The goal of the project is to provide much-needed data for contrastive corpus-based linguistics. The ICC is committed to the idea of re-using existing multilingual resources as much as possible, and its design is modelled, with various adjustments, on the International Corpus of English (ICE). As such, ICC will contain approximately the same balance of 40 percent written language and 60 percent spoken language, distributed across 27 different text types and contexts. A number of issues encountered by the project teams are discussed, ranging from copyright and data sustainability to technical advances in data distribution.

Rule talk: Instructing proper play with impersonal deontic statements (2021)

Zinken, Jörg ; Kaiser, Julia ; Weidner, Matylda ; Mondada, Lorenza ; Rossi, Giovanni ; Sorjonen, Marja-Leena

The present paper explores how rules are enforced and talked about in everyday life. Drawing on a corpus of board game recordings across European languages, we identify a sequential and praxeological context for rule talk. After a game rule is breached, a participant enforces proper play and then formulates a rule with an impersonal deontic statement (e.g. “It’s not allowed to do this”). Impersonal deontic statements express what may or may not be done without tying the obligation to a particular individual. Our analysis shows that such statements are used as part of multi-unit and multi-modal turns where rule talk is accomplished through both grammatical and embodied means. Impersonal deontic statements serve multiple interactional goals: they account for having changed another’s behavior in the moment and at the same time impart knowledge for the future. We refer to this complex action as an “instruction.” The results of this study advance our understanding of rules and rule-following in everyday life, and of how resources of language and the body are combined to enforce and formulate rules.

Anglizismen in der Coronakrise (2021)

Zifonun, Gisela

Eine Linguistin denkt nach über den Genderstern (2021)

Zifonun, Gisela

Das Deutsche als europäische Sprache: Ein Porträt (2021)

Zifonun, Gisela

German is one of the best-researched languages in the world; less well known is which features it shares with its neighbouring European languages and where its peculiarities lie. The book's eight chapters present the vocabulary and grammar of German concisely and with vivid examples. A comparison with the options available in, for example, English, French, Polish, Hungarian or other European languages sharpens the view. The starting point is a brief outline of the facets of language in general, together with a derivation of the basic functions of language from an action-oriented perspective. The subsequent chapters carry mottos such as “The verb: tenses, moods, scenarios and stagings”, “The nominal domain: the many ways of constructing objects” or “The text: when we become coherent and, in doing so, narrative or argumentative”. The final chapter is titled “German: towards a language portrait”. The book aims to give readers interested in language, even those without specialist linguistic knowledge, a new approach to our mother tongue and to strengthen awareness of the linguistic ties across our continent despite all its diversity.
- Grammar made vivid and concrete
- An innovative view of German among the European languages
- An entertaining introduction for readers interested in language, even without specialist linguistic knowledge

Vorwort (2021)

Ziegler, Evelyn ; Marten, Heiko F.

Linguistic Landscapes in deutschsprachigen Kontexten (2021)

Ziegler, Evelyn ; Marten, Heiko F.

This chapter starts out by giving a brief overview of the main priorities of international and German studies in the area of linguistic landscape research. The contributions to this volume are then embedded in current debates and developments in the field. Finally, we outline important desiderata of linguistic landscape research that focus on German and address challenges of knowledge transfer and application as well as possible contributions to international lines of research.

Einleitung (2021)

Wöllstein, Angelika

This second volume presents four new “building blocks” of a corpus-linguistically grounded grammar of German. They cover the areas of determination, syntactic functions of the noun phrase, and attribution. The analysed language data and in-depth supplementary studies are also made available to the specialist audience.

Toilettenpapier im April, Mutationen im Dezember: Einflüsse der Corona-Pandemie auf die deutsche Sprache (2021)

Wolfer, Sascha

On 24 February 2020, the first infection with the coronavirus was confirmed in Switzerland. At that point, probably no one could have guessed what far-reaching consequences the corona pandemic would have for society. From today's perspective, it no longer surprises us that the pandemic has had, and still has, strong effects on language, since language use always adapts to societal change. At the Leibniz-Institut für Deutsche Sprache in Mannheim, we document and study the unusually strong and rapid effects of the pandemic on the German language and summarise our findings in, among other things, numerous articles.

Implicitly abusive language – What does it actually look like and why are we not getting there? (2021)

Wiegand, Michael ; Ruppenhofer, Josef ; Eder, Elisabeth

Abusive language detection is an emerging field in natural language processing which has received a large amount of attention recently. Still, the success of automatic detection is limited. In particular, the detection of implicitly abusive language, i.e. abusive language that is not conveyed by abusive words (e.g. dumbass or scum), is not working well. In this position paper, we explain why existing datasets make learning implicit abuse difficult and what needs to be changed in the design of such datasets. Arguing for a divide-and-conquer strategy, we present a list of subtypes of implicitly abusive language and formulate research tasks and questions for future research.

Exploiting emojis for abusive language detection (2021)

Wiegand, Michael ; Ruppenhofer, Josef

We propose to use abusive emojis, such as the “middle finger” or “face vomiting”, as a proxy for learning a lexicon of abusive words. Since it represents extralinguistic information, a single emoji can co-occur with different forms of explicitly abusive utterances. We show that our approach generates a lexicon that offers the same performance in cross-domain classification of abusive microposts as the most advanced lexicon induction method. That method, in contrast, depends on manually annotated seed words and expensive lexical resources for bootstrapping (e.g. WordNet). We demonstrate that the same emojis can also be effectively used in languages other than English. Finally, we also show that emojis can be exploited for classifying mentions of ambiguous words, such as “fuck” and “bitch”, into generally abusive and just profane usages.

Implicitly abusive comparisons – a new dataset and linguistic analysis (2021)

Wiegand, Michael ; Geulig, Maja ; Ruppenhofer, Josef

We examine the task of detecting implicitly abusive comparisons (e.g. “Your hair looks like you have been electrocuted”). Implicitly abusive comparisons are abusive comparisons in which abusive words (e.g. “dumbass” or “scum”) are absent. We detail the process of creating a novel dataset for this task via crowdsourcing that includes several measures to obtain a sufficiently representative and unbiased set of comparisons. We also present classification experiments that include a range of linguistic features that help us better understand the mechanisms underlying abusive comparisons.

CLARIAH-DE in der digitalen Lehre (2021)

Werthmann, Antonina ; Witt, Andreas ; Bock, Sina ; Jannidis, Fotis

The switch from in-person teaching to digital teaching and learning formats caused by the Covid-19 pandemic posed a challenge for teachers and students alike. Within a very short time, the use of platforms and digital tools had to be learned and tested. This contribution presents examples of services and tools from CLARIAH-DE and explains how the digital research infrastructure can support teachers and students in digital teaching as well.

Eigen- und Fremdcharakterisierung literarischer Figuren untersucht mit Sentimentanalyse (2021)

Weimer, Lukas ; Brunner, Annelen

Presentation of initial findings on the self-characterisation and other-characterisation of literary characters using sentiment analysis, given at the vDHd 2021 conference.

Determination in der Nominalphrase – ein Überblick (2021)

Weber, Thilo

This chapter gives an overview of the inventory of expressions that are counted as determiners or are at least considered candidates for this category. It examines their grammatical properties and tests their determiner status against relevant morphosyntactic criteria.

Syntaktische Funktionen von Nominalphrasen und Funktionen der Kasus (2021)

Weber, Thilo

This chapter examines the syntactic functions of full (non-pronominal) noun phrases (NPs) and the functions of the four German cases from a quantitative perspective. It proposes decomposing the concept of syntactic function into more basic features. These include the type of the element to which the NP is subordinate and the kind of relation between the NP and the superordinate element (broadly: complementation vs. modification).

Datensatz Nominalphrasen (2021)

Weber, Thilo

The Nominalphrasen dataset contains attestations of non-pronominal (i.e. full, lexical) noun phrases (NPs) headed by a noun or a nominalisation. Each attestation is annotated for a number of linguistically relevant features. In total, the dataset contains 8,137 attestations. After erroneous hits are sorted out (see the columns “valide” and “nicht-valide_Begründung”), 7,813 relevant attestations remain. The query was run on the head noun; for details on data collection, see Weber (2021a). The head noun appears in the column “Kopf_der_NP”. In some cases the NP consists only of the head noun, but in most cases it goes beyond it and extends over part of the preceding context (column “Satzkontext_vor_Beleg”) and/or the following context (“Satzkontext_nach_Beleg”). The dataset serves the study of the syntactic functions of NPs (Weber 2021a) and of determination in the NP (Weber 2021b).
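As a rough illustration of how the columns named in this description could be used, the following Python sketch filters out invalid attestations and reassembles the sentence context around the head noun. Only the column names come from the description; the semicolon-separated layout, the value "ja" in the "valide" column, the helper names and the three sample rows are all assumptions made for the example and may not match the actual file.

```python
import csv
import io

# Hypothetical three-row sample. Only the column names are taken from the
# dataset description; the delimiter, the values and the rows are invented.
SAMPLE = """Kopf_der_NP;Satzkontext_vor_Beleg;Satzkontext_nach_Beleg;valide;nicht-valide_Begründung
Haus;das alte;am See;ja;
Laufen;beim;;nein;kein nominaler Kopf
Frage;eine offene;;ja;
"""


def load_valid(text, valid_value="ja"):
    """Return the rows whose 'valide' column equals valid_value.

    The coding of the 'valide' column ("ja") is an assumption.
    """
    reader = csv.DictReader(io.StringIO(text), delimiter=";")
    return [row for row in reader if row["valide"].strip() == valid_value]


def sentence_context(row):
    """Join preceding context, head noun and following context into one string."""
    parts = [
        row["Satzkontext_vor_Beleg"],
        row["Kopf_der_NP"],
        row["Satzkontext_nach_Beleg"],
    ]
    return " ".join(p.strip() for p in parts if p and p.strip())


valid_rows = load_valid(SAMPLE)
contexts = [sentence_context(row) for row in valid_rows]
```

Applied to the real data under the same assumptions, a filter of this kind would correspond to the reduction from 8,137 attestations to the 7,813 relevant ones mentioned in the description.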

Im Übrigen habe ich schon länger kein Müsli mehr gegessen. Weil: Ich frühstücke in der Regel nie – Verbstellung nach weil, obwohl, wobei und anderen subordinierenden Konnektoren (Aus: Grammatik in Fragen und Antworten) (2021)

Waßner, Ulrich

CLARIAH-DE work package 5 - community engagement: outreach/dissemination and liaison (2021)

Walker, Nathalie ; Werthmann, Antonina ; Trippel, Thorsten ; Buddenbohm, Stefan ; Weimer, Lukas ; Friedrichs, Sonja

This poster summarizes the results of the CLARIAH-DE Work Package 5 - Community Engagement: Outreach/Dissemination and Liaison. Work package 5 engages with the community through dissemination activities, outreach and liaison. The work package set itself the following sub goals:
- Combining the existing dissemination and outreach activities of CLARIN-D and DARIAH-DE in a meaningful way and elaborating on them. In some cases this meant continuity, in other cases a new appearance for resources.
- Providing a web portal as a gateway to the CLARIAH-DE project.
- Creating a common identity and corporate identity and maintaining the established level of trust users already put into CLARIN-D and DARIAH-DE.
- Providing a social media presence as well as a physical presence at workshops, conferences and other meetings in the Digital Humanities.

Validating the Performativity Hypothesis to Neg-Raising using corpus data: Evidence from Polish (2021)

Trawiński, Beata

205 search hits