Light verb constructions (Funktionsverbgefüge) have long been a target of language criticism, which is now spreading to digital spaces as well. The claim advanced there is that light verb constructions and their corresponding base verbs are equivalent, and that the constructions can be replaced by their verbal counterparts in all contexts. The present corpus-based, text-linguistic study refutes this claim using the construction Frage stellen ('to pose a question') as an example. Drawing on extensive data from the Wikipedia article corpora of the IDS, I show the semantic, grammatical and text-linguistic differences in use between the base verb and the light verb construction, which manifest themselves in the enrichment, condensation, perspectivization, weighting and resumption of information in the text.
Contrastive analysis of climate-related neologisms registered in German and French Wikipedia
(2023)
Neologisms represent new social norms, tendencies, controversies and attitudes. They denote new or changed concepts which are constantly being negotiated between different members of the discourse community (Wodak 2022 and Catalano/Waugh (eds.) 2020). Neologisms help to identify new communicative patterns and narratives which illustrate different strands of discourse in everyday life. In recent years, many neologisms relating to the environment and climate have been emerging around the world, mainly due to dominant discussions on climate change and the movement “Fridays for Future”. In German, for example, neologisms such as Klimakleber, klimaresilient and globaler Streik and in French neologisms such as éco-anxiété, justice climatique and écocitoyen could be observed. These neologisms occur in many domains of life, for example in politics, media and also in advertising, which means that “l’importance croissante des enjeux environnementaux dans les discours politiques, médiatiques et publicitaires” (Balnat/Gérard 2022, p. 22) can be identified. However, it is not only the occurrence of environment- or climate-related topics that is increasing, but also the polarisation of the public debate. This polarisation is based on the fact that there are opposing positions which are represented by new or recently relevant terms such as activistes du climat (or Klimaaktivisten) and climatosceptiques (or Klimaskeptiker) (Balnat/Gérard 2022, p. 22). Due to different identifications with one or the other side, one can also speak of an “affrontement idéologique” (Balnat/Gérard 2022, p. 23). The explosive nature and the high complexity of the debate on climate and environmental issues mean that many words are naturally unfamiliar to people. This is especially true with regard to neologisms. In addition, it is often not only the new word itself but also the signified concept that is initially unknown.
When people then look up words, they often do so on the Internet. Wikipedia as a “free encyclopedia” (Wikipedia 2023) is particularly well suited as an object of study with regard to neologisms, since factual knowledge is given special attention there. Furthermore, this reference guide is perceived as a regular source of agreed and common knowledge on all sorts of subjects. Hence, the descriptions found here represent social agreement on controversial terms and discussions to some degree. In this paper, German and French neologisms from the subject area of climate and environment will be examined primarily in Wikipedia, but also in the neighbouring resource Wiktionary, which is “a collaborative project to produce a free-content multilingual dictionary” (Wiktionary 2023). Since Wikipedia and Wiktionary are available in French and in German, both are equally suitable for the contrastive analysis. Thus, Wikipedia articles which are accessible in both languages (e.g. Klimanotstand and État d'urgence climatique) or Wikipedia articles about similar events and phenomena (e.g. Letzte Generation and Dernière Rénovation) will be compared. In addition, we will take a closer look at other new terms that specify different thematic aspects of the discourse on climate and environment. We will mainly refer to those lexical items which can be found in the respective articles in both languages. Special emphasis will be on overlaps and differences, thematic foci, speakers’ positions and evaluative terms.
10th International Contrastive Linguistics Conference (ICLC)
This paper presents an extended annotation and analysis of interpretative reply relations focusing on a comparison of reply relation types and targets between conflictual pages and neutral pages of German Wikipedia (WP) talk pages. We briefly present the different categories identified for interpretative reply relations to analyze the relationship between WP postings as well as linguistic cues for each category. We investigate referencing strategies of WP authors in discussion page postings, illustrated by means of reply relation types and targets taking into account the degree of disagreement displayed on a WP talk page. We provide richly annotated data that can be used for further analyses such as the identification of interactional relations on higher levels, or for training tasks in machine learning algorithms.
Learning from students. On the design and usability of an e-dictionary of mathematical graph theory
(2022)
We created a prototype of an electronic dictionary for the mathematical domain of graph theory. We evaluate our prototype and compare its effectiveness in task-based tests with that of Wikipedia. Our dictionary is based on a corpus; the terms and their definitions were automatically extracted and annotated by experts (cf. Kruse/Heid 2020). The dictionary is bilingual, covering German and English; it gives equivalents, definitions and semantically related terms. For the implementation of the dictionary, we used LexO (Bellandi et al. 2017). The target group of the dictionary are students of mathematics who attend lectures in German and work with English resources. We carried out tests to understand which items the students search for when they work on graph-theoretical tasks. We ran the same test twice, with comparable student groups, either allowing Wikipedia as an information source or our dictionary. The dictionary seems to be especially helpful for students who already have a vague idea of a term because they can use the resource to check if their idea is right.
In this paper we investigate the coverage of the two knowledge sources WordNet and Wikipedia for the task of bridging resolution. We report on an annotation experiment which yielded pairs of bridging anaphors and their antecedents in spoken multi-party dialog. Manual inspection of the two knowledge sources showed that, with some interesting exceptions, Wikipedia is superior to WordNet when it comes to the coverage of information necessary to resolve the bridging anaphors in our data set. We further describe a simple procedure for the automatic extraction of the required knowledge from Wikipedia by means of an API, and discuss some of the implications of the procedure’s performance.
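The abstract does not say which API or which language edition of Wikipedia was queried, so the following is only a rough sketch of what such an extraction step might look like. It assumes the public MediaWiki Action API on English Wikipedia and its plain-text extract query. The helper names `build_extract_url` and `mentions` are hypothetical and are not taken from the paper.

```python
import re
from urllib.parse import urlencode

# Assumption: English Wikipedia's MediaWiki Action API endpoint.
API_ENDPOINT = "https://en.wikipedia.org/w/api.php"


def build_extract_url(title):
    """Build a query URL asking for the plain-text extract of one article."""
    params = {
        "action": "query",
        "prop": "extracts",
        "explaintext": 1,
        "titles": title,
        "format": "json",
    }
    return API_ENDPOINT + "?" + urlencode(params)


def mentions(article_text, term):
    """Return True if `term` occurs as a whole word in `article_text`.

    Illustrative heuristic only: a bridging pair such as (house, door)
    counts as covered when the antecedent's article mentions the anaphor.
    """
    pattern = r"\b" + re.escape(term) + r"\b"
    return re.search(pattern, article_text, flags=re.IGNORECASE) is not None
```

The URL would be fetched with any HTTP client and the returned JSON passed on to the coverage check; fetching is left out here so that the sketch runs offline.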
This paper introduces LRTwiki, an improved variant of the Likelihood Ratio Test (LRT). The central idea of LRTwiki is to employ a comprehensive domain specific knowledge source as additional “on-topic” data sets, and to modify the calculation of the LRT algorithm to take advantage of this new information. The knowledge source is created on the basis of Wikipedia articles. We evaluate on the two related tasks product feature extraction and keyphrase extraction, and find LRTwiki to yield a significant improvement over the original LRT in both tasks.
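The abstract gives neither the LRT formula nor the LRTwiki modification. As background only, here is a minimal sketch of a common binomial log-likelihood ratio statistic that compares how often a term occurs in on-topic and in off-topic text. The function name and the counts in the usage note are illustrative assumptions and do not reproduce the authors' implementation.

```python
import math


def _binomial_log_likelihood(k, n, p):
    """Log-likelihood of k successes in n trials (constant term omitted)."""
    result = 0.0
    if k > 0:
        result += k * math.log(p)
    if n - k > 0:
        result += (n - k) * math.log(1.0 - p)
    return result


def log_likelihood_ratio(k1, n1, k2, n2):
    """Return -2 log(lambda) for a term seen k1 times in n1 on-topic tokens
    and k2 times in n2 off-topic tokens (n1 and n2 must be positive).

    Null hypothesis: one shared rate p. Alternative: separate rates p1, p2.
    """
    p1 = k1 / n1
    p2 = k2 / n2
    p = (k1 + k2) / (n1 + n2)
    alternative = (_binomial_log_likelihood(k1, n1, p1)
                   + _binomial_log_likelihood(k2, n2, p2))
    null = (_binomial_log_likelihood(k1, n1, p)
            + _binomial_log_likelihood(k2, n2, p))
    return 2.0 * (alternative - null)
```

With this statistic, a term that is equally frequent in both collections scores about zero, while a term concentrated in the on-topic collection, for example `log_likelihood_ratio(50, 100, 5, 200)`, scores high. Enlarging the on-topic side with Wikipedia-derived text, as the abstract describes, would change the counts `k1` and `n1` in such a calculation.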
This paper addresses the challenge of creating a knowledge graph from a corpus of historical encyclopedias, with a special focus on word sense alignment (WSA) and disambiguation (WSD). More precisely, we examine WSA and WSD approaches based on article similarity to link messy historical data, utilizing Wikipedia as a ground-truth component, since the lack of a critical overlap in content, paired with the amount of variation between and within the encyclopedias, does not allow for choosing a “baseline” encyclopedia to align the others to. Additionally, we compare the disambiguation performance of conservative methods like the Lesk algorithm to more recent approaches, i.e. using language models to disambiguate senses.
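For readers unfamiliar with the Lesk algorithm mentioned above, the following minimal sketch shows the simplified variant: it picks the sense whose gloss shares the most words with the context. The tokenizer, the stopword list and the example glosses are illustrative assumptions, not the setup used in the paper.

```python
import re

# Assumption: a tiny illustrative stopword list.
STOPWORDS = frozenset({"a", "an", "and", "as", "at", "of", "that", "the"})


def _content_words(text):
    """Lowercase the text, split it into word tokens, drop stopwords."""
    return set(re.findall(r"\w+", text.lower())) - STOPWORDS


def simplified_lesk(context, senses):
    """Return the sense id whose gloss overlaps most with the context.

    `senses` maps a sense id to its gloss. On ties, the first sense wins.
    """
    context_words = _content_words(context)
    best_sense, best_overlap = None, -1
    for sense_id, gloss in senses.items():
        overlap = len(context_words & _content_words(gloss))
        if overlap > best_overlap:
            best_sense, best_overlap = sense_id, overlap
    return best_sense


senses = {
    "bank_river": "sloping land beside a body of water such as a river",
    "bank_finance": "financial institution that accepts deposits and lends money",
}
chosen = simplified_lesk("she deposited money at the bank", senses)
```

Here `chosen` is `"bank_finance"`, because only that gloss shares a content word ("money") with the context. In an encyclopedia setting, the "glosses" could be article texts or their opening sentences.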
Drawing on the Wikipedia corpora of the Leibniz-Institut für Deutsche Sprache, this contribution analyses morphosyntactic phenomena in a German-Italian comparison. Specifically, the case study focuses on confixes that are of Latin or Greek origin and were initially borrowed predominantly for medical terminology. By now, however, they are also used with altered semantics in general-language word formation: -phob- (German) and -fob- (Italian) as well as -man- (German) and -man- (Italian) occur in general-language word-formation products that show formal and functional equivalences in German and Italian. On the talk pages of the online encyclopedia, Wikipedia authors use terms such as Lösch(o)manie or cancellomania, which are to be read as disease metaphors, to metadiscursively regulate the behaviour of other authors in Wikipedia's collaborative text production.
This contribution deals with the interplay of text and interaction on the Internet. Section 2 uses Wikipedia to illustrate how text-oriented work on the articles and interaction-oriented discussion complement each other functionally. Section 3 examines links as digital aids to coherence building and shows in a case study how they are used in written discussions to keep relevant information present in the “virtual” focus of attention and to make it accessible for phoric and deictic reference. Section 4 discusses results from two comparative studies on the use of the connectors 'weil' as well as 'sprich' and 'd.h.' in Wikipedia articles and discussions, which were carried out on the basis of Wikipedia corpora in the IDS's DeReKo collection.