Language, Linguistics
Sprachbewertung – Wozu? [Language evaluation: what for?]
(1995)
This contribution works out the function of language evaluations for individual speakers as well as for speech communities. Two essential functions of language evaluations are distinguished: at level I, they foster the development of a communicative competence that meets the demands of communication ethics; at level II, they preserve and shape a "communicative milieu" that makes ethically desirable forms of this competence possible. Certain demands on the "communicative milieu" are grounded in a human-ecological communicative ethics and related to phenomena of language development that are caused by changes in the conditions of communication and in the "media environment" of the speech community.
Inducing linguistic networks from historical corpora. Towards a new method in historical semantics
(2013)
In this paper, we explore linguistic networks as a new method in historical semantics. Our starting point is a long-term historical corpus (the Patrologia Latina), which we analyse with regard to the conceptual stability of a key concept in medieval literature (virtus). Most analyses in historical semantics explore small data sets by focusing on narrow contexts of lexical usage, but we propose a more comprehensive method based on lexical networks that represent the underlying documents as a whole. We demonstrate both the topological stability of document-based lexical networks and their usefulness in providing empirical evidence in historical semantics.
We present LatinISE, a Latin corpus for the Sketch Engine. LatinISE consists of Latin works comprising a total of 13 million words, covering the time span from the 2nd century BC to the 21st century AD. LatinISE is provided with rich metadata mark-up, including author, title, genre, era, date and century, as well as book, section, paragraph and line of verse. We have automatically annotated LatinISE with lemma and part-of-speech information, enabling users to search the corpus by a number of criteria, ranging from lemma, part of speech and context to subcorpora defined chronologically or by genre. We also illustrate word sketches, one-page summaries of a word's corpus-based collocational behaviour. Our future plan is to produce word sketches for Latin words by adding richer morphological and syntactic annotation to the corpus.
Numerous high-quality primary textual resources (in the context of this paper, full-text transcriptions and corresponding image scans of German texts originating from the 15th to the 19th century) are scattered across the web or stored remotely on institutional or private servers. They are often filed on degrading recording media and encoded in out-of-date or inflexible storage formats. Textual resources are often accompanied by scarce, insufficient or inaccurate bibliographic information, which is one further reason why valuable resources, even if available on the web, remain undiscovered. Additionally, idiosyncratic, project-specific markup conventions often hinder further usage and analysis of the data. Because of these and other problems, a great number of these transcriptions of historical sources can hardly be found, let alone accessed by third parties, and are of little use to the wider research community. This situation is unsatisfactory from the perspective of a (corpus-)linguistic project like the one described here, but also from the perspective of any text-based research in the humanities and social sciences. Integrating as many of these 'dispersed' high-quality primary textual resources as possible into an encompassing repository, such as the sustainable, web- and centres-based research infrastructure of CLARIN-D, is an important step and a necessary prerequisite for solving this problem. This paper summarizes the work of an 18-month project funded by the German Federal Ministry of Education and Research (BMBF) which dealt with the curation and integration of historical text resources from the 15th to the 19th century into the CLARIN-D infrastructure.
This paper aims to point out a linguistic phenomenon that, given the current state of research, can be analysed only insufficiently with the help of an electronic text corpus. In this way, the paper adds a new aspect to the discussion about historical corpora by tackling the question of how they should be designed in order to be useful for linguistic research on so-called formulaic patterns. The novelty of the question becomes apparent considering that at present such historical corpora do not exist. In section 1, we define the term formulaic pattern, because a clear understanding of this phenomenon is a prerequisite for collaborative research on it by historians of language and by corpus and computational linguists. Section 2 gives a brief outline of the state of the art in the field of modern formulaic language within the framework of corpus and computational linguistics. Section 3 shows that some well-known problems in this area are exacerbated when applied to historical texts. Section 4 presents a possible solution that has been implemented by the HiFoS Research Group at the University of Trier (Germany). Joint research efforts planned with UKP Lab at the TU Darmstadt (section 5) demonstrate that the restrictions posed by historical formulaic patterns are challenges to be overcome, rather than insurmountable obstacles.
We describe two spoken corpora, one with native speakers of German (BeDiaCo, 36 participants) and one with native speakers and learners of German (CoNNAR, 56 participants). Both corpora contain read word lists and spontaneous spoken dialogues by the same speakers in different situations (free conversation, task-based dialogue). The recordings took place partly face to face and partly via a video conferencing tool. Both corpora arose from specific research questions and can be reused for linguistic research. Two case studies, on articulation rate (BeDiaCo) and filler particles (CoNNAR), give an illustrative insight into possible research questions.
This paper presents a collection of 16th-century letters from and to the Zurich reformer Heinrich Bullinger. About 12,000 letters of Bullinger's correspondence survive; roughly a quarter of them are written in Early New High German and come from more than 300 people. In the ongoing project "Bullinger Digital", the existing knowledge sources are being compiled and digitally prepared, and further information is being made accessible. Dedicated methods for language identification and normalization have already been developed and are briefly presented in this paper. Language identification reliably detects all Early New High German sentences in the correspondence, and normalizing the Early New High German word forms increases the usability of the corpus. The correspondence is searchable online, and its storage in TEI-conformant XML enables its reuse.
In this paper we introduce the task of aligning parallel historical texts in order to create synopses for comparing similarities and deviations between them. We present guidelines for manually annotating corresponding words and phrases. A test annotation reveals considerably high inter-annotator agreement, ranging from kappa = 0.76 to 0.98, depending on the specific text. In an application scenario we show a typical use case for which token and phrase alignments are of value.
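To make the reported agreement figures easier to read, the following is a minimal sketch of how a kappa value can be computed for two annotators. It assumes Cohen's kappa, since the abstract does not state which kappa variant was used; the function name `cohen_kappa`, the labels `aligned` and `unaligned`, and the toy annotations are all hypothetical and are not taken from the paper.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labelled the same items.

    Assumption: two annotators and one categorical label per item.
    The abstract does not say which kappa variant the authors used.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both annotators must label the same non-empty item list")

    n = len(labels_a)

    # Observed agreement: share of items on which both annotators agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: product of the annotators' marginal label frequencies.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    categories = set(freq_a) | set(freq_b)
    expected = sum(freq_a[c] * freq_b[c] for c in categories) / (n * n)

    if expected == 1.0:
        # Both annotators used one identical label throughout.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical toy data: per-token decisions of two annotators.
annotator_a = ["aligned", "aligned", "unaligned", "aligned",
               "unaligned", "unaligned", "aligned", "aligned"]
annotator_b = ["aligned", "aligned", "unaligned", "unaligned",
               "unaligned", "unaligned", "aligned", "aligned"]

kappa = cohen_kappa(annotator_a, annotator_b)
print(f"kappa = {kappa:.2f}")
```

In this toy case the annotators agree on 7 of 8 tokens (observed agreement 0.875) while chance agreement is 0.5, so kappa is 0.75. Kappa is lower than raw agreement because it discounts the agreement expected by chance.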
The aim of this paper is to introduce university archives as valuable sources for document-centric historical research. This comprises the history of science as well as the history of society. Taking Leipzig as an example of a university city with an outstanding wealth of archived academic material, we want to stress the great, and in many cases not yet digitally explored, potential of such sources.
We then focus on a collection of annual administrative speeches called "Rektoratsreden" that span over 60 important years of Leipzig's university life. We discuss some of the possibilities for content analysis using methods of Natural Language Processing (NLP). The focus lies on facilitating access to larger corpora. We present a minimalist process chain for a distant-reading, explorative approach to the Rektoratsreden corpus. For more general considerations, we also highlight some of the digitization efforts that took place in Leipzig and reflect on how archive material as well as archival workflows can benefit from research infrastructure and vice versa.
This study shows that historical corpora can reveal the selective effect of register for favouring or disfavouring linguistic innovations. Four innovations in the history of French are considered, and show contrasting patterns as regards the adoption of new forms or meanings. Two are favoured in the spoken and two in the written register corpus. The corpora are both of Anglo-Norman, for which the existence of a corpus of courtroom dialogues allows a probably unique opportunity to study a representation of authentic spoken register use in the medieval period, in addition to a corpus of written-origin data belonging to the same discourse domain.