This article describes the corpus Deutsch in Namibia (DNam), which is freely accessible via the Datenbank für Gesprochenes Deutsch (DGD). The corpus is a new digital resource that comprehensively and systematically documents the language use of the German-speaking minority in Namibia, together with the associated language attitudes. We describe the data collection and the methods applied (free conversations, "language situations", semi-structured interviews), the data preparation including transcription, normalization and tagging, the properties of the available corpus (size, available metadata, etc.), and some basic functionalities within the DGD. Initial research results obtained with the new resource illustrate the versatility of the corpus for research questions in contact linguistics, variationist linguistics and sociolinguistics.
High word frequency and neighborhood density contribute to the accuracy and speed of word production in English-speaking adults (e.g., Vitevitch & Sommers 2003), and characterize early words in child English (e.g., Storkel 2004). The present study investigated a speech corpus of child German (ages 2;00-3;00) to further the understanding of the influence of frequency and density on production. Results for four children suggest that, unlike in English, words produced early are not from denser neighborhoods in an adult lexicon than later words. As in English, frequent words are produced before less frequent words. Implications for theory and methodology are discussed.
This paper presents EXMARaLDA, a system for the computer-assisted creation and analysis of spoken language corpora. The first part contains some general observations about technological and methodological requirements for doing corpus-based pragmatics. The second part explains the system's architecture and gives an overview of its most important software components: a transcription editor, a corpus management tool and a corpus query tool. The last part presents some corpora which have been or are currently being compiled with the help of EXMARaLDA.
These guidelines are an extension of the STTS (Schiller et al. 1999) for the annotation of transcripts of spoken language. The tagset is based on the annotation of the FOLK corpus of the IDS Mannheim (Schmidt 2014) and extends the STTS with regard to phenomena typical of spoken language and to peculiarities of their transcription. It was developed as part of the dissertation project "POS für(s) FOLK – Entwicklung eines automatisierten Part-of-Speech-Tagging von spontansprachlichen Daten" (Westpfahl 2017, in preparation).
In this paper, we present a GOLD standard of part-of-speech tagged transcripts of spoken German. The GOLD standard data consists of four annotation layers – transcription (modified orthography), normalization (standard orthography), lemmatization and POS tags – all of which have undergone careful manual quality control. It comes with guidelines for the manual POS annotation of transcripts of German spoken data and an extended version of the STTS (Stuttgart Tübingen Tagset) which accounts for phenomena typically found in spontaneous spoken German. The GOLD standard was developed on the basis of the Research and Teaching Corpus of Spoken German, FOLK, and is, to our knowledge, the first such dataset based on a wide variety of spontaneous and authentic interaction types. It can be used as a basis for further development of language technology and corpus linguistic applications for German spoken language.
During the second half of the 19th century, extended regions of the South Pacific came to be part of the German colonial empire. The colonial administration included repeated and diverse efforts to implement German as the official language in several settings (administration, government, education) in the colonial areas. Due to unfamiliar sociological and linguistic conditions, to competition with English as a(nother) prestigious colonizer language, and to the short time-span of the German colonial rule, these efforts had only little language-related effect. Nevertheless, some linguistic traces remained, and these seem to reflect in what areas language implementation was organized most thoroughly. The study combines two directions of investigation: First, taking a historical approach, legal and otherwise official documents and information are considered in order to understand how the implementation process was planned and (intended to be) carried out. Second, from a linguistic perspective, documented lexical borrowings and other traces of linguistic contact are identified that can corroborate the historical findings by reflecting a greater effect of contact in such areas where the implementation of German was carried out most strictly. The goal of this paper is, firstly, to trace the political and missionary activities in language planning with regard to German in the colonial Pacific, rather similar to a modern language policy scenario when a new code of prestige or national unity is implemented. Secondly, these activities are evaluated in the face of the outcome that can be observed, in the historical practice as well as in long-term effects of language contact up until today.