OPUS 4 | Search

22 search hits

XML-Kodierung des Bonner Frühneuhochdeutschkorpus (2002)

Diel, Marcel ; Fisseni, Bernhard ; Lenders, Winfried ; Schmitz, Hans-Christian

Visualisierung von lexikalischem Wandel im Deutschen auf Basis der Google-Books Ngram Daten (2014)

This working paper shows how lexical language change can be visualized using the Google Books Ngram data (Michel et al., 2010a, 2010b).

User's Guide for the ZAS Database of Clause-Embedding Predicates (2017)

Stiebels, Barbara ; McFadden, Thomas ; Schwabe, Kerstin ; Solstad, Torgrim ; Kellner, Elisa ; Sommer, Livia ; Stoltmann, Katarzyna

The Partitur Format at BAS (1997)

Schiel, Florian ; Burger, Susanne ; Geumann, Anja ; Weilhammer, Karl

Most spoken language resources are produced and disseminated together with symbolic information relating to the speech signal. Examples include an orthographic transcript, or labeling and segmentation on the phonologic, phonetic, prosodic, or phrasal level. Most of the known formats for these symbolic data are defined in a ‘closed form’ that is not flexible enough to allow simple, platform-independent processing and easy extensions. At the Bavarian Archive for Speech Signals (BAS), a new format has been developed and used over the last few years that shows some significant advantages over other existing formats. This paper describes the basic principles behind this format, briefly discusses its advantages, and gives detailed definitions of the description levels used so far.

The conventions for phonetic transcription and segmentation of German used for the Munich Verbmobil corpus (1997)

Geumann, Anja ; Oppermann, Daniela ; Schaeffler, Felix

STTS 2.0. Guidelines für die Annotation von POS-Tags für Transkripte gesprochener Sprache in Anlehnung an das Stuttgart Tübingen Tagset (STTS) (2017)

Westpfahl, Swantje ; Schmidt, Thomas ; Jonietz, Jasmin ; Borlinghaus, Anton

The guidelines extend the STTS (Schiller et al. 1999) for annotating transcripts of spoken language. The tagset is based on the annotation of the FOLK corpus of the IDS Mannheim (Schmidt 2014) and extends the STTS to cover phenomena typical of spoken language and peculiarities of their transcription. It was developed as part of the dissertation project „POS für(s) FOLK – Entwicklung eines automatisierten Part-of-Speech-Tagging von spontansprachlichen Daten“ (Westpfahl 2017, in preparation).

Metadaten im Programmbereich „Mündliche Korpora“ des IDS (2017)

Dickgießer, Sylvia

Maskierung (2015)

Winterscheid, Jenny

For reasons of research ethics, the data from conversation recordings, the metadata, and the transcripts must be masked. This contribution presents the working steps of masking, based on experience gained in preparing the data of the Forschungs- und Lehrkorpus Gesprochenes Deutsch (FOLK, the Research and Teaching Corpus of Spoken German) for publication in the Datenbank für Gesprochenes Deutsch (DGD, the Database for Spoken German).

Language Resources and Research under the General Data Protection Regulation (2018)

Kamocki, Paweł ; Ketzan, Erik ; Wildgans, Julia

The General Data Protection Regulation (hereinafter: GDPR), EU Regulation 2016/679 of 27 April 2016, will become applicable on 25 May 2018 and repeal the Personal Data Directive of 24 October 1995. Unlike a directive, which requires transposition into national laws (while leaving the choice of “forms and methods” to the Member States), a regulation is binding and directly applicable in all Member States. This means that when the GDPR becomes applicable, all the EU countries will have the same rules regarding the protection of personal data — at least in principle, since some details (including in the area of research — see below) are expressly left to the discretion of the Member States. The GDPR is a particularly ambitious piece of legislation (consisting of 99 articles and 173 recitals) whose intended territorial scope extends beyond the borders of the European Union. Its main concepts and principles are essentially similar to those of the Personal Data Directive, but enriched with interpretation developed through the case law of the CJEU and the opinions of the Article 29 Data Protection Working Party (hereinafter: WP29). This White Paper will discuss the main principles of data protection and their impact on language resources, as well as special rules regarding research under the GDPR and the standardisation mechanisms recognized by the Regulation.

KoralQuery 0.3 (2015)

Diewald, Nils ; Bingel, Joachim

KoralQuery is a general corpus query protocol (i.e. independent of research tasks and corpus formats), serialized in JSON-LD [1]. KoralQuery focuses on simplicity of implementation rather than human readability and writability. Support for a growing number of query languages is provided by the Koral serialization processor.
