This dissertation examines different methods for eliciting perceptual prominence judgments from naive listeners of German. Two experiments are presented: the first concerns the use of different scales, the second the use of different rating levels for judging perceptual prominence. The results show that findings from studies based on different elicitation techniques are not readily comparable. The thesis also investigates the effects of normalizing the prominence judgments. It concludes with an outlook for future studies, which mainly addresses the manifold interactions between different sources and the context in the judgment of perceptual prominence.
The CMDI Explorer
(2020)
We present the CMDI Explorer, a tool that empowers users to easily explore the contents of complex CMDI records and to process selected parts of them with little effort. The tool allows users, for instance, to analyse virtual collections represented by CMDI records, and to send collection items to other CLARIN services such as the Switchboard for subsequent processing. The CMDI Explorer hence adds functionality that many users felt was lacking from the CLARIN tool space.
CMDI Explorer
(2021)
We present CMDI Explorer, a tool that empowers users to easily explore the contents of complex CMDI records and to process selected parts of them with little effort. The tool allows users, for instance, to analyse virtual collections represented by CMDI records, and to send collection items to other CLARIN services such as the Switchboard for subsequent processing. CMDI Explorer hence adds functionality that many users felt was lacking from the CLARIN tool space.
This paper addresses long-term archival for large corpora. It focuses on three aspects specific to language resources: (1) the removal of resources for legal reasons, (2) versioning of (unchanged) objects in constantly growing resources, especially where objects can be part of multiple releases but also part of different collections, and (3) the conversion of data to new formats for digital preservation. The paper explains why language resources may have to be changed and why formats may need to be converted. As a solution, it suggests the use of an intermediate proxy object called a signpost. The approach is exemplified with the corpora of the Leibniz Institute for the German Language in Mannheim, namely the German Reference Corpus (DeReKo) and the Archive for Spoken German (AGD).
Signposts for CLARIN
(2020)
This paper presents an implementation of CMDI-based signposts and its use. Arnold et al. (2020) present signposts as a solution to challenges in the long-term preservation of corpora, especially corpora that are continuously extended or subject to modification, e.g., due to legal injunctions, that may overlap with respect to constituents, and that may be subject to migrations to new data formats. We describe the contribution signposts can make to the CLARIN infrastructure and document the design for the CMDI profile.
Signposts for CLARIN
(2021)
This paper presents an implementation of CMDI-based signposts and its use. Arnold, Fisseni et al. (2020) present signposts as a solution to challenges in the long-term preservation of corpora. Though the approach is applicable to digital resources in general, we focus on corpora, especially those that are continuously extended or subject to modification, e.g., due to legal injunctions, that may overlap with respect to constituents, and that may be subject to migrations to new data formats. We describe the contribution signposts can make to the CLARIN infrastructure, notably virtual collections, and document the design for the CMDI profile.
In this contribution, we address the question of which steps must be taken to make scripts used in the preparation and/or analysis of research data as FAIR as possible. We focus on both reproducibility, i.e. the path from the (raw) data to the results of a study, and reusability, i.e. the possibility of applying a study's methods to other data by means of the script, and examine the following aspects: working environment, data validation, modularization, documentation, and license.
Prominence has been widely studied on the word level and the syllable level. An extensive study comparing the two approaches is missing in the literature. This study investigates how word and syllable prominence relate to each other in German. We find that perceptual ratings based on the word level are more extreme than those based on the syllable level. The correlations between word prominence and acoustic features are greater than the correlations between syllable prominence and acoustic features.
Sound units play a pivotal role in cognitive models of auditory comprehension. The general consensus is that during perception listeners break down speech into auditory words and subsequently phones. Indeed, cognitive speech recognition is typically taken to be computationally intractable without phones. Here we present a computational model trained on 20 hours of conversational speech that recognizes word meanings within the range of human performance (model 25%, native speakers 20–44%), without making use of phone or word form representations. Our model also successfully generates predictions about the speed and accuracy of human auditory comprehension. At the heart of the model is a ‘wide’ yet sparse two-layer artificial neural network with some hundred thousand input units representing summaries of changes in acoustic frequency bands, and proxies for lexical meanings as output units. We believe that our model holds promise for resolving longstanding theoretical problems surrounding the notion of the phone in linguistic theory.