TripleA is a workshop series founded by linguists from the University of Tübingen and the University of Potsdam. Its aim is to provide a forum for semanticists doing fieldwork on understudied languages, and its focus is on languages from Africa, Asia, Australia and Oceania. The second TripleA workshop was held at the University of Potsdam, June 3-5, 2015.
American English and German AI, AU observed in cognates such as Wein, wine, Haus, house are usually treated on a par, represented with the same initial vowel (cf. [ai], [au] for Am. Engl. and German [1]). Yet acoustic measurements indicate differences, as the relevant trajectories characteristically cross in Am. Engl. but not in German. These data may indicate consistency with the same initial target for these diphthongs in German, supporting the choice of the same symbol /a/ in phonemic representation, as opposed to distinct targets (and distinct initial phonemes) in American English.
Languages vary in whether or not their future markers are compatible with non-future modal readings (Tonhauser, 2011b). The present paper proposes that this variation is determined by the aspectual architecture of a given language, more precisely by whether and how aspects can be stacked. Building on recent accounts of the temporal interpretation of modals (Matthewson, 2012, 2013; Kratzer, 2012; Chen et al., ta), the paper first sketches an analysis of the temporal readings of the English future marker will and then provides a cross-linguistic comparison with a selected, typologically diverse set of languages (Medumba, Hausa, Gitksan, and Greek).
A comparison between morphological complexity measures: typological data vs. language corpora
(2016)
Language complexity is an intriguing phenomenon argued to play an important role in both language learning and processing. The need to compare languages with regard to their complexity resulted in a multitude of approaches and methods, ranging from accounts targeting specific structural features to global quantification of variation more generally. In this paper, we investigate the degree to which morphological complexity measures are mutually correlated in a sample of more than 500 languages of 101 language families. We use human expert judgements from the World Atlas of Language Structures (WALS), and compare them to four quantitative measures automatically calculated from language corpora. These consist of three previously defined corpus-derived measures, which are all monolingual, and one new measure based on automatic word-alignment across pairs of languages. We find strong correlations between all the measures, illustrating that both expert judgements and automated approaches converge to similar complexity ratings, and can be used interchangeably.
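The comparison described in this abstract comes down to correlating several per-language complexity scores with one another. The following is a minimal sketch of such a pairwise comparison using Spearman rank correlation. The abstract does not say which correlation statistic the authors used, so the choice of Spearman is an assumption, and the measure names and scores are invented purely for illustration.

```python
from itertools import combinations
from math import sqrt


def ranks(values):
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(ranks(x), ranks(y))


def pairwise_spearman(measures):
    """Correlate every pair of measures; each maps to one score per language."""
    return {
        (a, b): spearman(measures[a], measures[b])
        for a, b in combinations(measures, 2)
    }


# Hypothetical scores for five languages (made-up numbers, not from the paper).
toy_measures = {
    "expert_score": [0.2, 0.5, 0.9, 0.4, 0.7],
    "corpus_measure_a": [0.25, 0.55, 0.80, 0.35, 0.60],
    "corpus_measure_b": [1.1, 1.9, 3.2, 1.5, 2.4],
}

for pair, rho in pairwise_spearman(toy_measures).items():
    print(pair, round(rho, 3))
```

Rank correlation is used here because expert judgements and corpus-derived scores need not share a scale; only the ordering of languages is compared.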
The FrameNet lexical database yields information about collocations and multiword expressions in various ways. In some cases phrasal units have been entered from the start as lexical entries (write down). In other cases headword + preposition pairs can be recognized as special collocations where the preposition in question is a necessary and lexically specified marker of an argument of the headword (fond of, hostile to). Nominal compounds are annotated with respect to noun or (pertinative) adjective modifiers, some of which are analyzable but also entrenched (wheel chair, fiscal year). Nouns that name aggregates, portions, types, etc., sometimes hold lexically specified relations to their dependents (flock of geese). And event nouns frequently select the support verbs which permit them to enter into predications (file an objection, enter a plea). A subproject aims at extracting, as structured clusters of lexical items, the minimal semantically central kernel dependency graphs from the set of annotations. Such research will yield not only commonplace groupings (eat: dog, bone) but also hitherto unnoticed collocations within such graphs (answer: you, door) where certain dependency links within them are idiomatic or otherwise lexically special, here answer > door. Collocational information can also be retrieved by various types of queries within our MySQL search tool.
This paper presents the current results of an ongoing research project on corpus distribution of prepositions and pronouns within Polish preposition-pronoun contractions. The goal of the project is to provide a quantitative description of Polish preposition-pronoun contractions taking into consideration morphosyntactic properties of their components. It is expected that the results will provide a basis for a revision of the traditionally assumed inflectional paradigms of Polish pronouns and, thus, for a possible remodeling of these paradigms. The results of corpus-based investigations of the distribution of prepositions within preposition-pronoun contractions can be used for grammar-theoretical and lexicographic purposes.
An integrated database, search, and tagging tool (IDaSTo) is presented that is particularly suited to variable analyses, parallel texts, and diachronic studies. Relevant categories or variables can be defined individually, tags can be set freely in the text and in several different ways, and their frequencies can be retrieved directly from the linked statistics.