This paper uses case studies to examine what dialect competence is brought along specifically by those Russian-German repatriates (Aussiedler) of the immigrant generation who were born and raised in German language islands but spent a large part of their adult lives in a Russian-speaking environment.
"Webkorpora in Computerlinguistik und Sprachforschung" (web corpora in computational linguistics and language research) was the topic of a workshop organized by the two GSCL working groups "Hypermedia" and "Korpuslinguistik" at the Institut für Deutsche Sprache (IDS) in Mannheim. On 27 and 28 September 2012, experts from university and non-university research institutions came together there for talks and discussions. The wide-ranging workshop addressed questions of collecting, processing, and analyzing web corpora for computational-linguistic applications and linguistic research. One focus was the special requirements that arise with regard to German-language resources in particular. A further focus was the use of web corpora for empirically grounded language research, for example as a basis for statistical analyses of language, for studies of language use in internet-based communication, or for corpus-based lexicography. In addition, there was a poster and demo session in which academic and commercial projects could present their research tools and methods.
We investigate the task of detecting reliable statements about food-health relationships from natural language texts. For that purpose, we created a specially annotated web corpus from forum entries discussing the healthiness of certain food items. We examine a set of task-specific features (mostly) based on linguistic insights that are instrumental in finding utterances that are commonly perceived as reliable. These features are incorporated in a supervised classifier and compared against standard features that are widely used for various tasks in natural language processing, such as bag of words, part-of-speech and syntactic parse information.
Among the German negative-conditional connectors in the range of consequens markers, the prototypical cases are sonst and ansonsten. Morphological alternatives (sonsten and ansonst) are rarely mentioned in contemporary grammars and dictionaries, but they actually occur with considerable frequency. The four connectors are used in two functions: as a conjunctional adverb, which can occupy various positions within the sentence, or as a specific kind of subordinating conjunction (Postponierer). The large IDS corpora allow us to reveal specific distributions of the lexemes and of their different uses. Comparing the frequencies and the distributions can indicate to what extent the phenomena are part of the standard language. The paper will report on the results and demonstrate how the findings can be deduced from the corpora. It will draw conclusions for assessing the acceptability of the variants and the extent to which they can be considered standard language. In addition, it will test statistical instruments for visualising and calculating the variance of phenomena, such as association plots and DPnorm.
Investigating the history of a language depends on fragmentary sources, but electronic corpora offer the possibility of alleviating the problem of ‘bad data’. However, they cannot overcome it totally, and crucial questions thus arise about the optimal architecture for such a corpus, about how representative even a large corpus can be of actual language use at a particular time, and about how a historical corpus can best be annotated and provided with tools to maximize its usefulness as a resource for future researchers. Immense strides have been made in recent years in addressing these questions, with exciting new methods and technological advances. The papers in this volume, which were presented at a conference on New Methods in Historical Corpora (Manchester 2011), exemplify the range of these developments in investigating the diachrony of languages as distinct as English, German, Latin, Spanish, French and Slovene and in developing appropriate tools for the analysis of historical corpora in these languages.