Linguistics faces a challenge shared by many other sciences: it continues to grow into increasingly complex subfields, each with its own separate or overarching branches. While linguists are certainly aware of the overall structure of the research field, they cannot follow all developments outside their own subfields. It is thus important to help specialists and newcomers alike to navigate evolved or unknown territory of linguistic data. A considerable amount of research data in linguistics is described with metadata. While studies described and published in archived journals and conference proceedings receive a fairly homogeneous set of metadata tags (e.g., author, title, publisher), this does not hold for the empirical data and analyses that underlie such studies. Moreover, lexicons, grammars, experimental data, and other types of resources come in different forms, and to make things worse, their description in terms of metadata is not uniform either, if it exists at all. These problems are well known, and there are now a number of international initiatives (e.g., CLARIN, FlareNet, MetaNet, DARIAH) to build infrastructures for managing linguistic resources. The NaLiDa project, funded by the German Research Foundation, aims at facilitating the management of and access to linguistic resources originating from German research institutions. In cooperation with the German SFB 833 research center, we are developing a combination of faceted and full-text search to give integrated access across heterogeneous metadata sets. Our approach is supported by a central registry for metadata field descriptors and by a component repository for structured groups of data categories that serve as larger building blocks.
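The combination of faceted and full-text search over heterogeneous metadata described above can be illustrated with a minimal sketch. It is not the NaLiDa implementation: the `FIELD_REGISTRY` mapping, the function names, and the sample records are all hypothetical, and the mapping only stands in for the idea of a central registry of field descriptors.

```python
from collections import Counter

# Hypothetical mapping from schema-specific field names to shared descriptors;
# a stand-in for a central registry of metadata field descriptors.
FIELD_REGISTRY = {
    "creator": "author",
    "author": "author",
    "lang": "language",
    "language": "language",
    "type": "resource_type",
    "resourceType": "resource_type",
    "title": "title",
}


def normalise(record):
    """Rename the fields of one record to the shared descriptors."""
    return {FIELD_REGISTRY[k]: v for k, v in record.items() if k in FIELD_REGISTRY}


def search(records, text="", **facets):
    """Return normalised records that contain `text` and match every facet."""
    needle = text.lower()
    hits = []
    for rec in map(normalise, records):
        # Faceted filter: every requested facet must match exactly.
        if any(rec.get(name) != value for name, value in facets.items()):
            continue
        # Full-text filter: substring match over all field values.
        if needle and not any(needle in str(v).lower() for v in rec.values()):
            continue
        hits.append(rec)
    return hits


def facet_counts(records, facet):
    """Count how many normalised records fall under each value of a facet."""
    return Counter(rec[facet] for rec in records if facet in rec)


# Invented sample records that use two different metadata schemas.
RECORDS = [
    {"title": "Spoken German corpus", "creator": "A", "lang": "German", "type": "corpus"},
    {"title": "Valency lexicon", "author": "B", "language": "German", "resourceType": "lexicon"},
    {"title": "Learner corpus", "author": "C", "language": "English", "resourceType": "corpus"},
]

hits = search(RECORDS, text="corpus", language="German")
counts = facet_counts(search(RECORDS), "resource_type")
```

Here `hits` holds the records that pass both filters, and `counts` gives the per-value totals that a facet sidebar would display. A real system would use an inverted index rather than a linear scan.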
This paper demonstrates systematic cross-linguistic differences in the electrophysiological correlates of conflicts between form and meaning (“semantic reversal anomalies”). These engender P600 effects in English and Dutch (e.g. Kolk et al., 2003; Kuperberg et al., 2003), but a biphasic N400 – late positivity pattern in German (Schlesewsky and Bornkessel-Schlesewsky, 2009), and monophasic N400 effects in Turkish (Experiment 1) and Mandarin Chinese (Experiment 2). Experiment 3 revealed that, in Icelandic, semantic reversal anomalies show the English pattern with verbs requiring a position-based identification of argument roles, but the German pattern with verbs requiring a case-based identification of argument roles. The overall pattern of results reveals two separate dimensions of cross-linguistic variation: (i) the presence vs. absence of an N400, which we attribute to cross-linguistic differences with regard to the sequence-dependence of the form-to-meaning mapping, and (ii) the presence vs. absence of a late positivity, which we interpret as an instance of a categorisation-related late P300, and which is observable when the language under consideration allows for a binary well-formedness categorisation of reversal anomalies. We conclude that, rather than reflecting linguistic domains such as syntax and semantics, the late positivity vs. N400 distinction is better understood in terms of the strategies that serve to optimise the form-to-meaning mapping in a given language.
In this paper, we investigate the role of predicates in opinion holder extraction. We examine the shape of these predicates, investigate what relationship they bear towards opinion holders, determine what resources are potentially useful for acquiring them, and point out limitations of an opinion holder extraction system based on these predicates. For this study, we carry out an evaluation on a corpus annotated with opinion holders. Our insights are particularly important for situations in which no labelled training data are available and only rule-based methods can be applied.
The planning of a dictionary should consider both theoretical and empirical aspects of its macro- and microstructure; this also holds for Online Specialized Dictionaries of Linguistics. In particular, the microstructure should be standardized and structured so that it fits the primary and secondary functions of a dictionary. Unfortunately, empirical studies that investigate Online Specialized Dictionaries of Linguistics are rare, which leaves it unclear which microstructural elements are obligatory and which are optional. This article presents and comments on the results of an investigation into a corpus of Online Specialized Dictionaries of Linguistics, focusing on these aspects and on the most important theoretical issues. The article ends with an example taken from DIL, a German-Italian Online Dictionary of Linguistics.
This article looks at Latgalian from the perspective of language classification. It starts by discussing relevant terms relating to sociolinguistic language types. It argues that Latgalian and its speakers show considerable similarities with many languages in Europe that are considered to be regional languages; hence, Latgalian should also be classified as such. In its second part, the article uses sociolinguistic data to indicate that the perceptions of speakers confirm this classification. Therefore, Latgalian should also officially be treated with the respect that other regional languages in Europe enjoy.
Relational adjectives, i.e. adjectives that are derived from nouns and that, in attributive construction with a head noun, express an unspecific relation between the concept of the head and the concept of the base, play an important role in the classical languages. Starting from the silvestris musa, Virgil's woodland muse, this contribution traces the aftereffects of this pattern in European languages: French, English, and above all German. The semantic function of such adjectives is assigned to the functional domain of ‘classificatory modification’. Cross-linguistic commonalities and differences are worked out. Relational adjectives in Polish and Hungarian, the other comparison languages of the project „Grammatik des Deutschen im europäischen Vergleich“, are also briefly considered. Against this background, the question of the relationship between universal, language-family-specific, areal, and language-specific properties of the construction pattern, and of the degree of Latin influence, can be formulated more precisely.
In order to automatically extract opinion holders, we propose to harness the contexts of prototypical opinion holders, i.e. common nouns, such as experts or analysts, that describe particular groups of people whose profession or occupation is to form and express opinions towards specific items. We assess their effectiveness in supervised learning, where these contexts are regarded as labelled training data, and in rule-based classification, which uses predicates that frequently co-occur with mentions of the prototypical opinion holders. Finally, we also examine to what extent knowledge gained from these contexts can compensate for the lack of large amounts of labelled training data in supervised learning by considering actually labelled training sets of various sizes.
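The rule-based variant described above can be sketched roughly as follows. This is only an illustration and not the authors' system: it assumes that (subject, predicate) pairs have already been extracted by some parser, and the frequency threshold, the function names, and the toy data are hypothetical.

```python
from collections import Counter

# Seed nouns; the abstract names "experts" and "analysts" as examples.
PROTOTYPICAL_HOLDERS = {"experts", "analysts"}


def harvest_predicates(pairs, min_count=2):
    """Collect predicates seen at least `min_count` times with a seed noun as subject."""
    counts = Counter(pred for subj, pred in pairs if subj in PROTOTYPICAL_HOLDERS)
    return {pred for pred, n in counts.items() if n >= min_count}


def extract_holders(pairs, predicates):
    """Label the subject of every harvested predicate as an opinion holder."""
    return [subj for subj, pred in pairs if pred in predicates]


# Toy unlabelled (subject, predicate) pairs, assumed to come from a parser.
UNLABELLED = [
    ("experts", "criticise"),
    ("analysts", "criticise"),
    ("experts", "expect"),
    ("analysts", "expect"),
    ("experts", "arrive"),
    ("prices", "rise"),
]

predicates = harvest_predicates(UNLABELLED)
holders = extract_holders(
    [("minister", "criticise"), ("train", "arrive"), ("union", "expect")],
    predicates,
)
```

The threshold keeps incidental predicates such as "arrive" out of the rule set. The sketch requires no manually labelled data, which matches the setting the abstract targets.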