We provide a unified account of semantic effects observable in attested examples of the German applicative (‘be-’) construction, e.g. Rollstuhlfahrer Poul Schacksen aus Kopenhagen will den 1997 erschienenen Wegweiser Handiguide Europa fortführen und zusammen mit Movado Berlin berollen (‘Wheelchair user Poul Schacksen from Copenhagen wants to continue the guide ‘Handiguide Europe’, which came out in 1997, and roll Berlin together with Movado.’). We argue that these effects do not come from lexico-semantic operations on ‘input’ verbs, but are instead the products of a reconciliation procedure in which the meaning of the verb is integrated into the event-structure schema denoted by the applicative construction. We analyze the applicative pattern as an argument-structure construction, in terms of Goldberg (1995). We contrast this approach with that of Brinkmann (1997), in which properties associated with the applicative pattern (e.g. omissibility of the theme argument, holistic interpretation of the goal argument, and planar construal of the location argument) are attributed to general semantico-pragmatic principles. We undermine the generality of the principles as stated, and assert that these properties are instead construction-particular. We further argue that the constructional account provides an elegant model of the valence-creation and valence-augmentation functions of the prefix. We describe the constructional semantics as prototype-based: diverse implications of be-predications, including iteration, transfer, affectedness, intensity and saturation, derive via regular patterns of semantic extension from the topological concept of coverage.
When a noise verb is used to indicate verbal communication, factors from both the source domain of the verb (perception) and the target domain (communication) play a role in determining the argument structure of the sentence. While the target domain supplies a syntactic structure, the source domain’s semantics constrain the degree to which that syntactic structure can be exploited. This can be determined by comparing noise verbs in this use with manner-of-communication verbs, which are superficially similar, but native to communication. Data for these two classes of verbs were drawn from the British National Corpus. The data were annotated with frame-semantic markup, as described in the Berkeley FrameNet Project. We compared the presence, type of syntactic realization, and position of the semantically annotated arguments for both classes of verbs. We found that noise and manner verbs show statistically significant differences in these three areas. For instance, noise verbs are more focused on the form of the message than manner verbs: noise verbs appear more frequently with a quoted message. In addition, there are differences other than the complementation patterns: certain noise verbs are biased with respect to speakers’ genders, message types, and even orthography in quoted messages.
Alternations play a central role in most current theories of verbal argument structure, which are devised primarily to model the syntactic flexibility of verbs. Accordingly, these frameworks take verbs, and their projection properties, to be the sole contributors of thematic content to the clause. Approached from this perspective, the German applicative (or be-prefix) construction has puzzling properties. First, while many applicative verbs have transparent base forms, many, including those coined from nouns, do not. Second, applicative verbs are bound by interpretive and argument-realization conditions which cannot be traced to their base forms, if any. These facts suggest that applicative formation is not appropriately modeled as a lexical rule.
Using corpus data from a diverse array of genres, Michaelis and Ruppenhofer propose a unified solution to these two puzzles within the framework of Construction Grammar. Central to this account is the concept of valence augmentation: argument-structure constructions denote event types, and therefore license valence sets which may properly include those of their lexical fillers. As per Panini's Law, resolution of valence mismatch favors the construction over the verb. Like verbs of transfer and location, the applicative construction has a prototype-based event-structure representation: diverse implications of applicative predications--including iteration, transfer, affectedness, intensity and saturation--are shown to derive via regular patterns of semantic extension from the topological concept of coverage.
The FrameNet lexical database yields information about collocations and multiword expressions in various ways. In some cases phrasal units have been entered from the start as lexical entries (write down). In other cases headword + preposition pairs can be recognized as special collocations where the preposition in question is a necessary and lexically specified marker of an argument of the headword (fond of, hostile to). Nominal compounds are annotated with respect to noun or (pertinative) adjective modifiers, some of which are analyzable but also entrenched (wheel chair, fiscal year). Nouns that name aggregates, portions, types, etc., sometimes hold lexically specified relations to their dependents (flock of geese). And event nouns frequently select the support verbs which permit them to enter into predications (file an objection, enter a plea). A subproject aims at extracting, as structured clusters of lexical items, the minimal semantically central kernel dependency graphs from the set of annotations. Such research will yield not only commonplace groupings (eat: dog, bone) but will also yield hitherto unnoticed collocations within such graphs (answer: you, door) where certain dependency links within them are idiomatic or otherwise lexically special, here answer > door. Collocational information can also be retrieved by various types of queries within our MySQL search tool.
The classification of verbs in Levin's (1993) English Verb Classes and Alternations: A Preliminary Investigation, on the basis of both intuitive semantic grouping and their participation in valence alternations, is often used by the NLP community as evidence of the semantic similarity of verbs (Jing & McKeown 1998; Lapata & Brew 1999; Kohl et al. 1998). In this paper, we compare the Levin classification with the work of the FrameNet project (Fillmore & Baker 2001), where words (not just verbs) are grouped according to the conceptual structures (frames) that underlie them and their combinatorial patterns are inductively derived from corpus evidence. This means that verbs grouped together in FrameNet (FN) might be semantically similar but have different (or no) alternations, and that verbs which share the same alternation might be represented in two different semantic frames.
This dissertation investigates discourse-pragmatic differences between variably linked arguments appearing in alternating argument structure constructions in the sense of Goldberg (1995) and Kay (manuscript). The properties that are studied include givenness, pragmatic relation (topic/focus), salience of referents, animacy, and others. They derive from the literature on sentence-type constructions such as topicalization and from research on the referential properties of NP form types.
The research carried out here has multiple uses. At the most basic level, it serves as an empirical check on existing characterizations of the pragmatic properties of the relevant arguments that are the result of syntactic and semantic analysis based on introspection alone. For instance, for the epistemic raising alternation involving verbs like seem, the predicted topicality difference between the subjects of the raised and unraised constructions (Langacker 1995) could not be confirmed.
This dissertation also addresses the question of what kinds of pragmatic factors, if any, are relevant to argument structure constructions. Based on the evidence of the dative alternation, the pragmatic influences on argument structure constructions do not seem to be different from, or more limited than, the ones found to be relevant to sentence-type constructions.
The kind of research undertaken here can also inform the syntactic and semantic analysis of constructions. In the case of the dative alternation, the discourse-pragmatic characteristics of the variably linked arguments provide evidence that Basilico’s (1998) analysis of the difference between the alternates in terms of VP-shells and a difference between thetic and categorical ‘inner’ predication, on the one hand does not account for all the data and on the other can be re-stated in pragmatic terms other than the thetic-categorical distinction.
In addition to studies of valence alternations, this dissertation also discusses various null instantiation phenomena, which provide further evidence for the need to specify discourse-pragmatic properties as part of argument structure constructions and lexical entries.
Finally, it is suggested that the use of randomly sampled corpus data and statistical modelling throughout this dissertation improves both empirical and analytical coverage.
Reframing FrameNet Data
(2004)
The Berkeley FrameNet Project (http://www.icsi.berkeley.edu/~framenet) is building an on-line lexical resource for contemporary English. The database provides information about the semantic and syntactic combinatorial possibilities (valences) of each item analyzed. This paper describes the conceptual basis for what has been called reframing of data in the FrameNet database and exemplifies two new frame-to-frame relations, Causative_of and Inchoative_of, the implementation of which came about as a result of reanalysis of certain frames and lexical units. The new relations are characterized with respect to a triple of frames involving the notion of attaching, and entering them into the database is demonstrated using the Frame Relations Editor. The two relations allow FrameNet to make frame-wise distinctions that capture fairly systematic semantic relationships across sets of lexical units. While the Inheritance and Subframe relations are of particular interest to the NLP research community, Causative_of and Inchoative_of may be more relevant to lexicography.
This work proposes opinion frames as a representation of discourse-level associations that arise from related opinion targets and which are common in task-oriented meeting dialogs. We define the opinion frames and explain their interpretation. Additionally we present an annotation scheme that realizes the opinion frames and via human annotation studies, we show that these can be reliably identified.
This work proposes opinion frames as a representation of discourse-level associations which arise from related opinion topics. We illustrate how opinion frames help gather more information and also assist disambiguation. Finally we present the results of our experiments to detect these associations.
As many popular text genres such as blogs or news contain opinions by multiple sources and about multiple targets, finding the sources and targets of subjective expressions becomes an important sub-task for automatic opinion analysis systems. We argue that while automatic semantic role labeling systems (ASRL) have an important contribution to make, they cannot solve the problem for all cases. Based on the experience of manually annotating opinions, sources, and targets in various genres, we present linguistic phenomena that require knowledge beyond that of ASRL systems. In particular, we address issues relating to the attribution of opinions to sources; sources and targets that are realized as zero-forms; and inferred opinions. We also discuss in some depth that for arguing attitudes we need to be able to recover propositions and not only argued-about entities. A recurrent theme of the discussion is that close attention to specific discourse contexts is needed to identify sources and targets correctly.
We present MaJo, a toolkit for supervised Word Sense Disambiguation (WSD), with an interface for Active Learning. Our toolkit combines a flexible plugin architecture which can easily be extended, with a graphical user interface which guides the user through the learning process. MaJo integrates off-the-shelf NLP tools like POS taggers, treebank-trained statistical parsers, as well as linguistic resources like WordNet and GermaNet. It enables the user to systematically explore the benefit gained from different feature types for WSD. In addition, MaJo provides an Active Learning environment, where the system presents carefully selected instances to a human oracle. The toolkit supports manual annotation of the selected instances and re-trains the system on the extended data set. MaJo also provides the means to evaluate the performance of the system against a gold standard. We illustrate the usefulness of our system by learning the frames (word senses) for three verbs from the SALSA corpus, a version of the TiGer treebank with an additional layer of frame-semantic annotation. We show how MaJo can be used to tune the feature set for specific target words and so improve performance for these targets. We also show that syntactic features, when carefully tuned to the target word, can lead to a substantial increase in performance.
In this paper we investigate the impact of data size on a Word Sense Disambiguation (WSD) task. We question the assumption that the knowledge acquisition bottleneck, which is known as one of the major challenges for WSD, can be solved by simply obtaining more and more training data. Our case study on 1,000 manually annotated instances of the German verb drohen (threaten) shows that the best performance is not obtained when training on the full data set, but by carefully selecting new training instances with regard to their informativeness for the learning process (Active Learning). We present a thorough evaluation of the impact of different sampling methods on the data sets and propose an improved method for uncertainty sampling which dynamically adapts the selection of new instances to the learning progress of the classifier, resulting in more robust results during the initial stages of learning. A qualitative error analysis identifies problems for automatic WSD and discusses the reasons for the great gap in performance between human annotators and our automatic WSD system.
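For readers unfamiliar with uncertainty sampling, the general idea can be sketched as follows. This is a minimal illustration of plain entropy-based selection, not the improved, dynamically adapting method the abstract refers to. The instance ids, the probability values, and the function names are all hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy (in bits) of a predicted sense distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def select_most_uncertain(pool, k=1):
    """Return the ids of the k unlabeled instances with the highest entropy.

    `pool` maps a hypothetical instance id to the classifier's predicted
    probabilities over the candidate senses (illustrative values only).
    """
    ranked = sorted(pool, key=lambda inst: entropy(pool[inst]), reverse=True)
    return ranked[:k]


# Hypothetical predictions for three unlabeled instances and three senses.
pool = {
    "s1": [0.90, 0.05, 0.05],  # classifier is confident
    "s2": [0.40, 0.35, 0.25],  # classifier is very unsure
    "s3": [0.60, 0.30, 0.10],  # in between
}

next_for_annotation = select_most_uncertain(pool, k=2)
```

In an active learning loop, the selected instances would be passed to the human annotator, added to the training set, and the classifier re-trained before the next selection round.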
Historical cabinet protocols are a useful resource which enables historians to identify the opinions expressed by politicians on different subjects and at different points of time. While cabinet protocols are often available in digitized form, so far the only method to access their information content is by keyword-based search, which often returns sub-optimal results. We present a method for enriching German cabinet protocols with information about the originators of statements. This requires automatic speaker attribution. In order to avoid costly manual annotation of training data, we design a rule-based system which exploits morpho-syntactic cues. Unlike many other approaches, our method can also deal with cases in which the speaker is not explicitly identified in the sentence itself. This is an important capability as 45% of all sentences in the data constitute reported speech whose speakers are not explicitly marked. Our system is able to detect implicit speakers by taking into account signals of speaker continuity. We show that such a system obtains good results, especially with respect to recall which is particularly important for information access.
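The notion of speaker continuity can be illustrated with a small sketch. It is not the authors' rule set: the two input flags (`explicit_speaker`, `reported`) stand in for the morpho-syntactic cues mentioned in the abstract. The rule that a non-reported sentence ends the current speaker's turn is an assumption made only for this example, and the speaker labels are invented.

```python
def attribute_speakers(sentences):
    """Assign a speaker (or None) to each sentence in document order.

    Each sentence is a dict with two hypothetical, pre-computed fields:
      'explicit_speaker': name of a speaker named in the sentence, or None
      'reported':         True if the sentence is reported speech
    """
    attributed = []
    current = None  # last speaker seen, carried forward while speech continues
    for sent in sentences:
        if sent["explicit_speaker"] is not None:
            # An explicitly named speaker starts a new turn.
            current = sent["explicit_speaker"]
            attributed.append(current)
        elif sent["reported"]:
            # Implicit speaker: inherit from the preceding context.
            attributed.append(current)
        else:
            # Assumption: a non-reported sentence breaks speaker continuity.
            current = None
            attributed.append(None)
    return attributed


# Invented toy input; "Minister A" and "Minister B" are placeholders.
sentences = [
    {"explicit_speaker": "Minister A", "reported": True},
    {"explicit_speaker": None, "reported": True},
    {"explicit_speaker": None, "reported": False},
    {"explicit_speaker": None, "reported": True},
    {"explicit_speaker": "Minister B", "reported": True},
    {"explicit_speaker": None, "reported": True},
]

speakers = attribute_speakers(sentences)
```

The second and sixth sentences show the case the abstract highlights: reported speech without an explicit speaker, which receives the speaker of the preceding turn.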
Active learning has been applied to different NLP tasks, with the aim of limiting the amount of time and cost for human annotation. Most studies on active learning have only simulated the annotation scenario, using prelabelled gold standard data. We present the first active learning experiment for Word Sense Disambiguation with human annotators in a realistic environment, using fine-grained sense distinctions, and investigate whether AL can reduce annotation cost and boost classifier performance when applied to a real-world task.