Spotting, collecting and documenting negative polarity items

As the nature of negative polarity items (NPIs) and their licensing contexts is still under much debate, a broad empirical basis is an important cornerstone for further insights in this area of research. The work discussed in this paper is intended as a contribution toward this objective. We briefly introduce the phenomenon of NPIs and outline major theories of their licensing as well as various licensing contexts before discussing our two major topics. First, we describe a corpus-based retrieval method for NPI candidates that ranks the candidates according to their distributional dependence on the licensing contexts. The basic idea for automatically collecting NPI candidates from a large corpus is that an NPI behaves like a kind of collocate of its licensing contexts. The method extracts single-word candidates and is extended to capture multi-word candidates as well. Manual inspection and interpretation of the candidate lists then identify the actual NPIs. Second, we present an online repository for NPIs and other items that show distributional idiosyncrasies, which offers a sustainable empirical database for further (theoretical) research on these items.
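The collocation idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the licenser list `LICENSERS`, the function name `rank_npi_candidates`, the clause-level whitespace tokenization, and the simple ratio score (share of a token's occurrences that fall in a clause containing a licenser) are all assumptions made for illustration. The abstract does not state which association measure the paper uses.

```python
from collections import Counter

# Hypothetical, deliberately tiny inventory of licensers (an assumption);
# real licensing contexts also include questions, conditionals, etc.
LICENSERS = {"not", "never", "nobody", "no", "without"}


def rank_npi_candidates(clauses, min_freq=2):
    """Rank tokens by the share of their occurrences in licensed clauses.

    clauses:  iterable of strings, one clause each (whitespace-tokenized).
    min_freq: tokens seen in fewer clauses than this are dropped.
    Returns a list of (token, score) pairs, highest score first.
    """
    total = Counter()     # number of clauses containing the token
    licensed = Counter()  # number of those clauses that contain a licenser
    for clause in clauses:
        tokens = clause.lower().split()
        has_licenser = any(tok in LICENSERS for tok in tokens)
        # Count each token once per clause; skip the licensers themselves.
        for tok in set(tokens) - LICENSERS:
            total[tok] += 1
            if has_licenser:
                licensed[tok] += 1
    scores = {
        tok: licensed[tok] / total[tok]
        for tok in total
        if total[tok] >= min_freq
    }
    # Sort by descending score, then alphabetically for a stable order.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


if __name__ == "__main__":
    toy_corpus = [
        "nobody has ever seen it",
        "she has not ever seen it",
        "she has seen it",
        "he has seen them",
    ]
    for token, score in rank_npi_candidates(toy_corpus):
        print(f"{token}\t{score:.2f}")
```

In this toy corpus, "ever" occurs only in clauses with a licenser and therefore heads the list. Such a ranking only proposes candidates; as the abstract notes, manual inspection is still needed to identify the actual NPIs.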

Metadata
Author:	Jan-Philipp Soehn, Beata Trawiński, Timm Lichte
URN:	urn:nbn:de:bsz:mh39-34434
DOI:	https://doi.org/10.1007/s11049-011-9125-5
ISSN:	1573-0859
Parent Title (English):	Natural Language and Linguistic Theory
Document Type:	Article
Language:	English
Year of first Publication:	2010
Date of Publication (online):	2015/01/30
Publication state:	Postprint
Review state:	Peer-reviewed
Tag:	Corpus-based retrieval; Documentation; Empirical database; Polarity items; XML
GND Keyword:	German; English; Corpus <Linguistics>; Negative polarity item
Volume:	28
Issue:	4
First Page:	931
Last Page:	952
Note:	The final publication is available at Springer via http://dx.doi.org/10.1007/s11049-011-9125-5
DDC classes:	400 Language / 410 Linguistics
Open Access?:	yes
Linguistics-Classification:	Computational linguistics
Licence (German):	Protected by copyright
