Bootstrapping polarity classifiers with rule-based classification

  • In this article, we examine the effectiveness of bootstrapping supervised machine-learning polarity classifiers with the help of a domain-independent rule-based classifier that relies on a lexical resource, i.e., a polarity lexicon and a set of linguistic rules. The benefit of this method is that, although no labeled training data are required, it allows a classifier to capture in-domain knowledge by training a supervised classifier with in-domain features, such as bag of words, on instances labeled by a rule-based classifier. Thus, this approach can be considered a simple and effective method for domain adaptation. Among the components of this approach, we investigate how important the quality of the rule-based classifier is and which features are useful for the supervised classifier. In particular, the former addresses the question of to what extent linguistic modeling is relevant for this task. We not only examine how this method performs under more difficult settings, in which classes are not balanced and mixed reviews are included in the data set, but also compare this linguistically driven method with state-of-the-art statistical domain adaptation.
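The bootstrapping idea summarized in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the article: the toy lexicon `LEXICON`, the single negation rule, the example sentences in `UNLABELED`, and the choice of a multinomial Naive Bayes learner over bag-of-words counts. The sketch only shows the general pattern: a domain-independent rule-based classifier labels unlabeled in-domain texts, and a supervised classifier trained on those automatic labels can pick up in-domain words that the lexicon does not contain.

```python
import math
from collections import Counter

# Hypothetical toy polarity lexicon and negators (illustrative assumptions only).
LEXICON = {"good": 1, "great": 1, "excellent": 1, "bad": -1, "poor": -1, "awful": -1}
NEGATORS = {"not", "never", "no"}

# Hypothetical unlabeled in-domain texts.
UNLABELED = [
    "great battery and sharp screen",
    "excellent sharp screen",
    "bad battery and blurry screen",
    "awful blurry picture",
    "not good blurry picture",
]


def tokenize(text):
    return text.lower().split()


def rule_based_label(text):
    """Sum lexicon scores; flip a word's polarity if a negator directly precedes it.

    Returns 'pos', 'neg', or None when the score is zero (no label assigned).
    """
    tokens = tokenize(text)
    score = 0
    for i, tok in enumerate(tokens):
        polarity = LEXICON.get(tok, 0)
        if i > 0 and tokens[i - 1] in NEGATORS:
            polarity = -polarity
        score += polarity
    if score > 0:
        return "pos"
    if score < 0:
        return "neg"
    return None


def train_naive_bayes(labeled_docs):
    """Collect bag-of-words counts per class for a multinomial Naive Bayes model."""
    word_counts = {"pos": Counter(), "neg": Counter()}
    doc_counts = Counter()
    for text, label in labeled_docs:
        doc_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = set(word_counts["pos"]) | set(word_counts["neg"])
    return word_counts, doc_counts, vocab


def predict(model, text):
    """Return the class with the higher add-one-smoothed log probability."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    best_label, best_logp = None, -math.inf
    for label in ("pos", "neg"):
        total_words = sum(word_counts[label].values())
        logp = math.log((doc_counts[label] + 1) / (total_docs + 2))
        for tok in tokenize(text):
            if tok in vocab:  # words never seen in training are ignored
                logp += math.log(
                    (word_counts[label][tok] + 1) / (total_words + len(vocab))
                )
        if logp > best_logp:
            best_label, best_logp = label, logp
    return best_label


def bootstrap(unlabeled_docs):
    """Label texts with the rule-based classifier, then train on those labels."""
    auto_labeled = [(doc, rule_based_label(doc)) for doc in unlabeled_docs]
    auto_labeled = [(doc, label) for doc, label in auto_labeled if label is not None]
    return train_naive_bayes(auto_labeled)


model = bootstrap(UNLABELED)
```

In this toy setting, "sharp" and "blurry" are absent from the lexicon, so the rule-based classifier alone assigns them no polarity, while the bootstrapped model associates them with a class through co-occurrence with lexicon words in the automatically labeled texts.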

Metadata
Author:Michael Wiegand, Manfred Klenner, Dietrich Klakow
URN:urn:nbn:de:bsz:mh39-84425
DOI:https://doi.org/10.1007/s10579-013-9218-3
ISSN:1574-0218
Parent Title (English):Language Resources and Evaluation
Publisher:Springer
Place of publication:Dordrecht
Document Type:Article
Language:English
Year of first Publication:2013
Date of Publication (online):2019/01/24
Publicationstate:Postprint
Reviewstate:Peer-Review
Tag:Bootstrapping methods; Feature engineering; Polarity classification; Sentiment analysis; Text classification
GND Keyword:Computational linguistics; Machine learning; Natural language; Polarity; Text mining
Volume:47
Issue:4
First Page:1049
Last Page:1088
Note:
This is a post-peer-review, pre-copyedit version of an article published in Language Resources and Evaluation. The final authenticated version is available online at: http://dx.doi.org/10.1007/s10579-013-9218-3
Dewey Decimal Classification:400 Language / 400 Language, Linguistics
Open Access?:yes
Linguistics-Classification:Computational linguistics
Licence (German):German copyright law (UrhG) applies