Automatic Food Categorization from Large Unlabeled Corpora and Its Impact on Relation Extraction
- We present a weakly supervised induction method for assigning semantic information to food items. We consider two categorization tasks: food-type classification and the distinction of whether a food item is composite or not. The categorizations are induced by a graph-based algorithm applied to a large unlabeled domain-specific corpus. We show that the use of a domain-specific corpus is vital. We not only outperform a manually designed open-domain ontology but also demonstrate the usefulness of these categorizations in relation extraction, outperforming state-of-the-art features that include syntactic information and Brown clustering.
Author: | Michael Wiegand, Benjamin Roth, Dietrich Klakow |
---|---|
URN: | urn:nbn:de:bsz:mh39-84696 |
URL: | https://aclanthology.info/papers/E14-1071/e14-1071 |
DOI: | https://doi.org/10.3115/v1/E14-1071 |
ISBN: | 978-1-937284-78-7 |
Parent Title (English): | Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics, April 26-30, 2014, Gothenburg, Sweden |
Publisher: | Association for Computational Linguistics |
Place of publication: | Stroudsburg, PA |
Document Type: | Conference Proceeding |
Language: | English |
Year of first Publication: | 2014 |
Date of Publication (online): | 2019/02/05 |
Publication state: | Published version |
Review state: | Peer review |
GND Keyword: | Computerlinguistik; Korpus <Linguistik>; Lebensmittel; Maschinelles Lernen; Text Mining |
First Page: | 673 |
Last Page: | 682 |
DDC classes: | 400 Language / 400 Language, Linguistics |
Open Access?: | yes |
Linguistics-Classification: | Computational linguistics |