Data-driven Knowledge Extraction for the Food Domain
In this paper, we examine methods for automatically extracting domain-specific knowledge about food from unlabeled natural language text. We employ different extraction methods, ranging from surface patterns to co-occurrence measures applied to different parts of a document. We show that the effectiveness of a particular method depends strongly on the relation type considered and that no single method works equally well for every relation type. We also examine a combination of extraction methods and consider relationships between different relation types. The extraction methods are applied both to a domain-specific corpus and to Wikipedia, a domain-independent factual knowledge base. Moreover, we examine an open-domain lexical ontology for suitability.
Author: | Michael Wiegand, Benjamin Roth, Dietrich Klakow |
---|---|
URN: | urn:nbn:de:bsz:mh39-84529 |
URL: | http://www.oegai.at/konvens2012/proceedings.shtml |
ISBN: | 3-85027-005-X |
Parent Title (English): | Proceedings of the 11th Conference on Natural Language Processing (KONVENS 2012). Empirical Methods in Natural Language Processing, September 19-21, 2012, Vienna, Austria |
Series (Serial Number): | Schriftenreihe der Österreichischen Gesellschaft für Artificial Intelligence (ÖGAI) (Band 5) |
Publisher: | Österreichische Gesellschaft für Artificial Intelligence |
Place of publication: | Wien |
Editor: | Jeremy Jancsary |
Document Type: | Conference Proceeding |
Language: | English |
Year of first Publication: | 2012 |
Date of Publication (online): | 2019/01/28 |
Publication state: | Published version |
Review state: | Peer review |
GND Keyword: | Computational linguistics; Empirical linguistics; Information extraction; Corpus <Linguistics>; Food |
First Page: | 21 |
Last Page: | 29 |
DDC classes: | 400 Language / 400 Language, Linguistics |
Open Access?: | yes |
Linguistics-Classification: | Computational linguistics |