Using a domain ontology for the semantic-statistical classification of specialist hypertexts
- In this feasibility study we aim to contribute to the practical use of domain ontologies for hypertext classification by introducing an algorithm that generates potential keywords. The algorithm uses structural markup information and lemmatized word lists as well as a domain ontology of linguistics. We present the calculation and ranking of keyword candidates based on ontology relationships, word position, frequency information, and statistical significance as evidenced by log-likelihood tests. Finally, the results of our machine-driven classification are validated empirically against manually assigned keywords.
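
The abstract names log-likelihood tests as one of the ranking signals but does not give the exact formulation. As a rough illustration only, the sketch below scores lemmas with the log-likelihood statistic (G2) as it is commonly used for corpus comparison, setting a document's lemma counts against a reference word list. The function names, the choice of reference corpus, and the restriction to overrepresented lemmas are assumptions made for this example, not details taken from the paper. The ontology, markup, and word-position signals are left out.

```python
import math
from collections import Counter


def log_likelihood(a, b, c, d):
    """G2 for a lemma seen a times in c document tokens and b times in d reference tokens."""
    # Expected counts if the lemma were equally frequent in both samples.
    expected_doc = c * (a + b) / (c + d)
    expected_ref = d * (a + b) / (c + d)
    g2 = 0.0
    # A zero count contributes nothing (0 * ln 0 is taken as 0).
    if a > 0:
        g2 += a * math.log(a / expected_doc)
    if b > 0:
        g2 += b * math.log(b / expected_ref)
    return 2.0 * g2


def rank_candidates(doc_lemmas, ref_lemmas, top_n=5):
    """Hypothetical helper: rank document lemmas by G2 against a reference list."""
    doc_counts = Counter(doc_lemmas)
    ref_counts = Counter(ref_lemmas)
    c, d = len(doc_lemmas), len(ref_lemmas)
    scored = []
    for lemma, a in doc_counts.items():
        b = ref_counts.get(lemma, 0)
        # Assumption: only lemmas overrepresented in the document are candidates.
        if a / c > b / d:
            scored.append((lemma, log_likelihood(a, b, c, d)))
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_n]


# Toy data, invented for illustration.
doc = ["verb"] * 5 + ["the"] * 5
ref = ["the"] * 50 + ["verb"] * 1 + ["noun"] * 49
ranked = rank_candidates(doc, ref)
```

With one degree of freedom, a G2 value above roughly 3.84 corresponds to significance at the 5% level, so a threshold of that kind could be used to filter candidates before combining them with other signals.
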
Author: | Roman Schneider, Noah Bubenhofer |
---|---|
URN: | urn:nbn:de:bsz:mh39-39840 |
URL: | http://www.dialog-21.ru/en/digest/2010/ |
ISBN: | 5-7281-1148-1 |
Parent Title (English): | Computational Linguistics and Intellectual Technologies. Papers from the Annual International Conference “Dialogue” / Komp'juternaja lingvistika i intellektual'nye tehnologii. Po materialam ezhegodnoj Mezhdunarodnoj konferencii „Dialog“, Bekasovo, 26.-30. Mai 2010 |
Place of publication: | Moskva |
Document Type: | Conference Proceeding |
Language: | English |
Year of first Publication: | 2010 |
Date of Publication (online): | 2015/08/17 |
Creating Corporation: | Izdatel’stvo Rossijskogo Gosudarstvennogo Gumanitarnogo Universiteta (RGGU) / Staatliche Universität für Sozial- und Geisteswissenschaften Moskau (RGGU) |
GND Keyword: | Deutsch; Grammatik; Linguistische Datenverarbeitung; Semantisches Netz; Wissenspräsentation |
First Page: | 622 |
Last Page: | 628 |
DDC classes: | 400 Language / 430 German |
Open Access?: | yes |
Leibniz-Classification: | Language, Linguistics |
Linguistics-Classification: | Computational linguistics |