Lecture Notes in Computer Science
10450
We present a supervised machine learning system for author name disambiguation (AND) that addresses semantic similarity between publication titles by means of word embeddings. The word embeddings are integrated as external components, which keeps the model small and efficient while allowing for easy extension and domain adaptation. Initial experiments show that word embeddings can improve the recall and F-score of the binary classification sub-task of AND. Results for the clustering sub-task are less clear but also promising, and overall they show the feasibility of the approach.
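As a rough illustration of how title similarity from word embeddings might be computed, the following minimal sketch averages the word vectors of each title and compares the averages by cosine similarity. It is not the system described in the abstract: the averaging strategy, the function names, and the three-dimensional toy vectors are all assumptions made for this example. The embedding table is passed in as an argument to mirror the idea of embeddings as an external, swappable component.

```python
import math

# Toy 3-dimensional embeddings, invented for illustration only.
# A real setup would load pretrained vectors from an external resource.
TOY_EMBEDDINGS = {
    "neural": [0.9, 0.1, 0.0],
    "networks": [0.8, 0.2, 0.1],
    "deep": [0.85, 0.15, 0.05],
    "learning": [0.7, 0.3, 0.1],
    "medieval": [0.0, 0.1, 0.9],
    "poetry": [0.1, 0.0, 0.95],
}


def title_vector(title, embeddings):
    """Average the vectors of all in-vocabulary tokens; None if none match."""
    vectors = [embeddings[t] for t in title.lower().split() if t in embeddings]
    if not vectors:
        return None
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 for zero vectors)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def title_similarity(title_a, title_b, embeddings=TOY_EMBEDDINGS):
    """Similarity of two titles; 0.0 if either has no known words."""
    vec_a = title_vector(title_a, embeddings)
    vec_b = title_vector(title_b, embeddings)
    if vec_a is None or vec_b is None:
        return 0.0
    return cosine(vec_a, vec_b)
```

In a supervised setting, a score like this could serve as one feature of a binary classifier that decides whether two publications share an author. Swapping in domain-specific vectors would only require passing a different embeddings table.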
11799
The demo presents a minimalist, off-the-shelf author name disambiguation (AND) tool that exposes a fundamental AND operation, the comparison of two publications with ambiguous authors, through an easily accessible HTTP interface. The tool implements this operation using standard AND functionality but puts particular emphasis on advanced natural language processing (NLP) methods for comparing publication title semantics.
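To make the idea of a pairwise comparison behind an HTTP interface concrete, here is a minimal sketch using only the Python standard library. Everything specific in it is hypothetical: the /compare path, the JSON keys "a", "b", "title" and "coauthors", and the two scores are assumptions for illustration, not the tool's actual API. In particular, plain token overlap (Jaccard) stands in for the embedding-based title comparison that the abstract emphasizes.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def jaccard(items_a, items_b):
    """Jaccard overlap of two collections (0.0 if both are empty)."""
    set_a, set_b = set(items_a), set(items_b)
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


def compare_publications(pub_a, pub_b):
    """Hypothetical comparison; stand-in features, not the tool's own."""
    return {
        "title_overlap": jaccard(
            pub_a.get("title", "").lower().split(),
            pub_b.get("title", "").lower().split(),
        ),
        "coauthor_overlap": jaccard(
            pub_a.get("coauthors", []), pub_b.get("coauthors", [])
        ),
    }


class CompareHandler(BaseHTTPRequestHandler):
    """Answers POST /compare (assumed path) with comparison scores as JSON."""

    def do_POST(self):
        if self.path != "/compare":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        try:
            payload = json.loads(self.rfile.read(length))
            result = compare_publications(payload["a"], payload["b"])
        except (ValueError, KeyError, TypeError, AttributeError):
            self.send_error(400, "expected JSON with keys 'a' and 'b'")
            return
        body = json.dumps(result).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8080):
    """Run the demo server on localhost until interrupted."""
    HTTPServer(("127.0.0.1", port), CompareHandler).serve_forever()
```

Calling serve() starts the server on localhost. A client would then POST a JSON object holding the two publication records to /compare and receive the scores as JSON. Keeping compare_publications separate from the handler means the comparison logic can be tested or replaced without touching the HTTP layer.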