TY  - CHAP
U1  - Conference publication
A1  - Müller, Mark-Christoph
A1  - Strube, Michael
ED  - Zhao, Dongyan
T1  - Transparent, efficient, and robust word embedding access with WOMBAT
T2  - Proceedings of the 27th International Conference on Computational Linguistics: System Demonstrations. August 20-26, 2018, Santa Fe, New Mexico, USA
N2  - We present WOMBAT, a Python tool which supports NLP practitioners in accessing word embeddings from code. WOMBAT addresses common research problems, including unified access, scaling, and robust and reproducible preprocessing. Code that uses WOMBAT for accessing word embeddings is not only cleaner, more readable, and easier to reuse, but also much more efficient than code using standard in-memory methods: a Python script using WOMBAT for evaluating seven large word embedding collections (8.7M embedding vectors in total) on a simple SemEval sentence similarity task involving 250 raw sentence pairs completes in under ten seconds end-to-end on a standard notebook computer.
KW  - Python
KW  - Automatic language analysis
KW  - Code
KW  - Computational linguistics
KW  - word embedding
KW  - WOrd eMBedding dATabase (WOMBAT)
Y1  - 2018
U6  - https://nbn-resolving.org/urn:nbn:de:bsz:mh39-110862
UN  - https://nbn-resolving.org/urn:nbn:de:bsz:mh39-110862
UR  - https://aclanthology.org/C18-2012
SN  - 978-1-948087-53-7
SB  - 978-1-948087-53-7
SP  - 53
EP  - 57
PB  - Association for Computational Linguistics
CY  - Stroudsburg, Pennsylvania
ER  -