On Approximately Searching for Similar Word Embeddings

Kohei Sugawara, Hayato Kobayashi and Masajiro Iwasaki

ACL2016 (the annual meeting of the Association for Computational Linguistics), to appear, 2016/8


Natural Language Processing, Information Retrieval, Machine Learning

We discuss approximate similarity search for word embeddings, which is the operation of approximately finding embeddings close to a given vector. We compared several metric-based search algorithms with hash-, tree-, and graph-based indexing from different aspects. Our experimental results showed that graph-based indexing exhibits robust performance, and they additionally provided useful information, e.g., that vector normalization achieves an efficient search with cosine similarity.
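The remark about vector normalization can be illustrated with a small sketch. It is not the authors' code and does not use any of the indexing methods compared in the paper. It only checks, by exact brute-force ranking on made-up toy vectors, the standard identity that for unit vectors the squared Euclidean distance equals 2 - 2 * (cosine similarity). This is one plausible reason a metric-based index over normalized vectors can serve cosine-similarity queries. All function names and data below are hypothetical.

```python
import math


def normalize(v):
    """Scale v to unit length (assumes v is not the zero vector)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def euclidean(a, b):
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rank_by_cosine(query, vectors):
    """Indices of vectors, most similar first, by cosine similarity."""
    return sorted(range(len(vectors)), key=lambda i: -cosine(query, vectors[i]))


def rank_by_euclidean_normalized(query, vectors):
    """Indices of vectors, nearest first, by Euclidean distance after
    normalizing both the query and the stored vectors."""
    q = normalize(query)
    unit = [normalize(v) for v in vectors]
    return sorted(range(len(unit)), key=lambda i: euclidean(q, unit[i]))


# Toy "embeddings" (hypothetical values, not taken from the paper).
EMBEDDINGS = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 2.0, 0.0],
    [-1.0, 0.0, 0.5],
    [3.0, 3.0, 0.0],
]
QUERY = [2.0, 0.5, 0.0]

if __name__ == "__main__":
    print(rank_by_cosine(QUERY, EMBEDDINGS))
    print(rank_by_euclidean_normalized(QUERY, EMBEDDINGS))
```

Both functions return the same ordering on this data, so a search structure that only supports Euclidean distance could in principle answer cosine-similarity queries once the vectors are normalized. The sketch uses exhaustive search, so it says nothing about the speed or accuracy of any approximate index.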
