On Approximately Searching for Similar Word Embeddings

Kohei Sugawara, Hayato Kobayashi and Masajiro Iwasaki

ACL2016 (the annual meeting of the Association for Computational Linguistics), to appear, 2016/8


Natural Language Processing, Information Retrieval, Machine Learning

We discuss approximate similarity search for word embeddings, the operation of approximately finding embeddings close to a given vector. We compare several metric-based search algorithms that use hash-, tree-, and graph-based indexing from different aspects. Our experimental results show that graph-based indexing exhibits robust performance and additionally provide useful information, e.g., that vector normalization achieves an efficient search with cosine similarity.
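The remark about vector normalization can be illustrated with a small sketch. It is not the authors' code and uses exact brute-force search instead of the hash-, tree-, or graph-based indexes compared in the paper. It only shows why normalization helps: for unit-length vectors, squared Euclidean distance equals 2 - 2 * cosine similarity, so a metric-based nearest-neighbor search over normalized vectors returns the same ranking as a cosine-similarity search. All function names and the random data below are assumptions made for illustration.

```python
import math
import random


def normalize(v):
    """Scale a vector to unit length (assumes v is not the zero vector)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)


def euclidean(a, b):
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def top_k_by_cosine(query, vectors, k):
    """Indices of the k vectors with the highest cosine similarity to query."""
    order = sorted(range(len(vectors)), key=lambda i: -cosine(query, vectors[i]))
    return order[:k]


def top_k_by_euclidean_normalized(query, vectors, k):
    """Indices of the k nearest vectors by Euclidean distance after normalization."""
    q = normalize(query)
    unit = [normalize(v) for v in vectors]
    order = sorted(range(len(unit)), key=lambda i: euclidean(q, unit[i]))
    return order[:k]


# Hypothetical stand-in for word embeddings: 200 random 50-dimensional vectors.
random.seed(0)
vectors = [[random.gauss(0, 1) for _ in range(50)] for _ in range(200)]
query = [random.gauss(0, 1) for _ in range(50)]

by_cosine = top_k_by_cosine(query, vectors, 10)
by_distance = top_k_by_euclidean_normalized(query, vectors, 10)
print(by_cosine == by_distance)
```

In practice the normalization would be done once when the index is built, so that an index designed for Euclidean distance can serve cosine-similarity queries without modification.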

Poster Download (3.2MB)

On Approximately Searching for Similar Word Embeddings (External Site Link)