Efficient Vector Quantization for Fast Approximate Nearest Neighbor Search
Muravev, Anton (2016)
Master's Degree Programme in Information Technology
Faculty of Computing and Electrical Engineering
This publication is copyrighted. You may download, display and print it for Your own personal use. Commercial use is prohibited.
Date of approval
2016-08-17
The permanent address of the publication is
https://urn.fi/URN:NBN:fi:tty-201608034374
Abstract
The increasing size of databases and data stores means that traditional tasks, such as locating the nearest neighbor of a given data point, become too complex for classical solutions to handle. Exact solutions have been shown to scale poorly with the dimensionality of the data. Approximate nearest neighbor (ANN) search is a practical compromise between accuracy and performance; it is widely applicable and the subject of much research.
Amongst the ANN approaches suggested in recent years, those based on vector quantization stand out, achieving state-of-the-art results. Product quantization (PQ) decomposes vectors into subspaces for separate processing, allowing for fast lookup-based distance calculations. Additive quantization (AQ) drops most of the PQ constraints and currently provides the best search accuracy on image descriptor datasets, but at a higher computational cost. This thesis aims to reduce the complexity of AQ by changing the single most expensive step in the process, that of vector encoding. Both the outstanding search performance and the high cost of AQ stem from its generality; therefore, by imposing novel external constraints it is possible to achieve a better compromise: reduced complexity while retaining the accuracy advantage over other ANN methods.
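The PQ mechanism described above can be illustrated with a minimal sketch: each vector is split into subspaces, a small codebook is trained per subspace, and a query-to-code distance is computed as a sum of per-subspace distances (which a real system would precompute into lookup tables). All function names and parameter values here are illustrative, not from the thesis.

```python
import numpy as np

def train_pq(data, n_subspaces=4, n_centroids=16, n_iter=10, seed=0):
    """Toy product quantizer: run a few k-means iterations
    independently in each subspace of the data."""
    rng = np.random.default_rng(seed)
    sub_dim = data.shape[1] // n_subspaces
    codebooks = []
    for m in range(n_subspaces):
        sub = data[:, m * sub_dim:(m + 1) * sub_dim]
        centroids = sub[rng.choice(len(sub), n_centroids, replace=False)]
        for _ in range(n_iter):
            # assign each point to its nearest centroid, then recompute means
            dists = ((sub[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
            labels = dists.argmin(1)
            for k in range(n_centroids):
                if (labels == k).any():
                    centroids[k] = sub[labels == k].mean(0)
        codebooks.append(centroids)
    return codebooks

def encode_pq(x, codebooks):
    """Encode a vector as one centroid index per subspace."""
    sub_dim = len(x) // len(codebooks)
    return [int(((x[m * sub_dim:(m + 1) * sub_dim] - cb) ** 2).sum(1).argmin())
            for m, cb in enumerate(codebooks)]

def adc_distance(query, code, codebooks):
    """Asymmetric distance: sum of per-subspace squared distances from the
    query to the encoded centroids; in practice these terms come from a
    per-query lookup table, making database scans very cheap."""
    sub_dim = len(query) // len(codebooks)
    return sum(((query[m * sub_dim:(m + 1) * sub_dim] - cb[code[m]]) ** 2).sum()
               for m, cb in enumerate(codebooks))
```

The key point is that the distance to any database vector reduces to `n_subspaces` table lookups and additions, independent of the original dimensionality within each subspace.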
We propose a new encoding method for AQ: pyramid encoding. It requires significantly fewer computations than the original beam search encoding, at the cost of a greedier optimization procedure. As its performance depends heavily on the initialization, the problem of choosing a starting point is also discussed. The results achieved with the proposed method are compared with the current state-of-the-art on two widely used benchmark datasets, GIST1M and SIFT1M, both generated from real-world image data and therefore closely modeling practical applications. AQ with pyramid encoding, in addition to its computational benefits, is shown to achieve similar or better search performance than competing methods. However, its current advantages appear to be limited to data with a certain internal structure. Further analysis of this drawback suggests directions for future work.
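To make the accuracy/cost trade-off concrete, the following sketch shows a plain greedy sequential encoder for additive quantization: each stage picks the full-dimensional codeword that best fits the current residual, then subtracts it. This is a generic greedy baseline of the kind the thesis contrasts with beam search, not the actual pyramid encoding algorithm, whose details are not given in this abstract.

```python
import numpy as np

def greedy_aq_encode(x, codebooks):
    """Greedy sequential AQ encoding (illustrative baseline, NOT the
    thesis's pyramid encoding): at each stage choose the codeword
    minimizing the squared norm of the remaining residual."""
    residual = x.copy()
    code = []
    for cb in codebooks:  # each cb has shape (K, d): full-dimensional codewords
        dists = ((residual[None, :] - cb) ** 2).sum(1)
        k = int(dists.argmin())
        code.append(k)
        residual -= cb[k]  # encode x as a sum of one codeword per codebook
    return code, residual
```

With M codebooks of K codewords each, this costs M·K distance evaluations per vector, versus the much larger candidate set a beam search keeps at every stage; the price is that an early greedy choice cannot be revisited, which is the kind of increased greediness the abstract refers to.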