SSD Technology Enables Dynamic Maintenance of Persistent High-Dimensional Indexes

As multimedia collections keep growing, industrial applications need high-dimensional indexes that can be maintained dynamically and persistently. Since HDD performance is the main bottleneck in index maintenance, we investigate the impact of SSD technology. We use the NV-tree to drive our analysis because it is the only high-dimensional index in the literature that has seriously addressed updates. Our simulation model indicates that an index of 1.5 billion descriptors can be built dynamically on a high-end SSD in just over four hours of disk time, which is more than 500x faster than on a high-end HDD. A relatively small investment in SSD technology can thus make dynamic and persistent high-dimensional indexes feasible.
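To put the reported figures in perspective, the following back-of-envelope sketch derives the implied HDD disk time and SSD insertion rate. It uses only the numbers stated in the abstract; rounding "just over four hours" down to 4 hours and "more than 500x" down to 500 is an assumption made here, so the results are rough bounds rather than values from the simulation model.

```python
# Figures from the abstract. Rounding "just over four hours" to 4.0 and
# "more than 500x" to 500.0 is an assumption made for this sketch.
DESCRIPTORS = 1.5e9   # descriptors in the index
SSD_HOURS = 4.0       # SSD disk time (lower bound)
SPEEDUP = 500.0       # SSD vs. HDD speedup (lower bound)

# Implied HDD disk time: at least SSD time multiplied by the speedup.
hdd_hours = SSD_HOURS * SPEEDUP
hdd_days = hdd_hours / 24

# Implied SSD insertion rate: at most descriptors divided by SSD seconds.
ssd_rate = DESCRIPTORS / (SSD_HOURS * 3600)

print(f"HDD disk time: at least {hdd_hours:.0f} hours (about {hdd_days:.0f} days)")
print(f"SSD insertion rate: at most about {ssd_rate:,.0f} descriptors per second")
```

Under these assumptions, the same build would take at least about 2,000 hours (roughly 83 days) of disk time on the HDD, which is why HDD-based dynamic maintenance is impractical at this scale.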
