Efficient Filter Approximation Using the Earth Mover's Distance in Very Large Multimedia Databases with Feature Signatures

The Earth Mover's Distance, proposed in computer vision as a distance-based similarity model that reflects human perceptual similarity, has been widely used in numerous domains for similarity search on both feature histograms and feature signatures. While methods for improving the efficiency of the Earth Mover's Distance have frequently been investigated on feature histograms, little work is known that studies this similarity model on feature signatures, which denote object-specific feature representations. Given a very large multimedia database of feature signatures, how can k-nearest-neighbor queries be processed efficiently using the Earth Mover's Distance? In this paper, we propose an efficient filter approximation technique that lower-bounds the Earth Mover's Distance on feature signatures by restricting the number of earth flows locally. Extensive experiments on real-world data indicate the high efficiency of the proposal, which reduces query processing time by an order of magnitude for high-dimensional feature signatures.
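
To make the filter-and-refine idea concrete, the following is a minimal sketch of k-nearest-neighbor search with a lower-bounding filter. It is not the technique proposed in the paper: instead of restricting earth flows locally, it uses a simpler and well-known relaxation in which every representative ships its whole weight to the nearest representative of the other signature. All names (`relaxed_lower_bound`, `emd_uniform`, `knn_filter_refine`) are hypothetical, signatures are assumed to be lists of (representative, weight) pairs with weights summing to 1, and the ground distance is assumed to be Euclidean. The exact distance `emd_uniform` is a brute-force stand-in that only covers equally sized signatures with uniform weights; a real system would use a transportation solver.

```python
import heapq
import math
from itertools import permutations


def relaxed_lower_bound(x, y):
    """Lower bound of the EMD between two normalized signatures.

    Dropping the capacity constraints of the target signature lets every
    source representative send all of its weight to its nearest target
    representative. The relaxed cost can never exceed the EMD; taking the
    maximum over both directions keeps the bound valid and tightens it.
    """
    def one_sided(src, dst):
        return sum(w * min(math.dist(r, s) for s, _ in dst) for r, w in src)

    return max(one_sided(x, y), one_sided(y, x))


def emd_uniform(x, y):
    """Exact EMD for the special case of equal size and uniform weights.

    In this case an optimal flow is a one-to-one assignment, so a brute-force
    search over permutations suffices (assumption: tiny signatures only).
    """
    n = len(x)
    return min(
        sum(math.dist(x[i][0], y[p[i]][0]) for i in range(n))
        for p in permutations(range(n))
    ) / n


def knn_filter_refine(query, database, k, lower_bound, exact):
    """Multi-step k-NN: rank by lower bound, refine with the exact distance.

    Returns the k nearest (distance, index) pairs and the number of exact
    distance computations that were needed.
    """
    ranked = sorted((lower_bound(query, s), i) for i, s in enumerate(database))
    best = []  # max-heap of the k best exact distances, stored negated
    refined = 0
    for lb, i in ranked:
        # Stop once the lower bound exceeds the current k-th exact distance:
        # no remaining candidate can enter the result.
        if len(best) == k and lb > -best[0][0]:
            break
        d = exact(query, database[i])
        refined += 1
        heapq.heappush(best, (-d, i))
        if len(best) > k:
            heapq.heappop(best)
    return sorted((-nd, i) for nd, i in best), refined
```

Because the filter never overestimates the exact distance, the search returns the same neighbors as a sequential scan while skipping the expensive computation for every candidate whose lower bound already exceeds the current k-th distance.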
