Hyperplane Queries in a Feature-Space M-tree for Speeding up Active Learning

In content-based retrieval, relevance feedback (RF) is a notable method for reducing the “semantic gap” between the low-level features that describe the content and the usually higher-level meaning of the user’s target. While recent RF methods based on active learning can identify complex target classes after relatively few iterations, they can be quite slow on very large databases. To address this scalability issue for active RF, we put forward a method that builds an M-tree in the feature space associated with a kernel function and performs approximate kNN hyperplane queries with this feature-space M-tree. Experiments on two image databases show that a significant speedup can be achieved, at the expense of a limited increase in the number of feedback rounds.

Keywords: scalability, content-based retrieval, relevance feedback, M-tree, approximate search, hyperplane queries
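The two ingredients named in the abstract, a metric in the kernel feature space and a kNN query relative to a hyperplane, can be illustrated with a minimal sketch. It is not the method of the paper: it uses an exhaustive linear scan where the paper uses an M-tree with approximate search, and the RBF kernel, the `gamma` value, and all function names are assumptions chosen for illustration. The distance follows the standard kernel identity d(x, y)^2 = K(x, x) - 2K(x, y) + K(y, y). Points are ranked by |f(x)|, which orders them the same way as the true distance to the hyperplane because the normalising factor 1/||w|| is constant.

```python
import numpy as np


def rbf_kernel(x, y, gamma=0.5):
    """Gaussian RBF kernel (an assumed choice; any positive definite kernel works)."""
    diff = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    return float(np.exp(-gamma * np.dot(diff, diff)))


def feature_space_distance(x, y, kernel=rbf_kernel):
    """Distance between the feature-space images of x and y, from kernel values only."""
    squared = kernel(x, x) - 2.0 * kernel(x, y) + kernel(y, y)
    return float(np.sqrt(max(squared, 0.0)))  # clamp tiny negative rounding errors


def decision_value(x, support_vectors, coeffs, bias, kernel=rbf_kernel):
    """f(x) = sum_i coeffs[i] * K(s_i, x) + bias; the hyperplane is the set f(x) = 0."""
    return sum(a * kernel(s, x) for a, s in zip(coeffs, support_vectors)) + bias


def knn_to_hyperplane(points, support_vectors, coeffs, bias, k, kernel=rbf_kernel):
    """Exhaustive baseline: indices of the k points with the smallest |f(x)|."""
    scores = [abs(decision_value(p, support_vectors, coeffs, bias, kernel))
              for p in points]
    order = np.argsort(scores, kind="stable")[:k]
    return [int(i) for i in order]


# Hypothetical toy data: one positive and one negative support vector.
support_vectors = [[0.0, 0.0], [2.0, 0.0]]
coeffs = [1.0, -1.0]
bias = 0.0
points = [[0.0, 0.0], [1.0, 0.0], [1.2, 0.0], [2.0, 0.0]]

nearest = knn_to_hyperplane(points, support_vectors, coeffs, bias, k=2)
```

In this toy setting the point `[1.0, 0.0]` lies exactly on the hyperplane, so it is ranked first. An index such as an M-tree over `feature_space_distance` aims to return the same kind of answer without scoring every point in the database.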
