Laplacian Regularized D-Optimal Design for Active Learning and Its Application to Image Retrieval

In computer vision and pattern recognition, one is increasingly confronted with very large data sets. Labels are usually expensive to obtain, so the challenge is to determine which unlabeled samples would be the most informative (i.e., improve the classifier the most) if they were labeled and used as training samples. In particular, we consider the problem of active learning of a regression model in the context of experimental design. Classical optimal experimental design approaches are based on the least squares error over the measured samples only; they fail to take the unmeasured samples into account. In this paper, we propose a novel active learning algorithm that operates over graphs. Our algorithm is based on a graph Laplacian regularized regression model that simultaneously minimizes the least squares error on the measured samples and preserves the local geometrical structure of the data space. This structure is described by the graph Laplacian of a nearest neighbor graph constructed over the data. We discuss how results from the field of optimal experimental design may be used to guide the selection of a subset of data points that gives us the most information. Experiments demonstrate the superior performance of the proposed algorithm in comparison with conventional algorithms.
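The idea described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes a linear regression model, a symmetric nearest neighbor graph with 0/1 edge weights, and a simple greedy sequential selection rule. The function names and the parameters `k`, `lam_graph`, and `lam_ridge` are hypothetical and chosen for illustration only. Each step picks the sample that most increases the determinant of the Laplacian regularized information matrix, using the matrix determinant lemma and a Sherman-Morrison update of its inverse.

```python
import numpy as np


def knn_graph_laplacian(X, k=5):
    """Unnormalized Laplacian L = D - W of a symmetric kNN graph.

    X holds one sample per row. Edge weights are 0/1 (an assumption;
    heat-kernel weights would be another reasonable choice).
    """
    n = X.shape[0]
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T  # squared distances
    np.fill_diagonal(d2, np.inf)                    # exclude self-neighbors
    idx = np.argsort(d2, axis=1)[:, :k]             # k nearest per row
    W = np.zeros((n, n))
    W[np.repeat(np.arange(n), k), idx.ravel()] = 1.0
    W = np.maximum(W, W.T)                          # symmetrize
    return np.diag(W.sum(axis=1)) - W


def greedy_laplacian_d_optimal(X, m, k=5, lam_graph=0.1, lam_ridge=0.01):
    """Greedily select m row indices of X.

    Each step maximizes det(M + x x^T), where M starts as
    lam_graph * X^T L X + lam_ridge * I and accumulates x x^T
    for every sample already selected.
    """
    n, d = X.shape
    L = knn_graph_laplacian(X, k)
    M = lam_graph * X.T @ L @ X + lam_ridge * np.eye(d)
    M_inv = np.linalg.inv(M)
    selected = []
    available = np.ones(n, dtype=bool)
    for _ in range(m):
        # Matrix determinant lemma:
        # det(M + x x^T) = det(M) * (1 + x^T M^{-1} x)
        gains = np.einsum("ij,jk,ik->i", X, M_inv, X)
        gains[~available] = -np.inf
        i = int(np.argmax(gains))
        selected.append(i)
        available[i] = False
        # Sherman-Morrison rank-one update of the inverse
        Mx = M_inv @ X[i]
        M_inv -= np.outer(Mx, Mx) / (1.0 + X[i] @ Mx)
    return selected
```

Calling `greedy_laplacian_d_optimal(X, m=10)` on an `(n, d)` feature matrix would return ten row indices to send for labeling. The ridge term keeps the information matrix invertible before any sample has been selected; the graph term is what lets unmeasured samples influence the choice.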
