Scalable Semi-Supervised Learning by Efficient Anchor Graph Regularization

Many graph-based semi-supervised learning methods, such as Anchor Graph Regularization (AGR), have been proposed to cope with the rapidly increasing size of datasets. AGR builds a regularization framework by exploring the underlying structure of the whole dataset with both datapoints and anchors. Nevertheless, AGR still has limitations in its two components: (1) in anchor graph construction, the estimation of the local weights between each datapoint and its neighboring anchors can be biased and relatively slow; and (2) in anchor graph regularization, the adjacency matrix that estimates the relationship between datapoints is not sufficiently effective. In this paper, we develop Efficient Anchor Graph Regularization (EAGR) to tackle these issues. First, we propose a fast local anchor embedding method, which reformulates the optimization of the local weights and obtains an analytical solution. We show that this method better reconstructs datapoints from anchors and speeds up the optimization process. Second, we propose a new adjacency matrix among anchors that considers their commonly linked datapoints, which leads to a more effective normalized graph Laplacian over anchors. We show that, with the novel local weight estimation and normalized graph Laplacian, EAGR achieves better classification accuracy at much lower computational cost. Experimental results on several publicly available datasets demonstrate the effectiveness of our approach.
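For orientation, the two components named above (anchor graph construction and anchor graph regularization) can be sketched in a few lines of NumPy. This is a minimal illustration of a baseline AGR-style pipeline, not the EAGR method itself. It assumes Gaussian-kernel local weights in place of any optimized local anchor embedding, a small ridge term for numerical stability, and hand-picked anchors. The function names (`anchor_weights`, `agr_fit_predict`) and parameters (`s`, `sigma`, `gamma`) are hypothetical and chosen only for this example.

```python
import numpy as np


def anchor_weights(X, anchors, s=3, sigma=1.0):
    """Local weights Z (n x m): each point links to its s nearest anchors.

    Assumption: Gaussian-kernel weights, normalized so each row sums to 1.
    """
    # Squared Euclidean distances between every point and every anchor.
    d2 = ((X[:, None, :] - anchors[None, :, :]) ** 2).sum(axis=2)
    nearest = np.argsort(d2, axis=1)[:, :s]
    rows = np.arange(X.shape[0])[:, None]
    Z = np.zeros_like(d2)
    Z[rows, nearest] = np.exp(-d2[rows, nearest] / (2.0 * sigma ** 2))
    return Z / Z.sum(axis=1, keepdims=True)


def agr_fit_predict(Z, labeled_idx, y_labeled, n_classes, gamma=0.01):
    """Solve for anchor label scores A, then label every point via Z @ A.

    The datapoint adjacency is W = Z diag(lam)^-1 Z^T, which yields the
    reduced (m x m) Laplacian over anchors used as the regularizer.
    Class-mass normalization of the scores is omitted for brevity.
    """
    m = Z.shape[1]
    lam = np.maximum(Z.sum(axis=0), 1e-12)  # anchor degrees, guarded against 0
    ZtZ = Z.T @ Z
    L_reduced = ZtZ - ZtZ @ np.diag(1.0 / lam) @ ZtZ
    Zl = Z[labeled_idx]
    Y = np.eye(n_classes)[y_labeled]  # one-hot labels, shape (l, c)
    # Regularized least squares; the 1e-8 ridge is an added assumption.
    lhs = Zl.T @ Zl + gamma * L_reduced + 1e-8 * np.eye(m)
    A = np.linalg.solve(lhs, Zl.T @ Y)
    return (Z @ A).argmax(axis=1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal(0.0, 0.5, (50, 2)), rng.normal(10.0, 0.5, (50, 2))])
    anchors = X[[0, 1, 50, 51]]  # toy choice; k-means centers are a common alternative
    Z = anchor_weights(X, anchors, s=2)
    pred = agr_fit_predict(Z, np.array([0, 50]), np.array([0, 1]), n_classes=2)
    print(pred)
```

Because the linear system is only m x m, where m is the number of anchors, the cost of the solve does not grow with the number of datapoints. This is what makes anchor-based regularization attractive for large datasets.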
