Select to Better Learn: Fast and Accurate Deep Learning Using Data Selection From Nonlinear Manifolds

Finding a small subset of data whose linear combination spans the other data points, known as the column subset selection problem (CSSP), is an important open problem in computer science with many applications in computer vision and deep learning. Existing studies solve CSSP with polynomial time complexity in the size of the original dataset. A simple and efficient selection algorithm with linear complexity, referred to as spectrum pursuit (SP), is proposed that pursues the spectral components of the dataset using the available sample points. The proposed non-greedy algorithm iteratively finds K data samples whose span is close to that of the first K spectral components of the entire dataset. SP has no parameters to be fine-tuned, and this desirable property makes it problem-independent. The simplicity of SP makes it possible to extend the underlying linear model to more complex models such as nonlinear manifolds and graph-based models. The nonlinear extension of SP is introduced as kernel-SP (KSP). The superiority of the proposed algorithms is demonstrated in a wide range of applications.
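
The selection loop described above can be illustrated with a minimal NumPy sketch. It is written from the abstract's description only and is not the authors' reference implementation. The function name `spectrum_pursuit_sketch`, the fixed iteration count `n_iters`, the random initialization, and the matching rule (absolute correlation with unit-norm columns) are all assumptions made for illustration. The per-step full SVD is chosen for readability and does not reflect the linear complexity claimed for SP.

```python
import numpy as np


def spectrum_pursuit_sketch(A, K, n_iters=10, seed=0):
    """Pick K column indices of A (features x samples).

    Hypothetical sketch: each selected column is revisited in turn and
    replaced by the sample that best matches the leading spectral
    component of the data left unexplained by the other selections.
    """
    rng = np.random.default_rng(seed)
    M = A.shape[1]

    # Unit-norm copies of the columns, used only for matching (assumption).
    norms = np.linalg.norm(A, axis=0)
    norms[norms == 0] = 1.0
    A_unit = A / norms

    # Random initial selection (assumption).
    selected = [int(i) for i in rng.choice(M, size=K, replace=False)]

    for _ in range(n_iters):
        for k in range(K):
            others = selected[:k] + selected[k + 1:]
            if others:
                # Remove the part of the data spanned by the other selections.
                Q, _ = np.linalg.qr(A[:, others])
                E = A - Q @ (Q.T @ A)
            else:
                E = A
            # Leading left singular vector of the residual.
            U, _, _ = np.linalg.svd(E, full_matrices=False)
            u = U[:, 0]
            # Match that spectral component to an actual sample.
            scores = np.abs(u @ A_unit)
            scores[others] = -np.inf  # keep the K indices distinct
            selected[k] = int(np.argmax(scores))
    return selected
```

Because a selection can be revised on a later pass, the loop is non-greedy in the sense used above: no choice is final until the iterations end.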
