Ensemble Manifold Segmentation for Model Distillation and Semi-supervised Learning

Manifold theory has been central to many learning methods. However, training modern CNNs with manifold structure has not received due attention, mainly because it is inconvenient to impose manifold structure on a CNN architecture. In this paper we present ManifoldNet, a novel method that encourages learning of manifold-aware representations. Our approach segments the input manifold into a set of fragments. By assigning the corresponding segmentation id as a pseudo label to every sample, we convert the problem of preserving the local manifold structure into a point-wise classification task. Because the segmentation is unsupervised, it tends to be noisy. We mitigate this by introducing ensemble manifold segmentation (EMS). EMS accounts for the manifold structure by dividing the training data into an ensemble of classification training sets, each of which groups samples in local proximity. CNNs are trained on these ensembles under a multi-task learning framework to conform to the manifold. ManifoldNet can be trained with the pseudo labels alone or together with task-specific labels. We evaluate ManifoldNet on two tasks: network imitation (distillation) and semi-supervised learning. Our experiments show that the manifold structures are effectively utilized for both unsupervised and semi-supervised learning.
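The idea of turning manifold fragments into pseudo labels can be illustrated with a minimal sketch. The abstract does not say how the segmentation is computed, so the code below swaps in plain k-means with different random initializations as a stand-in. The function names `kmeans_labels` and `ensemble_pseudo_labels` and the parameters `n_segmentations` and `k` are hypothetical and chosen only for illustration. Each column of the result is one noisy segmentation, which a network could treat as one classification task in a multi-task setup.

```python
import numpy as np


def kmeans_labels(x, k, rng, n_iter=20):
    """Plain Lloyd's k-means (a stand-in segmenter, not the paper's method).

    Returns one fragment id in [0, k) per sample.
    """
    # Initialize centers from k distinct samples.
    centers = x[rng.choice(len(x), size=k, replace=False)]
    for _ in range(n_iter):
        # Squared Euclidean distance from every sample to every center.
        d = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            members = x[labels == j]
            if len(members) > 0:  # keep the old center if a fragment empties
                centers[j] = members.mean(axis=0)
    return labels


def ensemble_pseudo_labels(x, n_segmentations=5, k=8, seed=0):
    """Build an ensemble of segmentations; column m holds the pseudo labels
    of the m-th segmentation (assumed here to be one task per column)."""
    rng = np.random.default_rng(seed)
    columns = [kmeans_labels(x, k, rng) for _ in range(n_segmentations)]
    return np.stack(columns, axis=1)


# Usage on random features standing in for real image descriptors.
features = np.random.default_rng(1).normal(size=(200, 16))
pseudo = ensemble_pseudo_labels(features)
print(pseudo.shape)  # (200, 5)
```

Because every segmentation starts from a different random initialization, a single one may split neighboring samples arbitrarily, while samples that are truly close tend to share a fragment id in most columns. That is the intuition behind using an ensemble rather than one segmentation.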
