Features in Deep Learning Architectures with Unsupervised Kernel k-Means

Deep learning and related algorithms have dramatically broken landmark records across a broad range of learning problems in vision, speech, audio, and text processing. Meanwhile, kernel methods have found widespread use thanks to their nonlinear expressive power and elegant optimization formulations. Building on recent progress in learning high-level, class-specific features from unlabeled data, we improve on this line of work by combining nonlinear kernels with a multi-layer (deep) architecture, and we apply the resulting method at scale. Our experiments use k-means with an RBF kernel, but the approach extends straightforwardly to other unsupervised clustering techniques and other reproducing kernel Hilbert spaces. With the proposed method, we discover features distilled from unlabeled images, and we further increase the invariance of the high-level features through pooling.
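To make the core building block concrete, the sketch below shows unsupervised kernel k-means with an RBF kernel used as a feature extractor: each input (e.g. an image patch) is mapped to its distances from the cluster centroids in the induced reproducing kernel Hilbert space, a representation that can then be spatially pooled. This is a minimal illustration under assumed settings (function names, the choice of gamma and k, and the toy patch data are ours), not the authors' implementation.

import numpy as np

def rbf_kernel(X, gamma=0.1):
    # Pairwise RBF (Gaussian) kernel matrix: K_ij = exp(-gamma * ||x_i - x_j||^2)
    sq_norms = np.sum(X ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * X @ X.T
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))

def kernel_kmeans(K, k, n_iter=50, seed=0):
    # Cluster in the RKHS induced by K. The squared distance of point i to the
    # centroid of cluster c is  K_ii - 2*mean_j K_ij + mean_{j,l} K_jl,
    # where j, l range over the current members of cluster c.
    rng = np.random.default_rng(seed)
    n = K.shape[0]
    labels = rng.integers(0, k, size=n)
    for _ in range(n_iter):
        dist = np.zeros((n, k))
        for c in range(k):
            members = np.flatnonzero(labels == c)
            if members.size == 0:
                members = rng.integers(0, n, size=1)  # re-seed an empty cluster
            Kc = K[:, members]
            dist[:, c] = (np.diag(K)
                          - 2.0 * Kc.mean(axis=1)
                          + K[np.ix_(members, members)].mean())
        new_labels = dist.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return labels, dist  # dist doubles as a feature map (distance to each centroid)

# Toy usage: 500 random flattened 8x8 patches become 16-dimensional feature
# vectors of centroid distances, which a deeper layer could pool and re-cluster.
patches = np.random.rand(500, 64)
K = rbf_kernel(patches, gamma=0.05)
labels, features = kernel_kmeans(K, k=16)

Stacking this step, i.e. pooling the resulting features over local regions and clustering again, is one plausible way to realize the multi-layer architecture described above.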
