Gaussian bandwidth selection for manifold learning and classification

Kernel methods play a critical role in many machine learning algorithms. They are useful in manifold learning, classification, clustering and other data analysis tasks. The choice of the kernel’s scale parameter, also referred to as the kernel’s bandwidth, strongly affects the performance of the task at hand. We propose to set a scale parameter that is tailored to one of two types of tasks: classification and manifold learning. For manifold learning, we seek the scale that best captures the manifold’s intrinsic dimension. For classification, we propose three methods for estimating the scale, each of which optimizes the classification results in a different sense. The proposed frameworks are evaluated on artificial and real datasets. The results show a high correlation between optimal classification rates and the estimated scales. Finally, we demonstrate the approach on a seismic event classification task.
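
To make the role of the bandwidth concrete, the following is a minimal sketch of a Gaussian kernel matrix and of how its entries change with the scale. It is not one of the estimation methods proposed here: the parameterization exp(-||x_i - x_j||^2 / eps), the function names, and the median-distance heuristic used as a baseline are all assumptions made for illustration.

```python
import numpy as np


def squared_distances(X):
    """Pairwise squared Euclidean distances between the rows of X."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    # Clip small negative values caused by floating-point round-off.
    return np.maximum(d2, 0.0)


def gaussian_kernel(X, eps):
    """Kernel matrix K[i, j] = exp(-||x_i - x_j||^2 / eps) (assumed form)."""
    return np.exp(-squared_distances(X) / eps)


def median_heuristic(X):
    """Generic baseline scale: median of the pairwise squared distances.

    This is a common rule of thumb, not a method from this paper.
    """
    d2 = squared_distances(X)
    upper = np.triu_indices(X.shape[0], k=1)
    return float(np.median(d2[upper]))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 3))
    eps0 = median_heuristic(X)
    for eps in (1e-3 * eps0, eps0, 1e3 * eps0):
        K = gaussian_kernel(X, eps)
        off_diag = (K.sum() - np.trace(K)) / (K.size - K.shape[0])
        print(f"eps={eps:.3g}  mean off-diagonal affinity={off_diag:.3f}")
```

With a very small scale the kernel matrix approaches the identity, so every point looks isolated; with a very large scale all affinities approach one, so the geometry is washed out. A task-specific scale has to fall between these two extremes.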
