Robust Local Scaling Using Conditional Quantiles of Graph Similarities

Spectral analysis of neighborhood graphs is one of the most widely used techniques for exploratory data analysis, with applications ranging from machine learning to the social sciences. In such applications, it is typical to first encode relationships between the data samples using an appropriate similarity function. Popular neighborhood construction techniques such as k-nearest neighbor (k-NN) graphs are known to be very sensitive to the choice of parameters and, more importantly, susceptible to noise and varying sample densities. In this paper, we propose the use of quantile analysis to obtain local scale estimates for neighborhood graph construction. To this end, we build an auto-encoding neural network approach for inferring conditional quantiles of a similarity function, which are subsequently used to obtain robust estimates of the local scales. Besides being highly resilient to noise and outlying data, the proposed approach, unlike several existing methods, does not require extensive parameter tuning. Using applications in spectral clustering and single-example label propagation, we show that the proposed neighborhood graphs outperform existing locally scaled graph construction approaches.
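To make the idea of quantile-based local scaling concrete, the following is a minimal sketch under stated assumptions, not the method proposed in the paper. It replaces the auto-encoding network with plain empirical quantiles: each sample's local scale is taken as the tau-quantile of its Euclidean distances to all other samples, and the affinity uses the familiar locally scaled Gaussian form exp(-d_ij^2 / (sigma_i sigma_j)). The function names, the choice of distances rather than a general similarity function, and the default tau are all illustrative assumptions. The pinball loss is included because it is the standard objective minimized by a quantile estimate.

```python
import numpy as np


def pinball_loss(y, q, tau):
    """Mean quantile (pinball) loss of the constant prediction q at level tau."""
    diff = np.asarray(y, dtype=float) - q
    # tau * diff when y >= q, (tau - 1) * diff when y < q
    return float(np.mean(np.maximum(tau * diff, (tau - 1.0) * diff)))


def quantile_scaled_affinity(X, tau=0.1, eps=1e-12):
    """Locally scaled Gaussian affinity with empirical-quantile scales.

    Assumption: sigma_i is the tau-quantile of the distances from sample i
    to every other sample. This stands in for the learned conditional
    quantiles described in the abstract and is NOT the paper's estimator.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]

    # Pairwise Euclidean distances, shape (n, n).
    diff = X[:, None, :] - X[None, :, :]
    D = np.sqrt((diff ** 2).sum(axis=-1))

    # Drop the zero self-distances so they do not bias the quantile.
    off_diag = D[~np.eye(n, dtype=bool)].reshape(n, n - 1)

    # One local scale per sample; eps guards against duplicate points.
    sigma = np.quantile(off_diag, tau, axis=1) + eps

    # Affinity W_ij = exp(-d_ij^2 / (sigma_i * sigma_j)), no self-loops.
    W = np.exp(-(D ** 2) / (sigma[:, None] * sigma[None, :]))
    np.fill_diagonal(W, 0.0)
    return W, sigma


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Two clusters with different densities (synthetic, for illustration only).
    X = np.vstack([rng.normal(0.0, 0.2, size=(30, 2)),
                   rng.normal(3.0, 1.0, size=(30, 2))])
    W, sigma = quantile_scaled_affinity(X, tau=0.1)
    print(W.shape, sigma[:30].mean() < sigma[30:].mean())
```

The resulting matrix W can be passed to any spectral clustering or label propagation routine that accepts a precomputed affinity. A small tau keeps each scale tied to a sample's near neighborhood, while a quantile over many distances is less affected by a single outlying neighbor than the distance to one fixed k-th neighbor.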
