Predictive Power of Nearest Neighbors Algorithm under Random Perturbation

We consider a data-corruption scenario for the classical $k$ Nearest Neighbors ($k$-NN) algorithm in which the testing data are randomly perturbed. Under this scenario, we carefully characterize the impact of the corruption level on the asymptotic regret. In particular, our theoretical analysis reveals a phase-transition phenomenon: when the corruption level $\omega$ is below a critical order (the small-$\omega$ regime), the asymptotic regret remains unchanged; when it is beyond that order (the large-$\omega$ regime), the asymptotic regret deteriorates polynomially. Surprisingly, we obtain a negative result: the classical noise-injection approach does not improve testing performance in the beginning stage of the large-$\omega$ regime, even at the level of the multiplicative constant of the asymptotic regret. As a technical by-product, we prove that, under different model assumptions, the pre-processed 1-NN proposed in \cite{xue2017achieving} achieves at most a sub-optimal rate when the data dimension $d>4$, even if $k$ is chosen optimally in the pre-processing step.
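The corruption scenario above can be illustrated with a minimal simulation sketch (not the paper's exact setting: the distributions, sample sizes, and noise model below are illustrative assumptions). A brute-force $k$-NN classifier is trained on clean data and evaluated on test points perturbed by additive Gaussian noise of level $\omega$; a small $\omega$ barely changes the accuracy, while a large $\omega$ degrades it noticeably.

```python
import numpy as np

rng = np.random.default_rng(0)

def knn_predict(X_train, y_train, X_test, k):
    # Brute-force k-NN: majority vote among the k nearest training points.
    dists = np.linalg.norm(X_test[:, None, :] - X_train[None, :, :], axis=-1)
    nearest = np.argsort(dists, axis=1)[:, :k]
    return (y_train[nearest].mean(axis=1) > 0.5).astype(int)

# Two well-separated Gaussian classes in d = 2 (illustrative choice).
n = 500
X_train = np.vstack([rng.normal(-1.0, 1.0, size=(n, 2)),
                     rng.normal(+1.0, 1.0, size=(n, 2))])
y_train = np.r_[np.zeros(n, dtype=int), np.ones(n, dtype=int)]

X_test = np.vstack([rng.normal(-1.0, 1.0, size=(n, 2)),
                    rng.normal(+1.0, 1.0, size=(n, 2))])
y_test = np.r_[np.zeros(n, dtype=int), np.ones(n, dtype=int)]

# Perturb only the testing data, at several corruption levels omega.
k = 25
acc = {}
for omega in [0.0, 0.1, 2.0]:
    X_pert = X_test + omega * rng.normal(size=X_test.shape)
    acc[omega] = (knn_predict(X_train, y_train, X_pert, k) == y_test).mean()
```

In this sketch `acc[0.1]` stays close to the clean-test accuracy `acc[0.0]`, while `acc[2.0]` drops well below it, mirroring the small-$\omega$/large-$\omega$ dichotomy described above (the critical order itself depends on $n$, $d$, and the smoothness assumptions analyzed in the paper).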

[1] Aryeh Kontorovich et al. Fast and Bayes-consistent nearest neighbors. AISTATS, 2020.

[2] Maya R. Gupta et al. Deep k-NN for Noisy Labels. ICML, 2020.

[3] Cyrus Rashtchian et al. Robustness for Non-Parametric Classification: A Generic Attack and Defense. AISTATS, 2020.

[4] Wenyuan Xu et al. DolphinAttack: Inaudible Voice Commands. CCS, 2017.

[5] Ananthram Swami et al. Crafting adversarial input sequences for recurrent neural networks. IEEE MILCOM, 2016.

[6] Patrick D. McDaniel et al. Deep k-Nearest Neighbors: Towards Confident, Interpretable and Robust Deep Learning. arXiv, 2018.

[7] I-Cheng Yeh et al. The comparisons of data mining techniques for the predictive accuracy of probability of default of credit card clients. Expert Systems with Applications, 2009.

[8] Ananthram Swami et al. The Limitations of Deep Learning in Adversarial Settings. IEEE European Symposium on Security and Privacy (EuroS&P), 2016.

[9] Guang Cheng et al. Statistical Guarantees of Distributed Nearest Neighbor Classification. NeurIPS, 2020.

[10] Wei Sun et al. Stabilized Nearest Neighbor Classifier and its Statistical Properties. Journal of the American Statistical Association, 2014.

[11] A. Tsybakov et al. Fast learning rates for plug-in classifiers. arXiv:0708.2321, 2007.

[12] Richard G. Baraniuk et al. Adaptive Estimation for Approximate k-Nearest-Neighbor Computations. AISTATS, 2019.

[13] Lirong Xue et al. Achieving the time of 1-NN, but the accuracy of k-NN. AISTATS, 2017.

[14] Yingying Fan et al. Classification with imperfect training labels. Biometrika, 2018.

[15] Joshua D. Knowles et al. Fifty years of pulsar candidate selection: from simple filters to a new principled real-time classification approach. Monthly Notices of the Royal Astronomical Society, 2016.

[16] Lei Chen et al. Local Distribution in Neighborhood for Classification. arXiv, 2018.

[17] Patrick D. McDaniel et al. Adversarial Examples for Malware Detection. ESORICS, 2017.

[18] R. Samworth. Optimal weighted nearest neighbour classifiers. arXiv:1101.5783, 2011.

[19] Jonathon Shlens et al. Explaining and Harnessing Adversarial Examples. ICLR, 2014.

[20] John C. Duchi et al. Certifying Some Distributional Robustness with Principled Adversarial Training. ICLR, 2017.

[21] Ananthram Swami et al. Practical Black-Box Attacks against Machine Learning. AsiaCCS, 2016.

[22] Timothy I. Cannings et al. Local nearest neighbour classification with applications to semi-supervised learning. The Annals of Statistics, 2017.

[23] Jean-Yves Audibert. Classification under polynomial entropy and margin assumptions and randomized estimators. 2004.

[24] Somesh Jha et al. Analyzing the Robustness of Nearest Neighbors to Adversarial Examples. ICML, 2017.

[25] Alex Krizhevsky et al. Learning Multiple Layers of Features from Tiny Images. 2009.

[26] Mikhail Belkin et al. Overfitting or perfect fitting? Risk bounds for classification and regression rules that interpolate. NeurIPS, 2018.

[27] Samy Bengio et al. Adversarial Machine Learning at Scale. ICLR, 2016.

[28] Hamza Fawzi et al. Adversarial vulnerability for any classifier. NeurIPS, 2018.

[29] Yaoliang Yu et al. Additive Approximations in High Dimensional Nonparametric Regression via the SALSA. ICML, 2016.

[30] Guang Cheng et al. Distributed Nearest Neighbor Classification. arXiv:1812.05005, 2018.

[31] Sanjoy Dasgupta et al. Rates of Convergence for Nearest Neighbor Classification. NIPS, 2014.

[32] Shay Moran et al. An adaptive nearest neighbor rule for classification. NeurIPS, 2019.

[33] Cyrus Rashtchian et al. Adversarial Robustness Through Local Lipschitzness. arXiv, 2020.

[34] Guang Cheng et al. Statistical Optimality of Interpolated Nearest Neighbor Algorithms. arXiv, 2018.

[35] Guang Cheng et al. Rates of Convergence for Large-scale Nearest Neighbor Classification. NeurIPS, 2019.

[36] Stefan Roth et al. Neural Nearest Neighbors Networks. NeurIPS, 2018.

[37] Aleksander Madry et al. Towards Deep Learning Models Resistant to Adversarial Attacks. ICLR, 2017.

[38] Seyed-Mohsen Moosavi-Dezfooli et al. Robustness of classifiers: from adversarial to random noise. NIPS, 2016.

[39] Ata Kabán et al. Fast Rates for a kNN Classifier Robust to Unknown Asymmetric Label Noise. ICML, 2019.

[40] Ata Kabán et al. Classification with unknown class conditional label noise on non-compact feature spaces. COLT, 2019.