Robust Regression via Heuristic Hard Thresholding

Data noise and corruption have recently drawn increasing attention to Robust Least Squares Regression (RLSR), which addresses the fundamental problem of learning reliable regression coefficients when response variables can be arbitrarily corrupted. Until now, several important challenges could not be handled concurrently: 1) exact recovery guarantees for the regression coefficients; 2) difficulty in estimating the corruption-ratio parameter; and 3) scalability to massive datasets. This paper proposes a novel Robust Least squares regression algorithm via Heuristic Hard thresholding (RLHH) that addresses all of these challenges concurrently. Specifically, the algorithm alternately optimizes the regression coefficients and estimates the optimal uncorrupted set via heuristic hard thresholding, without a corruption-ratio parameter, until it converges. We also prove that our algorithm enjoys guarantees analogous to those of state-of-the-art methods in terms of convergence rates and recovery. Extensive experiments demonstrate that our method outperforms existing methods in recovering both the regression coefficients and the uncorrupted set, with highly competitive efficiency.
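The alternating scheme described above can be sketched as follows. This is a hypothetical illustration, not the paper's exact RLHH algorithm: the paper's heuristic hard thresholding (which needs no corruption-ratio parameter) is stood in for here by a simple largest-residual-gap heuristic, and all function and variable names are our own.

```python
import numpy as np

def alternating_hard_threshold(X, y, n_iters=50, tol=1e-6):
    """Sketch of robust least squares by alternating minimization.

    Alternates between (1) ordinary least squares on the current
    estimate of the uncorrupted set and (2) re-estimating that set by
    hard-thresholding the residuals. The threshold is chosen at the
    largest gap among the upper half of the sorted residuals -- a
    stand-in heuristic, since no corruption ratio is given.
    """
    n, p = X.shape
    S = np.arange(n)                     # start: assume all points are clean
    beta = np.zeros(p)
    for _ in range(n_iters):
        # Step 1: least squares restricted to the current clean set
        beta_new, *_ = np.linalg.lstsq(X[S], y[S], rcond=None)
        # Step 2: rank all points by residual magnitude
        r = np.abs(y - X @ beta_new)
        order = np.argsort(r)
        sorted_r = r[order]
        # Cut at the largest jump in the upper half of the residuals
        # (assumed stand-in for the paper's heuristic threshold)
        gaps = np.diff(sorted_r[n // 2:])
        cut = n // 2 + int(np.argmax(gaps)) + 1
        S = order[:cut]                  # keep the points below the jump
        if np.linalg.norm(beta_new - beta) < tol:
            beta = beta_new
            break
        beta = beta_new
    return beta, S
```

On synthetic data with a minority of responses shifted by a large constant, the corrupted points separate cleanly from the rest after the first refit, so the alternation typically stabilizes within a few iterations.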
