Adaptive censoring for large-scale regressions

Even in the big data era, a significant percentage of the accrued data can be overlooked while maintaining reasonable quality of statistical inference at affordable complexity. By capitalizing on data redundancy, interval censoring is leveraged here to cope with the scarcity of resources needed for data exchange, storage, and processing. By appropriately modifying least-squares regression, first- and second-order algorithms with complementary strengths that operate on censored data are developed for large-scale regressions. Theoretical analysis and simulated tests corroborate their efficacy relative to contemporary competing alternatives.
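To make the idea of a first-order algorithm operating on censored data concrete, the following is a minimal sketch, not the algorithm developed in the paper. It assumes a simple rule: a sample is censored (skipped) whenever its target falls within an interval of half-width `threshold` around the current prediction, and otherwise triggers a standard least-mean-squares (LMS) update. The names `censored_lms`, `make_data`, `step`, and `threshold`, as well as the synthetic data, are hypothetical and chosen only for illustration.

```python
import random


def censored_lms(rows, targets, step, threshold):
    """LMS that skips (censors) samples whose prediction residual is small.

    Assumed rule, for illustration only: a sample is used only if
    |y - x^T theta| > threshold. Returns the final weights and the
    number of samples that actually triggered an update.
    """
    dim = len(rows[0])
    theta = [0.0] * dim
    used = 0
    for x, y in zip(rows, targets):
        residual = y - sum(t * xi for t, xi in zip(theta, x))
        if abs(residual) <= threshold:
            continue  # censored: treated as uninformative, no update
        # Standard LMS (stochastic-gradient) step on the squared error.
        theta = [t + step * residual * xi for t, xi in zip(theta, x)]
        used += 1
    return theta, used


def make_data(n, true_theta, noise_std, seed=0):
    """Synthetic linear-regression data with Gaussian regressors and noise."""
    rng = random.Random(seed)
    rows = [[rng.gauss(0.0, 1.0) for _ in true_theta] for _ in range(n)]
    targets = [
        sum(t * xi for t, xi in zip(true_theta, x)) + rng.gauss(0.0, noise_std)
        for x in rows
    ]
    return rows, targets


true_theta = [1.0, -2.0, 0.5]
rows, targets = make_data(5000, true_theta, noise_std=0.1)
theta_hat, used = censored_lms(rows, targets, step=0.05, threshold=0.2)
print("estimate:", [round(t, 3) for t in theta_hat])
print("samples used:", used, "of", len(rows))
```

In this sketch the threshold trades accuracy for resources: a larger `threshold` censors more samples and saves more updates, at the cost of a wider dead zone around the true parameters.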
