Quickest convergence of online algorithms via data selection

Big data applications demand efficient solvers that provide accurate solutions to large-scale problems at affordable computational cost. By processing data sequentially, online algorithms offer an attractive means of dealing with massive data sets. However, they may incur prohibitive complexity in high-dimensional scenarios if the entire data set is processed. It is therefore necessary to confine computations to an informative subset of the data. While existing approaches have focused on selecting a prescribed fraction of the available data vectors, the present paper exploits this degree of freedom to accelerate the convergence of a generic class of online algorithms, in terms of processing time and computational resources, by balancing the required computational burden against a metric of how informative each datum is. The proposed method is illustrated in a linear regression setting, and simulations corroborate the faster convergence of the recursive least-squares (RLS) algorithm when the novel data selection scheme is employed.
