Distributed quantile regression for massive heterogeneous data

Abstract Massive data sets pose great challenges to data analysis because of their heterogeneous structure and limited computer memory. Jordan et al. (2019, Journal of the American Statistical Association) proposed a communication-efficient surrogate likelihood (CSL) method for distributed learning problems. However, their method cannot be applied directly to quantile regression because the loss function in quantile regression does not satisfy the smoothness requirement of the CSL method. In this paper, we extend the CSL method so that it applies to quantile regression problems. The key idea is to construct a surrogate loss function that depends on the local data only through subgradients of the loss function. The alternating direction method of multipliers (ADMM) algorithm is used to address the computational issues caused by the non-smooth loss function. Our theoretical analysis establishes consistency and asymptotic normality for the proposed method. Simulation studies and applications to real data show that our method works well.
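The surrogate-loss idea described in the abstract can be sketched in code. The snippet below is a minimal illustration, not the paper's exact formulation: it implements the quantile check loss, a subgradient of the empirical quantile loss, and a CSL-style surrogate in which the local loss on the first machine is corrected by the gap between the local subgradient and a globally averaged subgradient at a pilot estimate. All function names and the specific choice of subgradient at zero residuals are illustrative assumptions.

```python
import numpy as np

def check_loss(u, tau):
    """Quantile check loss rho_tau(u) = u * (tau - 1{u < 0})."""
    return u * (tau - (u < 0))

def subgradient(X, y, beta, tau):
    """A subgradient of the empirical quantile loss (1/n) sum_i rho_tau(y_i - x_i' beta).

    At a residual of exactly zero, any value in [tau - 1, tau] is a valid
    subgradient; here we take tau (an illustrative convention).
    """
    r = y - X @ beta
    return -X.T @ (tau - (r < 0)) / len(y)

def surrogate_loss(X1, y1, beta, beta_pilot, global_grad, tau):
    """CSL-style surrogate loss on the first machine's data (X1, y1).

    L_tilde(beta) = L1(beta) - <g1(beta_pilot) - gN(beta_pilot), beta>,
    where g1 is the local subgradient and gN (global_grad) is the average of
    subgradients communicated from all machines at the pilot estimate.
    """
    local = check_loss(y1 - X1 @ beta, tau).mean()
    correction = (subgradient(X1, y1, beta_pilot, tau) - global_grad) @ beta
    return local - correction
```

Because each machine communicates only a p-dimensional subgradient vector at the pilot estimate (rather than raw data), one round of minimizing this surrogate costs O(p) communication per machine; the non-smooth minimization itself is what the paper handles with ADMM.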

[1] Martin J. Wainwright, et al. Dual Averaging for Distributed Optimization: Convergence Analysis and Network Scaling, 2010, IEEE Transactions on Automatic Control.

[2] Qiang Liu, et al. Communication-efficient Sparse Regression, 2017, J. Mach. Learn. Res.

[3] Martin J. Wainwright, et al. Communication-efficient algorithms for statistical optimization, 2012, 51st IEEE Conference on Decision and Control (CDC).

[4] Stephen P. Boyd, et al. Distributed Optimization and Statistical Learning via the Alternating Direction Method of Multipliers, 2011, Found. Trends Mach. Learn.

[5] Lei Wang, et al. Communication-efficient estimation of high-dimensional quantile regression, 2020.

[6] Ping Ma, et al. A statistical perspective on algorithmic leveraging, 2013, J. Mach. Learn. Res.

[7] Runze Li, et al. Quantile Regression for Analyzing Heterogeneity in Ultra-High Dimension, 2012, Journal of the American Statistical Association.

[8] Paulo Cortez, et al. Modeling wine preferences by data mining from physicochemical properties, 2009, Decis. Support Syst.

[9] Keith Knight, et al. Limiting distributions for $L_1$ regression estimators under general conditions, 1998.

[10] Xi Chen, et al. Distributed High-dimensional Regression Under a Quantile Loss Function, 2019, J. Mach. Learn. Res.

[11] Yun Yang, et al. Communication-Efficient Distributed Statistical Inference, 2016, Journal of the American Statistical Association.

[12] Purnamrita Sarkar, et al. A scalable bootstrap for massive data, 2011, arXiv:1112.5016.

[13] Martin J. Wainwright, et al. Divide and conquer kernel ridge regression: a distributed algorithm with minimax optimal rates, 2013, J. Mach. Learn. Res.

[14] Chong Wang, et al. Asymptotically Exact, Embarrassingly Parallel MCMC, 2013, UAI.

[15] Michael W. Mahoney, et al. Quantile Regression for Large-Scale Applications, 2013, SIAM J. Sci. Comput.

[16] Ameet Talwalkar, et al. Divide-and-Conquer Matrix Factorization, 2011, NIPS.

[17] Minge Xie, et al. A Split-and-Conquer Approach for Analysis of Extraordinarily Large Data, 2014.

[18] R. Koenker, et al. Regression Quantiles, 2007.

[19] Gideon S. Mann, et al. Efficient Large-Scale Distributed Training of Conditional Maximum Entropy Models, 2009, NIPS.

[20] Tengyu Ma, et al. On Communication Cost of Distributed Statistical Estimation and Dimensionality, 2014, NIPS.

[21] R. Jennrich. Asymptotic Properties of Non-Linear Least Squares Estimators, 1969.

[22] Liqun Yu, et al. ADMM for Penalized Quantile Regression in Big Data, 2017.

[23] Qifa Xu, et al. Block average quantile regression for massive dataset, 2017, Statistical Papers.

[24] Michael I. Jordan. On statistics, computation and scalability, 2013, arXiv.

[25] B. Mercier, et al. A dual algorithm for the solution of nonlinear variational problems via finite element approximation, 1976.

[26] Xi Chen, et al. Quantile regression under memory constraint, 2018, The Annals of Statistics.

[27] Ohad Shamir, et al. Communication-Efficient Distributed Optimization using an Approximate Newton-type Method, 2013, ICML.

[28] Shiqian Ma, et al. ADMM for High-Dimensional Sparse Penalized Quantile Regression, 2018, Technometrics.