Information-theoretic lower bounds for distributed statistical estimation with communication constraints

We establish lower bounds on minimax risks for distributed statistical estimation under a communication budget. Such lower bounds reveal the minimum amount of communication required by any procedure to achieve the centralized minimax-optimal rates for statistical estimation. We study two classes of protocols: one in which machines send messages independently, and a second allowing for interactive communication. We establish lower bounds for several problems, including various types of location models, as well as for parameter estimation in regression models.
