Adaptive Communication Bounds for Distributed Online Learning

We consider distributed online learning protocols that control the exchange of information between local learners in a round-based learning scenario. The learning performance of such a protocol is intuitively optimal if approximately the same loss is incurred as in a hypothetical serial setting. If a protocol accomplishes this, it is inherently impossible to achieve a strong communication bound at the same time. In the worst case, every input is essential for the learning performance, even for the serial setting, and thus needs to be exchanged between the local learners. However, it is reasonable to demand a bound that scales well with the hardness of the serialized prediction problem, as measured by the loss received by a serial online learning algorithm. We provide formal criteria based on this intuition and show that they hold for a simplified version of a previously published protocol.

[1]  Assaf Schuster,et al.  Prediction-based geometric monitoring over distributed data streams , 2012, SIGMOD Conference.

[2]  Gideon S. Mann,et al.  Efficient Large-Scale Distributed Training of Conditional Maximum Entropy Models , 2009, NIPS.

[3]  Assaf Schuster,et al.  Shape Sensitive Geometric Monitoring , 2008, IEEE Transactions on Knowledge and Data Engineering.

[4]  Ohad Shamir,et al.  Optimal Distributed Online Prediction Using Mini-Batches , 2010, J. Mach. Learn. Res..

[5]  Assaf Schuster,et al.  A geometric approach to monitoring threshold functions over distributed data streams , 2006, Ubiquitous Knowledge Discovery.

[6]  Assaf Schuster,et al.  Communication-Efficient Distributed Variance Monitoring and Outlier Detection for Multivariate Time Series , 2014, 2014 IEEE 28th International Parallel and Distributed Processing Symposium.

[7]  Assaf Schuster,et al.  Communication-Efficient Distributed Online Prediction by Dynamic Model Synchronization , 2014, ECML/PKDD.

[8]  Gábor Lugosi,et al.  Prediction, learning, and games , 2006 .

[9]  Koby Crammer,et al.  Online Passive-Aggressive Algorithms , 2003, J. Mach. Learn. Res..

[10]  Assaf Schuster,et al.  A Geometric Approach to Monitoring Threshold Functions over Distributed Data Streams , 2010, Ubiquitous Knowledge Discovery.

[11]  SchusterAssaf,et al.  A geometric approach to monitoring threshold functions over distributed data streams , 2007 .

[12]  Alexander J. Smola,et al.  Parallelized Stochastic Gradient Descent , 2010, NIPS.