Network-Aware Optimization of Distributed Learning for Fog Computing

Fog computing promises to let machine learning tasks scale to large amounts of data by distributing processing across connected devices. Two key challenges to achieving this are (i) heterogeneity in devices’ compute resources and (ii) topology constraints on which devices can communicate. We are the first to address these challenges, developing a network-aware distributed learning optimization methodology in which devices process data for a task locally and send their learned parameters to a server for aggregation at certain time intervals. Unlike traditional federated learning frameworks, our method enables devices to offload their data processing tasks, with these decisions determined through a convex data transfer optimization problem that trades off the costs of devices processing, offloading, and discarding data points. We analytically characterize the optimal data transfer solution for different fog network topologies, showing, for example, that the value of a device offloading is approximately linear in the range of computing costs in the network. Our subsequent experiments on both synthetic datasets and real-world datasets that we collect confirm that our algorithms substantially improve network resource utilization without sacrificing the accuracy of the learned model.
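To make the trade-off among processing, offloading, and discarding concrete, the following is a minimal illustrative sketch, not the formulation used in the paper. It assumes purely linear per-data-point costs and no capacity limits, in which case each device simply picks its cheapest option. The function name `cheapest_action`, the cost dictionaries, and all numeric values are hypothetical.

```python
from typing import Dict, Tuple

Link = Tuple[str, str]  # (source device, destination device)


def cheapest_action(
    device: str,
    compute_cost: Dict[str, float],
    link_cost: Dict[Link, float],
    discard_cost: float,
) -> Tuple[str, float]:
    """Return the (action, per-point cost) pair with the lowest cost.

    Assumption: costs are linear and devices have unlimited capacity,
    so the choice for one data point applies to all of a device's data.
    """
    # A device can always process locally or discard the data point.
    options = [
        ("process", compute_cost[device]),
        ("discard", discard_cost),
    ]
    # Offloading is only possible along links that exist in the topology.
    for (src, dst), cost in link_cost.items():
        if src == device:
            options.append((f"offload->{dst}", cost + compute_cost[dst]))
    return min(options, key=lambda option: option[1])


# Hypothetical three-device network: "a" is slow, "b" is fast.
compute_cost = {"a": 5.0, "b": 1.0, "c": 2.0}
link_cost = {("a", "b"): 1.0, ("c", "b"): 3.0}
discard_cost = 4.0

decisions = {
    name: cheapest_action(name, compute_cost, link_cost, discard_cost)
    for name in compute_cost
}
```

In this toy setting, device "a" offloads to "b" because the link cost plus the compute cost at "b" is below both its own compute cost and the discard penalty, and the saving from offloading grows with the gap between the two devices' compute costs. A realistic model would add capacity constraints and a convex error term, which this sketch omits.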
