Toward Communication-Efficient Federated Learning in the Internet of Things With Edge Computing
Qi Qi | Jingyu Wang | F. Richard Yu | Shiqi Li | Haifeng Sun | Jianxin Liao
[1] Takuya Akiba, et al. Extremely Large Minibatch SGD: Training ResNet-50 on ImageNet in 15 Minutes, 2017, ArXiv.
[2] Alex Krizhevsky, et al. One weird trick for parallelizing convolutional neural networks, 2014, ArXiv.
[3] Xiaojuan Qi, et al. ICNet for Real-Time Semantic Segmentation on High-Resolution Images, 2017, ECCV.
[4] Kin K. Leung, et al. Adaptive Federated Learning in Resource Constrained Edge Computing Systems, 2018, IEEE Journal on Selected Areas in Communications.
[5] Dan Alistarh, et al. The Convergence of Sparsified Gradient Methods, 2018, NeurIPS.
[6] Choong Seon Hong, et al. Energy Efficient Federated Learning Over Wireless Communication Networks, 2019, IEEE Transactions on Wireless Communications.
[7] Wei Xu, et al. Energy Efficient Resource Allocation in Machine-to-Machine Communications With Multiple Access and Energy Harvesting for IoT, 2017, IEEE Internet of Things Journal.
[8] Iasonas Kokkinos, et al. DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs, 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[9] Deniz Gündüz, et al. Machine Learning at the Wireless Edge: Distributed Stochastic Gradient Descent Over-the-Air, 2019, IEEE International Symposium on Information Theory (ISIT).
[10] Cong Xu, et al. TernGrad: Ternary Gradients to Reduce Communication in Distributed Deep Learning, 2017, NIPS.
[11] Geoffrey E. Hinton, et al. ImageNet classification with deep convolutional neural networks, 2012, Commun. ACM.
[12] Takayuki Nishio, et al. Client Selection for Federated Learning with Heterogeneous Resources in Mobile Edge, 2018, IEEE International Conference on Communications (ICC).
[13] Kenneth Heafield, et al. Sparse Communication for Distributed Gradient Descent, 2017, EMNLP.
[14] Jiawei Jiang, et al. Heterogeneity-aware Distributed Parameter Servers, 2017, SIGMOD Conference.
[15] Walid Saad, et al. A Joint Learning and Communications Framework for Federated Learning Over Wireless Networks, 2021, IEEE Transactions on Wireless Communications.
[16] Ji Liu, et al. Gradient Sparsification for Communication-Efficient Distributed Optimization, 2017, NeurIPS.
[17] Ursula Challita, et al. Artificial Neural Networks-Based Machine Learning for Wireless Networks: A Tutorial, 2017, IEEE Communications Surveys & Tutorials.
[18] Jorge Nocedal, et al. On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima, 2016, ICLR.
[19] Albert Y. Zomaya, et al. Federated Learning over Wireless Networks: Optimization Model Design and Analysis, 2019, IEEE INFOCOM - IEEE Conference on Computer Communications.
[20] James Demmel, et al. Fast Deep Neural Network Training on Distributed Systems and Cloud TPUs, 2019, IEEE Transactions on Parallel and Distributed Systems.
[21] Sergey Ioffe, et al. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift, 2015, ICML.
[22] Xiaowei Zhou, et al. MonoCap: Monocular Human Motion Capture using a CNN Coupled with a Geometric Prior, 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[23] Stephen J. Wright, et al. Hogwild: A Lock-Free Approach to Parallelizing Stochastic Gradient Descent, 2011, NIPS.
[24] Kaiming He, et al. Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour, 2017, ArXiv.
[25] Kaibin Huang, et al. Broadband Analog Aggregation for Low-Latency Federated Edge Learning, 2018, IEEE Transactions on Wireless Communications.
[26] Yuanzhou Yang, et al. Highly Scalable Deep Learning Training System with Mixed-Precision: Training ImageNet in Four Minutes, 2018, ArXiv.
[27] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[28] James Demmel, et al. Reducing BERT Pre-Training Time from 3 Days to 76 Minutes, 2019, ArXiv.
[29] Xu Chen, et al. In-Edge AI: Intelligentizing Mobile Edge Computing, Caching and Communication by Federated Learning, 2018, IEEE Network.
[30] Dan Alistarh, et al. QSGD: Communication-Optimal Stochastic Gradient Descent, with Applications to Training Neural Networks, 2016, ArXiv.
[31] Sam Ade Jacobs, et al. Communication Quantization for Data-Parallel Training of Deep Neural Networks, 2016, 2nd Workshop on Machine Learning in HPC Environments (MLHPC).
[32] Chunpeng Wu, et al. SmoothOut: Smoothing Out Sharp Minima to Improve Generalization in Deep Learning, 2018, ArXiv.
[33] Yijun Huang, et al. Asynchronous Parallel Stochastic Gradient for Nonconvex Optimization, 2015, NIPS.
[34] Bolei Zhou, et al. Semantic Understanding of Scenes Through the ADE20K Dataset, 2016, International Journal of Computer Vision.
[35] Nikko Strom, et al. Scalable distributed DNN training using commodity GPU cloud computing, 2015, INTERSPEECH.
[36] Zhi Ding, et al. Federated Learning via Over-the-Air Computation, 2018, IEEE Transactions on Wireless Communications.
[37] Alex Graves, et al. Asynchronous Methods for Deep Reinforcement Learning, 2016, ICML.
[38] Wei Wang, et al. CMFL: Mitigating Communication Overhead for Federated Learning, 2019, IEEE 39th International Conference on Distributed Computing Systems (ICDCS).
[39] Seunghak Lee, et al. More Effective Distributed ML via a Stale Synchronous Parallel Parameter Server, 2013, NIPS.
[40] Maja Pantic, et al. T-Net: Parametrizing Fully Convolutional Nets With a Single High-Order Tensor, 2019, IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[41] Bo Li, et al. Round-Robin Synchronization: Mitigating Communication Bottlenecks in Parameter Servers, 2019, IEEE INFOCOM - IEEE Conference on Computer Communications.
[42] Tieniu Tan, et al. A Light CNN for Deep Face Representation With Noisy Labels, 2015, IEEE Transactions on Information Forensics and Security.
[43] Ioannis Mitliagkas, et al. Asynchrony begets momentum, with an application to deep learning, 2016, 54th Annual Allerton Conference on Communication, Control, and Computing (Allerton).
[44] Sanjiv Kumar, et al. On the Convergence of Adam and Beyond, 2018, ICLR.
[45] James Demmel, et al. ImageNet Training in Minutes, 2017, ICPP.
[46] Kaibin Huang, et al. Towards an Intelligent Edge: Wireless Communication Meets Machine Learning, 2018, ArXiv.
[47] Wei Zhang, et al. AdaComp: Adaptive Residual Gradient Compression for Data-Parallel Distributed Training, 2017, AAAI.
[48] Xu Sun, et al. Adaptive Gradient Methods with Dynamic Bound of Learning Rate, 2019, ICLR.
[49] William J. Dally, et al. Deep Gradient Compression: Reducing the Communication Bandwidth for Distributed Training, 2017, ICLR.
[50] H. Vincent Poor, et al. Convergence Time Optimization for Federated Learning Over Wireless Networks, 2020, IEEE Transactions on Wireless Communications.
[51] Ming-Wei Chang, et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, 2019, NAACL.
[52] Dong Yu, et al. 1-bit stochastic gradient descent and its application to data-parallel distributed training of speech DNNs, 2014, INTERSPEECH.
[53] Alexander J. Smola, et al. Communication Efficient Distributed Machine Learning with the Parameter Server, 2014, NIPS.
[54] Ning Qian, et al. On the momentum term in gradient descent learning algorithms, 1999, Neural Networks.
[55] Xin Pan, et al. A hybrid MLP-CNN classifier for very fine resolution remotely sensed image classification, 2017, ISPRS Journal of Photogrammetry and Remote Sensing.