A first look into the carbon footprint of federated learning

Despite impressive results, deep learning-based technologies raise serious privacy and environmental concerns, both stemming from training procedures that are often conducted in data centers. In response, alternatives to centralized training such as Federated Learning (FL) have emerged. Perhaps unexpectedly, FL in particular is starting to be deployed at a global scale by companies that must comply with new legal requirements and privacy-protection policies originating from governments and civil society. However, the potential environmental impact of FL remains unclear and unexplored. This paper offers the first systematic study of the carbon footprint of FL. First, we propose a rigorous model to quantify the carbon footprint, which facilitates investigating the relationship between FL design and carbon emissions. Then, we compare the carbon footprint of FL to that of traditional centralized learning. We also formalize an early-stage FL optimization problem that enables the community to consider optimizing the rate of CO2 emissions jointly with the accuracy of neural networks. Finally, we highlight and connect the reported results to future challenges and trends in FL for reducing its environmental impact, including algorithmic efficiency, hardware capabilities, and stronger industry transparency.
