Exploiting Unlabeled Data in Smart Cities using Federated Learning

Privacy is one of the main challenges in smart cities, because sharing sensitive data can expose residents to serious harm. Federated learning has emerged as an effective technique that avoids privacy infringement while still making use of the data. However, labeled data collected in smart cities are scarce while unlabeled data are abundant, which calls for semi-supervised learning. We propose FedSem, a semi-supervised federated learning method that exploits unlabeled data. The algorithm has two phases. In the first phase, a global model is trained on the labeled data only. In the second phase, the model is improved with semi-supervised learning based on pseudo-labeling. Experiments on a traffic sign dataset show that FedSem can improve accuracy by up to 8% by using the unlabeled data in the learning process.
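The two phases described above can be illustrated with a minimal sketch. It is not the authors' implementation: a toy nearest-centroid classifier stands in for the model actually trained, a single round of sample-weighted averaging stands in for the federated training procedure, and all names (`local_train`, `federated_average`, `predict`, `fedsem_sketch`) are hypothetical. Only the overall structure follows the abstract: train a global model on labeled data, then pseudo-label each client's unlabeled data and train again.

```python
from typing import Dict, List, Tuple

Vector = List[float]
Sample = Tuple[Vector, int]
# Toy stand-in for network weights: class label -> centroid (an assumption).
Model = Dict[int, Vector]
Update = Tuple[Model, Dict[int, int]]


def local_train(samples: List[Sample]) -> Update:
    """Client step: fit per-class centroids on (features, label) pairs."""
    sums: Dict[int, Vector] = {}
    counts: Dict[int, int] = {}
    for x, y in samples:
        if y not in sums:
            sums[y] = [0.0] * len(x)
            counts[y] = 0
        sums[y] = [s + v for s, v in zip(sums[y], x)]
        counts[y] += 1
    model = {y: [s / counts[y] for s in sums[y]] for y in sums}
    return model, counts


def federated_average(updates: List[Update]) -> Model:
    """Server step: average client models, weighted by local sample counts."""
    sums: Dict[int, Vector] = {}
    totals: Dict[int, int] = {}
    for model, counts in updates:
        for y, centroid in model.items():
            n = counts[y]
            if y not in sums:
                sums[y] = [0.0] * len(centroid)
                totals[y] = 0
            sums[y] = [s + n * c for s, c in zip(sums[y], centroid)]
            totals[y] += n
    return {y: [s / totals[y] for s in sums[y]] for y in sums}


def predict(model: Model, x: Vector) -> int:
    """Return the class whose centroid is closest to x (squared Euclidean)."""
    return min(model, key=lambda y: sum((a - b) ** 2 for a, b in zip(model[y], x)))


def fedsem_sketch(
    labeled: List[List[Sample]], unlabeled: List[List[Vector]]
) -> Tuple[Model, Model]:
    """Run both phases once; one list entry per client. Raw data stays local."""
    # Phase 1: the global model is built from labeled data only.
    phase1 = federated_average([local_train(data) for data in labeled])

    # Phase 2: each client pseudo-labels its unlabeled data with the global
    # model, retrains on labeled + pseudo-labeled data, and the server
    # averages the resulting client models again.
    updates = []
    for lab, unlab in zip(labeled, unlabeled):
        pseudo = [(x, predict(phase1, x)) for x in unlab]
        updates.append(local_train(lab + pseudo))
    phase2 = federated_average(updates)
    return phase1, phase2
```

A call takes one list of labeled samples and one list of unlabeled feature vectors per client, for example `fedsem_sketch([[([0.0, 0.0], 0), ([4.0, 4.0], 1)]], [[[0.5, 0.0]]])`. Only model parameters and sample counts leave a client in this sketch, which mirrors the privacy motivation of federated learning. A practical variant would likely run many communication rounds and might keep only confident pseudo-labels, but neither detail is specified in the abstract.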
