Semisupervised Deep Reinforcement Learning in Support of IoT and Smart City Services

Smart services are an important element of smart city and Internet of Things (IoT) ecosystems, where the intelligence behind the services is obtained and improved through sensory data. Providing a large amount of labeled training data is not always feasible; therefore, we need to consider alternative approaches that also incorporate unlabeled data. In recent years, deep reinforcement learning (DRL) has achieved great success in several application domains. It is well suited to IoT and smart city scenarios, where auto-generated data can be partially labeled through users' feedback for training purposes. In this paper, we propose a semisupervised DRL model that fits smart city applications, as it consumes both labeled and unlabeled data to improve the performance and accuracy of the learning agent. The model uses variational autoencoders as the inference engine for generalizing optimal policies. To the best of our knowledge, the proposed model is the first to extend DRL to the semisupervised paradigm. As a case study of smart city applications, we focus on smart buildings and apply the proposed model to the problem of indoor localization based on Bluetooth Low Energy signal strength. Indoor localization is a main component of smart city services, since people spend significant time in indoor environments. Our model learns the best action policies, which lead to a close estimation of the target locations, with an improvement of 23% in terms of distance to the target and at least 67% more received rewards compared to the supervised DRL model.
