Train++: An Incremental ML Model Training Algorithm to Create Self-Learning IoT Devices

The majority of Internet of Things (IoT) devices are tiny embedded systems with a microcontroller unit (MCU) as their brain. The memory (SRAM, Flash, and EEPROM) of such MCU-based devices is often very limited, which restricts onboard Machine Learning (ML) model training on large training sets with high feature dimensions. To cope with these memory constraints, current edge analytics approaches train high-quality ML models on cloud GPUs using large volumes of historical data, then deploy deeply optimized versions of the resulting models on edge devices for inference. Such approaches are inefficient under concept drift, where the data generated at the device level vary frequently and the trained models have no way of handling previously unseen data. In this paper, we present Train++, an incremental training algorithm that trains ML models locally at the device level (e.g., on MCUs and small CPUs) using the full n-samples of high-dimensional data. Train++ transforms even the most resource-constrained MCU-based IoT edge devices into intelligent devices that can build their own knowledge base locally and on the fly from live data, thus creating smart, self-learning, and autonomous problem-solving devices. The Train++ algorithm is extensively evaluated on 5 popular MCU boards, using 7 datasets of varying sizes and feature dimensions. A few notable findings from the evaluation results are: (i) the proposed method reduces the onboard binary classifier training time by ≈ 10 to 226 sec across various commodity MCUs; (ii) Train++ can perform inference on MCUs for the entire test set in real time (1 ms); (iii) the accuracy improved by 5.15 to 7.3%, since the incremental characteristic of Train++ enabled the loading of the full n-samples of the high-dimensional datasets even on MCUs with only a few hundred kB of memory.
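The abstract does not spell out the update rule, so the following is only a minimal sketch of the general idea of incremental training: a linear binary classifier is updated one sample at a time, so memory use depends on the feature dimension and not on the number of samples. It assumes a standard passive-aggressive (PA-I) update as a stand-in and should not be read as the Train++ algorithm itself. The names `pa_update`, `train_stream`, `predict`, and the parameter `C` are hypothetical and chosen for illustration.

```python
def pa_update(w, b, x, y, C=1.0):
    """One passive-aggressive (PA-I) step; y must be -1 or +1.

    Assumption: this generic online rule stands in for the paper's update.
    """
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    loss = max(0.0, 1.0 - y * score)  # hinge loss on this single sample
    if loss > 0.0:
        # The bias is treated as a weight on a constant feature of 1,
        # hence the extra +1.0 in the squared norm.
        norm_sq = sum(xi * xi for xi in x) + 1.0
        tau = min(C, loss / norm_sq)  # step size, capped by C
        w = [wi + tau * y * xi for wi, xi in zip(w, x)]
        b = b + tau * y
    return w, b


def train_stream(samples, n_features, C=1.0):
    """Consume (x, y) pairs one by one; only w and b are kept in memory."""
    w = [0.0] * n_features
    b = 0.0
    for x, y in samples:
        w, b = pa_update(w, b, x, y, C)
    return w, b


def predict(w, b, x):
    """Return the predicted label, -1 or +1."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if score >= 0.0 else -1


# Hypothetical toy stream: each sample is seen once and then discarded.
stream = [
    ([2.0, 0.0], 1),
    ([-2.0, 0.0], -1),
    ([1.5, 0.5], 1),
    ([-1.0, -0.5], -1),
]
w, b = train_stream(stream, n_features=2)
labels = [predict(w, b, x) for x, _ in stream]
```

Because each sample is discarded after its update, the same loop could in principle run over a live sensor stream; on an actual MCU the equivalent logic would typically be written in C with fixed-size arrays.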
