Poster Abstract: DeepRT: A Predictable Deep Learning Inference Framework for IoT Devices

Deep learning has recently emerged as a state-of-the-art approach for delivering robust and highly accurate inference in many domains, including the Internet of Things (IoT). It is already changing the way computers embedded in IoT devices make intelligent decisions from sensor feeds in the real world. Significant effort has gone into developing lightweight, highly efficient deep learning inference mechanisms for resource-constrained mobile and IoT devices: some approaches propose hardware-based accelerators, while others reduce the computational cost of deep learning models through various model compression techniques. Although these efforts have demonstrated significant gains in performance and efficiency, they are unaware of the Quality-of-Service (QoS) requirements of the various IoT applications they serve, and hence exhibit unpredictable, best-effort performance in terms of inference latency, power consumption, resource usage, and so on. On IoT devices with temporal constraints, such unpredictability can have undesirable effects, such as compromising safety. In this work, we present DeepRT, a novel deep learning inference runtime. Unlike previous inference accelerators, DeepRT focuses on supporting predictable inference performance, both temporally and spatially.