Decision making in many industries is being drastically improved by artificial intelligence and deep learning. New algorithms address challenges such as genome mapping, medical diagnostics, self-driving cars, and autonomous robots. Deploying deep learning on embedded systems, however, requires heavy optimization: computational demand is high, while power, heat dissipation, size, and cost are tightly constrained. In this paper we analyze several acceleration methods, including the use of GPUs, for the most demanding variants of deep learning, such as semantic video segmentation operating in real time. Specifically, we propose a mapping of acceleration routines commonly provided by deep learning SDKs onto the different network layers used in semantic segmentation. Finally, we evaluate an implementation that applies these techniques to semantic segmentation of the front-camera view in autonomous driving.
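To make the idea of "mapping SDK acceleration routines onto network layers" concrete, one routine that inference SDKs (e.g., NVIDIA TensorRT) commonly apply to convolutional layers is convolution-BatchNorm folding, which merges a BatchNorm layer into the weights and bias of the preceding convolution so that only one layer is executed at inference time. The sketch below is our own illustration of the folding arithmetic in NumPy, not code from the paper:

```python
import numpy as np

def fold_bn_into_conv(W, b, gamma, beta, mean, var, eps=1e-5):
    """Fold a BatchNorm layer into the preceding convolution.

    W     -- conv weights, shape (out_ch, in_ch, kh, kw)
    b     -- conv bias, shape (out_ch,)
    gamma, beta, mean, var -- BatchNorm parameters, shape (out_ch,)

    Returns (W_f, b_f) such that conv(x, W_f, b_f) equals
    bn(conv(x, W, b)) for every input x.
    """
    scale = gamma / np.sqrt(var + eps)      # per-output-channel scale
    W_f = W * scale[:, None, None, None]    # scale each output filter
    b_f = (b - mean) * scale + beta         # fold shift into the bias
    return W_f, b_f
```

Applied ahead of deployment, this removes one memory-bound layer per conv-BN pair; similar fusions exist for conv+ReLU and elementwise additions, which is why mapping them onto the layers actually present in a segmentation network matters for real-time throughput.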