Optimizing Deep Learning Based Semantic Video Segmentation on Embedded GPUs

Decision making in many industries today is being improved drastically thanks to artificial intelligence and deep learning. New algorithms address challenges such as genome mapping, medical diagnostics, self-driving cars, autonomous robots and more. Deep learning in embedded systems requires high optimization due to the high computational demand, given that power, heat dissipation, size and price constraints are numerous. In this paper we analyze several acceleration methods which include utilization of GPUs for most complex variants of deep learning, such as semantic video segmentation operating in real time. Specifically, we propose mapping of acceleration routines commonly present within deep learning SDKs to different network layers in semantic segmentation. Finally, we evaluate one implementation utilizing the enumerated techniques for semantic segmentation of front camera in autonomous driving front view.