Towards an Automatic Prediction of Image Processing Algorithms Performances on Embedded Heterogeneous Architectures

Image processing algorithms are widely used in the automotive field for Advanced Driver Assistance Systems (ADAS). To embed these algorithms, semiconductor companies offer heterogeneous architectures composed of different processing units, often including massively parallel computing units. However, embedding complex algorithms on these SoCs (Systems on Chip) remains a difficult task: because of their heterogeneity, it is not easy to decide how to allocate the parts of a given algorithm to the processing units of a given SoC. To help the automotive industry embed algorithms on heterogeneous architectures, we propose a novel approach to predict the performance of image processing algorithms on the different computing units of a given heterogeneous SoC. Our methodology predicts an execution-time interval, which may be wider or narrower, together with a degree of confidence, using only a high-level description of the algorithms to embed and a few characteristics of the computing units.
