libEOMP: a portable OpenMP runtime library based on MCA APIs for embedded systems

In recent years, the rapid revolution of Multiprocessor System-on-Chip (MPSoC) architectures has posed new challenges for programming them efficiently. To exploit the available hardware concurrency, software developers are still expected to handle many low-level programming details, including utilizing DMA, ensuring cache coherency, and inserting synchronization primitives explicitly. Software portability is another issue: hardware vendors currently supply vendor-specific software development toolchains, which makes it hard to port applications across different architectures without restructuring the code while still preserving efficiency. In this paper, we extend the use of a high-level programming model, OpenMP, to multicore embedded systems. To address the architectural challenges, we propose a lightweight, unified OpenMP runtime library, libEOMP, which uses the Multicore Association (MCA) APIs as the target of our OpenMP translation. The MCA APIs support device-level communication and resource management for multicore embedded systems. We have implemented and evaluated libEOMP on an embedded platform supplied by Freescale Semiconductor. We observed that libEOMP not only performed as well as optimized vendor-specific OpenMP runtime libraries but also achieved better portability, programmability, and productivity.
