Implementing OpenMP on a high performance embedded multicore MPSoC

In this paper we discuss our initial experiences adapting OpenMP to serve as a programming model for high-performance embedded systems. A high-level programming model such as OpenMP has the potential to increase programmer productivity, reducing design and development costs and time to market for such systems. However, OpenMP must be extended if it is to meet the needs of embedded application developers, who need to express multiple levels of parallelism as well as real-time and resource constraints, and to supply additional information in support of optimization. It must also be capable of supporting the mapping of different software tasks, or components, to the devices configured in a given architecture.
