Parallel Processing of Sequential Media Algorithms on Heterogeneous Multi-Processor System-on-Chip

Heterogeneous Multi-Processor Systems-on-Chip (MPSoCs) and media processing are widely used in mobile electronic commerce. Heterogeneous MPSoCs offer more opportunities to accelerate sequential media algorithms through parallelization. However, research on parallelizing applications for heterogeneous MPSoCs lags far behind the development of MPSoC hardware platforms. Exploiting the parallelization opportunities of MPSoCs to improve the performance and efficiency of media applications has therefore become one of the most active research topics in embedded systems. This paper proposes a new approach that parallelizes sequential media algorithms on heterogeneous MPSoCs using program transformation and application-to-architecture mapping techniques. Data locality and communication cost are optimized during parallelization. Moreover, the differences between processing elements, as captured in architecture templates, are exploited to achieve “the maximum” performance and efficiency of heterogeneous MPSoCs. Finally, an experiment shows that the proposed approach achieves speedup comparable to or better than manual parallelization by experienced designers.
