Heterogeneous multiprocessor implementations for JPEG: a case study

Heterogeneous multiprocessor SoCs are becoming a reality, largely due to the abundance of transistors, intellectual property cores, and powerful design tools. In this project, we explore the use of multiple cores to speed up the JPEG compression algorithm. We show two methods to parallelize this algorithm: a master-slave model and a pipeline model. The systems were implemented using Tensilica's Xtensa LX processors with queues. We show that even with this relatively simple application, parallelization can be carried out with up to nine processors, with utilization between 50% and 80%. We obtained speedups of up to 4.6X with a seven-core system, at an area increase of 3.1X.
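To put the headline figures in context, the short sketch below derives two common figures of merit from the numbers reported above: parallel efficiency (speedup divided by core count) and speedup per unit of area. The function names are illustrative and not taken from the paper, and parallel efficiency is not necessarily the same quantity as the per-processor utilization the abstract reports.

```python
# Illustrative arithmetic on the figures quoted in the abstract.
# Function names are hypothetical; they do not come from the paper.

def parallel_efficiency(speedup: float, cores: int) -> float:
    """Speedup achieved per core, relative to ideal linear scaling."""
    return speedup / cores


def speedup_per_area(speedup: float, area_ratio: float) -> float:
    """Speedup gained per unit of relative silicon area."""
    return speedup / area_ratio


# Values stated in the abstract: 4.6X speedup, seven cores, 3.1X area.
speedup, cores, area_ratio = 4.6, 7, 3.1

eff = parallel_efficiency(speedup, cores)
spa = speedup_per_area(speedup, area_ratio)

print(f"parallel efficiency: {eff:.2f}")      # prints 0.66
print(f"speedup per unit area: {spa:.2f}")    # prints 1.48
```

Under these assumptions, the seven-core system reaches roughly two thirds of ideal linear scaling, and performance grows about 1.5 times faster than area.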
