A low-power multi-core media co-processor for mobile application processors

A multi-core co-processor for mobile application processors is introduced. It provides low-power, high-throughput, fully software-based acceleration of multimedia processing. The test chip, fabricated in a 65 nm CMOS technology, consumes 620 mW when decoding H.264 720p at 60 fps and 9.7 mW when decoding MPEG-4 AAC. At the maximum workload, H.264 decoding, symmetrical parallelization achieves a 7.5× performance improvement with 8 cores, and the shared L2 cache reduces the required main-memory access rate to 310 MB/s. At the minimum workload, AAC decoding, three low-power circuit techniques reduce leakage by 98%. On-chip regulators, which also serve as power-gating switches, lower the supply voltage of the processing cores. An embedded forward body-biasing circuit reduces Vt variations. A low-power, fast data-mapping flip-flop relaxes the timing constraint, which makes it possible to reduce the number of low-Vt transistors.
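To put the reported 7.5× improvement on 8 cores in context, the short sketch below computes the corresponding parallel efficiency and the parallel fraction implied by Amdahl's law. Using Amdahl's law here is an assumption made for illustration, not an analysis taken from the paper, and the function names are hypothetical.

```python
def parallel_efficiency(speedup, cores):
    """Fraction of ideal linear scaling actually achieved."""
    return speedup / cores


def amdahl_parallel_fraction(speedup, cores):
    """Solve Amdahl's law, S = 1 / ((1 - p) + p / N), for the parallel fraction p.

    Assumes the workload fits Amdahl's simple model (illustrative only).
    """
    return (1.0 - 1.0 / speedup) * cores / (cores - 1)


# Figures quoted in the abstract: 7.5x improvement with 8 cores.
eff = parallel_efficiency(7.5, 8)
p = amdahl_parallel_fraction(7.5, 8)

print(f"parallel efficiency: {eff:.4f}")        # 0.9375
print(f"implied parallel fraction: {p:.4f}")    # 0.9905
```

Under this simplified model, the reported figure corresponds to about 94% parallel efficiency, with roughly 1% of the H.264 decoding work behaving as serial.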
