Key features of the design methodology enabling a multi-core SoC implementation of a first-generation CELL processor

This paper reviews the design challenges that current and future processors must face, with stringent power limits and high frequency targets, and the design methods required to overcome the above challenges and address the continuing Giga-scale system integration trend. This paper then describes the details behind the design methodology that was used to successfully implement a first-generation CELL processor - a multi-core SoC. Key features of this methodology are broad optimization with fast rule-based analysis engines using macro-level abstraction for constraints propagation up/down the design hierarchy, coupled with accurate transistor level simulation for detailed analysis. The methodology fostered the modular design concept that is inherent to the CELL architecture, enabling a high frequency design by maximizing custom circuit content through re-use, and balanced power, frequency, and die size targets through global convergence capabilities. The design has roughly 241 million transistors implemented in 90 nm SOI technology with 8 levels of copper interconnects and one local interconnect layer. The chip has been tested at various temperatures, voltages, and frequencies. Correct operation has been observed in the lab on first pass silicon at frequencies well over 4GHz.

[1]  Lawrence T. Pileggi,et al.  RICE: rapid interconnect circuit evaluation using AWE , 1994, IEEE Trans. Comput. Aided Des. Integr. Circuits Syst..

[2]  Ching-Te Chuang,et al.  Circuit design techniques for the high-performance CMOS IBM S/390 Parallel Enterprise Server G4 microprocessor , 1997, IBM J. Res. Dev..

[3]  M. Horowitz,et al.  Clocking and circuit design for a parallel I/O on a first-generation CELL processor , 2005, ISSCC. 2005 IEEE International Digest of Technical Papers. Solid-State Circuits Conference, 2005..

[4]  Karsten P. Ulland,et al.  Vii. References , 2022 .

[5]  B. Flachs,et al.  A streaming processing unit for a CELL processor , 2005, ISSCC. 2005 IEEE International Digest of Technical Papers. Solid-State Circuits Conference, 2005..

[6]  Masaru Ishizuka,et al.  Thermal Modeling with Transfer Function for the Transient Chip-On-Substrate Problem , 2005 .

[7]  Lawrence T. Pileggi,et al.  Asymptotic waveform evaluation for timing analysis , 1990, IEEE Trans. Comput. Aided Des. Integr. Circuits Syst..

[8]  Ashutosh Das,et al.  A new family of semidynamic and dynamic flip-flops with embedded logic for high-performance processors , 1999 .

[9]  Kevin J. Nowka,et al.  "Timing closure by design," a high frequency microprocessor design methodology , 2000, Proceedings 37th Design Automation Conference.

[10]  S. Asano,et al.  The design and implementation of a first-generation CELL processor , 2005, ISSCC. 2005 IEEE International Digest of Technical Papers. Solid-State Circuits Conference, 2005..

[11]  S.H. Dhong,et al.  A 4.8GHz fully pipelined embedded SRAM in the streaming processor of a CELL processor , 2005, ISSCC. 2005 IEEE International Digest of Technical Papers. Solid-State Circuits Conference, 2005..

[12]  Lawrence T. Pileggi,et al.  RICE: rapid interconnect circuit evaluator , 1991, 28th ACM/IEEE Design Automation Conference.

[13]  M. Suzuoki,et al.  Overview of the architecture, circuit design, and physical implementation of a first-generation cell processor , 2006, IEEE Journal of Solid-State Circuits.

[14]  K.A. Jenkins,et al.  The clock distribution of the Power4 microprocessor , 2002, 2002 IEEE International Solid-State Circuits Conference. Digest of Technical Papers (Cat. No.02CH37315).

[15]  K.A. Jenkins,et al.  A clock distribution network for microprocessors , 2000, 2000 Symposium on VLSI Circuits. Digest of Technical Papers (Cat. No.00CH37103).

[16]  Ronald A. Rohrer,et al.  Adaptively controlled explicit simulation , 1994, IEEE Trans. Comput. Aided Des. Integr. Circuits Syst..