The high-bandwidth 256 kB 2nd level cache on an Itanium microprocessor

This paper describes the 256 kB unified second-level cache incorporated into the next generation of the Itanium™ processor family, code-named McKinley. It presents the datapath structures that provide a non-blocking, out-of-order interface to the processor core, achieving a minimum 5-cycle latency with a standalone bandwidth of 72 GB/s.