NoΔ: Leveraging delta compression for end-to-end memory access in NoC based multicores

As the number of on-chip processing elements increases, the interconnection backbone bears bursty traffic from memory and cache accesses. In this paper, we propose a compression technique called NoΔ, which leverages delta compression to compress network traffic. Specifically, it conducts data encoding prior to packet injection and decoding before ejection in the network interface. The key idea of NoΔ is to store a data packet in the Network-on-Chip as a common base value plus an array of relative differences (Δ). It can improve the overall network performance and achieve energy savings because of the decreased network load. Moreover, this scheme does not require modifications of the cache storage design and can be seamlessly integrated with any optimization techniques for the on-chip interconnect. Our experiments reveal that the proposed NoΔ incurs negligible hardware overhead and outperforms state-of-the-art zero-content compression and frequent-value compression.

[1]  Ki Hwan Yum,et al.  Adaptive data compression for high-performance low-power on-chip networks , 2008, 2008 41st IEEE/ACM International Symposium on Microarchitecture.

[2]  Niraj K. Jha,et al.  GARNET: A detailed on-chip network model inside a full-system simulator , 2009, 2009 IEEE International Symposium on Performance Analysis of Systems and Software.

[3]  John L. Henning SPEC CPU2006 benchmark descriptions , 2006, CARN.

[4]  Onur Mutlu,et al.  Base-delta-immediate compression: Practical data compression for on-chip caches , 2012, 2012 21st International Conference on Parallel Architectures and Compilation Techniques (PACT).

[5]  Jun Yang,et al.  Frequent value compression in packet-based NoC architectures , 2009, 2009 Asia and South Pacific Design Automation Conference.

[6]  Milo M. K. Martin,et al.  Multifacet's general execution-driven multiprocessor simulator (GEMS) toolset , 2005, CARN.

[7]  Jun Yang,et al.  Frequent Value Locality and Value-Centric Data Cache Design , 2000, ASPLOS.

[8]  Chen Sun,et al.  DSENT - A Tool Connecting Emerging Photonics with Electronics for Opto-Electronic Networks-on-Chip Modeling , 2012, 2012 IEEE/ACM Sixth International Symposium on Networks-on-Chip.

[9]  Chita R. Das,et al.  A novel dimensionally-decomposed router for on-chip communication in 3D architectures , 2007, ISCA '07.

[10]  Jaehyuk Huh,et al.  A NUCA Substrate for Flexible CMP Cache Sharing , 2007, IEEE Transactions on Parallel and Distributed Systems.

[11]  W. Dally,et al.  Route packets, not wires: on-chip interconnection networks , 2001, Proceedings of the 38th Design Automation Conference (IEEE Cat. No.01CH37232).

[12]  Onur Mutlu,et al.  Express Cube Topologies for on-Chip Interconnects , 2009, 2009 IEEE 15th International Symposium on High Performance Computer Architecture.

[13]  Somayeh Sardashti,et al.  The gem5 simulator , 2011, CARN.

[14]  Chita R. Das,et al.  Performance and power optimization through data compression in Network-on-Chip architectures , 2008, 2008 IEEE 14th International Symposium on High Performance Computer Architecture.

[15]  Natalie D. Enright Jerger,et al.  Whole packet forwarding: Efficient design of fully adaptive routing algorithms for networks-on-chip , 2012, IEEE International Symposium on High-Performance Comp Architecture.

[16]  David A. Wood,et al.  Managing Wire Delay in Large Chip-Multiprocessor Caches , 2004, 37th International Symposium on Microarchitecture (MICRO-37'04).

[17]  André Seznec,et al.  Zero-content augmented caches , 2009, ICS '09.

[18]  David A. Wood,et al.  Adaptive cache compression for high-performance processors , 2004, Proceedings. 31st Annual International Symposium on Computer Architecture, 2004..

[19]  Jun Yang,et al.  Frequent value compression in data caches , 2000, Proceedings 33rd Annual IEEE/ACM International Symposium on Microarchitecture. MICRO-33 2000.