A Simple On-Chip Optical Interconnection for Improving Performance of Coherency Traffic in CMPs

Nanophotonic interconnection is a promising solution for inter-core communication in future chip multiprocessors (CMPs). Its main benefits derive from intrinsically low latency and high bandwidth, especially when employing wavelength division multiplexing (WDM), as well as from reduced power requirements compared to electronic NoCs. Existing work on optical NoCs (ONoCs) mainly concentrates on relatively complex proposals designed to carry the whole CMP traffic. In some proposals, complexity is further increased by the need for an electronic network that performs preliminary path setup in the optical one. This paper proposes to enhance a conventional NoC with only a simple photonic structure, a ring, and investigates its suitability for low-latency transmission of small, latency-critical coherency control messages, so as to improve the performance of multithreaded applications. In particular, the proposed scheme supports fast multicast transmission of invalidation messages. We have simulated PARSEC benchmarks on an 8-core full-system CMP. Results show that a careful selection of the coherency control messages forwarded to the photonic ring improves execution time by up to 19%, with an average of 6% across all considered benchmarks. We discuss how different selections of messages, i.e., those related to read and/or write operations, affect the results, and single out the most profitable set. Moreover, we show that the sharing behavior of the benchmarks plays a central role in the final performance.
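The idea of steering only selected coherency control messages to the ring can be illustrated with a minimal sketch. This is not the paper's implementation: the message classes, the `RING_ELIGIBLE` set, and the function names are hypothetical, and the assumption that the electronic NoC delivers a multicast as one unicast per destination is made only for the sake of the example.

```python
from dataclasses import dataclass
from enum import Enum, auto


class MsgType(Enum):
    # Hypothetical message classes; the abstract does not enumerate them.
    READ_REQUEST = auto()
    WRITE_REQUEST = auto()
    INVALIDATION = auto()
    INVALIDATION_ACK = auto()
    DATA_REPLY = auto()


@dataclass(frozen=True)
class Message:
    msg_type: MsgType
    src: int
    dests: frozenset  # destination core ids


# Assumed selection: only small invalidation-related control messages
# are forwarded to the photonic ring; everything else stays electronic.
RING_ELIGIBLE = frozenset({MsgType.INVALIDATION, MsgType.INVALIDATION_ACK})


def select_network(msg, ring_eligible=RING_ELIGIBLE):
    """Return 'ring' for selected control messages, else 'electronic'."""
    return "ring" if msg.msg_type in ring_eligible else "electronic"


def transmissions(msg, ring_eligible=RING_ELIGIBLE):
    """Count injections needed to reach all destinations.

    Assumption: the ring multicasts in a single transmission, while the
    electronic NoC sends one unicast per destination.
    """
    if select_network(msg, ring_eligible) == "ring":
        return 1
    return len(msg.dests)


inval = Message(MsgType.INVALIDATION, src=0, dests=frozenset({1, 3, 5}))
data = Message(MsgType.DATA_REPLY, src=2, dests=frozenset({0}))
print(select_network(inval), transmissions(inval))
print(select_network(data), transmissions(data))
```

Changing the contents of `RING_ELIGIBLE` (for example, adding read- or write-related requests) corresponds to trying a different selection of messages, which is the design dimension the abstract says is explored.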
