Rediscovering Logarithmic Diameter Topologies for Low Latency Network-on-Chip-Based Applications

Low-latency Network-on-Chip (NoC) applications place tight constraints on the number of clock cycles available for communication among nodes. This is a critical aspect of NoC-based designs, where the number of clock cycles spent on communication depends mainly on the topology and on the routing algorithm. This work deals with logarithmic diameter topologies, which were originally proposed for computer networks, and shows that an optimal shortest-path routing algorithm can be implemented efficiently on this kind of topology by means of a very simple circuit. The proposed circuit is then exploited to reduce the area and the power consumption of a recently proposed NoC-based design. Experimental results show that the proposed circuit reduces area by about 14% and power consumption by about 10% with respect to a shortest-path routing-table-based design.
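To illustrate why shortest-path routing on a logarithmic diameter topology can reduce to very simple logic, the sketch below routes on a binary de Bruijn digraph with 2^n nodes. This choice of topology is an assumption made for illustration only: the abstract does not name a specific graph, and the function name `de_bruijn_route` and its parameters are hypothetical, not the circuit proposed in this work. Each node is an n-bit label, and a hop shifts the label left by one bit and appends a new bit, so the shortest route follows from the longest overlap between the low bits of the source and the high bits of the destination.

```python
from typing import List


def de_bruijn_route(src: int, dst: int, n: int) -> List[int]:
    """Return the nodes on a shortest path from src to dst in the binary
    de Bruijn digraph with 2**n nodes (illustrative sketch only)."""
    mask = (1 << n) - 1

    # Longest k such that the low k bits of src equal the high k bits of dst.
    overlap = 0
    for k in range(n, 0, -1):
        if (src & ((1 << k) - 1)) == (dst >> (n - k)):
            overlap = k
            break

    # Shift in the remaining n - overlap bits of dst, most significant first.
    path = [src]
    node = src
    for i in range(n - overlap - 1, -1, -1):
        bit = (dst >> i) & 1
        node = ((node << 1) | bit) & mask
        path.append(node)
    return path
```

The path never exceeds n hops, which is the logarithmic diameter for 2^n nodes. In hardware, the overlap search corresponds to a handful of comparators over shifted labels, which is one plausible reason such routing can avoid a routing table; the actual circuit in this work may differ.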
