Exploiting generalized de-Bruijn/Kautz topologies for flexible iterative channel code decoder architectures

Modern iterative channel code decoder architectures have tight throughput constraints but require flexibility to support different modes and standards. Unfortunately, flexibility often comes at the expense of an increased number of clock cycles required to decode a data frame, which reduces the sustained throughput. The Network-on-Chip (NoC) paradigm is an interesting option to achieve flexibility, but several design choices, including the topology and the routing algorithm, can affect the decoder throughput. In this work, logarithmic-diameter topologies, in particular generalized de-Bruijn and Kautz topologies, are addressed as possible solutions to achieve both flexible and high-throughput architectures for iterative channel code decoding. Specifically, this work shows that the optimal shortest-path routing algorithm for these topologies, which is already available in the open literature, can be efficiently implemented with a very simple circuit. Experimental results show that the proposed architecture reduces area and power consumption by about 14% and 10%, respectively, with respect to a previous shortest-path routing-table-based design.
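To make the "logarithmic diameter" property concrete, the following minimal Python sketch builds both topologies and measures their diameter. It assumes the standard Imase-Itoh definitions: in a generalized de-Bruijn digraph with N nodes and degree d, node i connects to (d*i + r) mod N for r = 0..d-1; in a generalized Kautz digraph, node i connects to (-d*i - r) mod N for r = 1..d. All function names are hypothetical, and the breadth-first search is only a software stand-in for shortest-path computation; it is not the routing circuit proposed in this work.

```python
from collections import deque


def gdb_successors(i, d, n):
    """Successors of node i in a generalized de-Bruijn digraph (assumed Imase-Itoh form)."""
    return [(d * i + r) % n for r in range(d)]


def kautz_successors(i, d, n):
    """Successors of node i in a generalized Kautz digraph (assumed Imase-Itoh form)."""
    return [(-d * i - r) % n for r in range(1, d + 1)]


def log_bound(d, n):
    """Smallest k such that d**k >= n, i.e. ceil(log_d n) in integer arithmetic."""
    k, power = 0, 1
    while power < n:
        power *= d
        k += 1
    return k


def diameter(successors, d, n):
    """Longest shortest-path length over all node pairs, found by BFS from every node."""
    worst = 0
    for src in range(n):
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in successors(u, d, n):
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        if len(dist) < n:
            raise ValueError("graph is not strongly connected")
        worst = max(worst, max(dist.values()))
    return worst


if __name__ == "__main__":
    for n in (8, 12, 22):
        print(n, diameter(gdb_successors, 2, n),
              diameter(kautz_successors, 2, n), log_bound(2, n))
```

For d = 2 and N = 8 the first construction is the classical binary de-Bruijn graph, whose diameter is 3. For other node counts the measured diameter can be compared against `log_bound(d, n)`, the logarithmic upper bound known for both constructions.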
