Routing of asynchronous Clos networks

Clos networks provide a theoretically optimal solution for building high-radix switches. Dynamically reconfiguring a three-stage Clos network is more difficult in asynchronous circuits than in synchronous circuits. This study proposes a novel asynchronous dispatching (AD) algorithm for general three-stage Clos networks and compares it with the classic synchronous concurrent round-robin dispatching (CRRD) algorithm in unbuffered Clos networks. The AD algorithm uses a state feedback scheme to avoid contention in the central modules, and it achieves higher throughput than CRRD in behavioural simulations. Two asynchronous Clos networks using the AD algorithm are implemented and compared with a synchronous Clos network using the CRRD algorithm. The asynchronous Clos scheduler is smaller than its synchronous counterpart. The synchronous Clos network nevertheless achieves higher throughput than the asynchronous ones, because the asynchronous networks cannot hide the arbitration latency and their data paths are slow. The asynchronous Clos scheduler consumes significantly less power than the synchronous scheduler, and the asynchronous Clos network using bundled-data data switches shows the best power efficiency of all the implementations.
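To make the idea of dispatching with state feedback concrete, the following is a minimal behavioural sketch in Python. It is not the AD algorithm of this study, whose details the abstract does not give. It only assumes a three-stage Clos network with k input modules (IMs), k output modules (OMs) of n ports each, and m central modules (CMs), and it assumes that an IM can see which CM-to-OM links are busy before it picks a CM. The class name `ClosState`, its fields, and the round-robin search order are all hypothetical, and contention for an individual output port is ignored.

```python
from dataclasses import dataclass, field


@dataclass
class ClosState:
    """Link occupancy of a toy three-stage Clos network (hypothetical model)."""
    m: int  # number of central modules (CMs)
    n: int  # ports per input/output module
    k: int  # number of input modules (IMs) and output modules (OMs)
    im_cm_busy: set = field(default_factory=set)  # busy (im, cm) links
    cm_om_busy: set = field(default_factory=set)  # busy (cm, om) links
    pointer: dict = field(default_factory=dict)   # per-IM round-robin pointer

    def dispatch(self, in_port, out_port):
        """Reserve a path and return the chosen CM, or None if blocked."""
        im, om = in_port // self.n, out_port // self.n
        start = self.pointer.get(im, 0)
        for offset in range(self.m):
            cm = (start + offset) % self.m
            # Assumed "state feedback": the IM checks the CM->OM link state
            # up front, so it never sends a request that the CM would reject.
            if (im, cm) not in self.im_cm_busy and (cm, om) not in self.cm_om_busy:
                self.im_cm_busy.add((im, cm))
                self.cm_om_busy.add((cm, om))
                self.pointer[im] = (cm + 1) % self.m
                return cm
        return None

    def release(self, in_port, out_port, cm):
        """Free the two links held by a finished transfer."""
        self.im_cm_busy.discard((in_port // self.n, cm))
        self.cm_om_busy.discard((cm, out_port // self.n))


net = ClosState(m=2, n=2, k=2)
first = net.dispatch(0, 2)    # IM 0 -> OM 1
second = net.dispatch(1, 3)   # IM 0 -> OM 1, must use the other CM
blocked = net.dispatch(2, 2)  # IM 1 -> OM 1, both CM->OM 1 links are taken
net.release(0, 2, first)
retry = net.dispatch(2, 2)    # succeeds once a link has been released
```

In this sketch a request is either granted a complete path or held back at the input module, which is the sense in which contention inside the central modules is avoided. A real asynchronous implementation would replace the sequential loop with arbiters and handshake signals.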
