A Multicast Tree Router for Multichip Neuromorphic Systems

We present a tree router for multichip systems that guarantees deadlock-free multicast packet routing without dropping packets or restricting their length. Multicast routing is required to efficiently connect the computational units of massively parallel systems when each unit is connected to thousands of others residing on multiple chips, as is the case in neuromorphic systems. Our tree router implements this one-to-many routing by branching recursively, broadcasting the packet within a specified subtree. Within this subtree, the packet is accepted only by chips that have been programmed to do so. This approach boosts throughput because memory look-ups are avoided en route, and it keeps the header compact because the header specifies only the route to the subtree's root. Deadlock is avoided by routing in two phases (an upward phase and a downward phase) and by restricting branching to the downward phase. This design is the first fully implemented wormhole router with packet branching that can never deadlock. The design's effectiveness is demonstrated in Neurogrid, a million-neuron neuromorphic system consisting of sixteen chips. Each chip has a 256 × 256 silicon-neuron array integrated with a full-custom asynchronous VLSI implementation of the router that delivers up to 1.17 Gwords/s across the sixteen-chip network with less than 1 μs jitter.
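The two-phase scheme described above can be illustrated with a small software sketch. This is not the router's hardware implementation; it is a toy model under stated assumptions: chips are the nodes of a complete binary tree numbered heap-style (root 1, children 2n and 2n+1), the header names only the subtree's root, and each chip's "programmed to accept" state is represented by membership in a set. The names `ancestors`, `route_multicast`, `accepting`, and `num_chips` are hypothetical and do not come from the paper.

```python
def ancestors(node):
    """Return [node, parent, ..., 1] for a heap-numbered binary tree (assumed layout)."""
    chain = [node]
    while node > 1:
        node //= 2
        chain.append(node)
    return chain


def route_multicast(source, subtree_root, accepting, num_chips):
    """Toy model of two-phase tree multicast (hypothetical helper, not the paper's design).

    Upward phase: climb from the source, without branching, until reaching
    a node that is the subtree root or one of its ancestors.
    Downward phase: descend to the subtree root, then branch recursively so
    every chip in that subtree sees the packet. Only chips listed in
    `accepting` deliver it.
    """
    chain = ancestors(subtree_root)
    on_chain = set(chain)

    # Upward phase: point-to-point only, no branching.
    up_path = []
    node = source
    while node not in on_chain:
        up_path.append(node)
        node //= 2
    turn = node  # where the packet switches from going up to going down

    # Downward phase, part 1: point-to-point from the turn node to the subtree root.
    down_path = list(reversed(chain[: chain.index(turn) + 1]))

    # Downward phase, part 2: branch recursively within the subtree.
    visited = []
    stack = [subtree_root]
    while stack:
        n = stack.pop()
        if n > num_chips:
            continue  # no such chip in this (assumed) tree
        visited.append(n)
        stack.extend((2 * n, 2 * n + 1))

    delivered = sorted(n for n in visited if n in accepting)
    return up_path, down_path, sorted(visited), delivered


# Example: 15 chips, packet from chip 9 to the subtree rooted at chip 5.
# Chip 4 is marked as accepting but lies outside the subtree, so it never sees the packet.
up, down, flooded, delivered = route_multicast(9, 5, {4, 10, 11}, 15)
```

In this model the header carries only the subtree root, so no per-hop table look-up is needed to forward the packet; filtering happens only at the chips inside the subtree, which mirrors the motivation given in the abstract.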
