Wave-Pipelining the Global Interconnect to Reduce the Associated Delays

Most digital circuits and systems use a synchronous clocking methodology. With clock distribution networks dissipating ever more power and wire delays expected to become dominant, there has been increased interest in alternative solutions. This paper explores potential methods for reducing global interconnect delays and improving throughput between communicating modules. Classical repeater insertion is analyzed, and a wave-pipelined repeater insertion scheme that addresses some of its shortfalls is proposed. An extension of the wave-pipelined scheme is then presented, and results show that its data retention capability offers reliable communication between any number of computing elements. The design of the communication channel assumes that the computing elements employ synchronous clocking while the communication channels are driven by locally generated clocks. Generating clocks locally along the communication channel avoids clock distribution complexities and makes it possible to stop and start data transfer along the channel without elaborate clock gating circuitry. Furthermore, no additional clock cycles are required to flush the pipe in the event of stalls. The circuitry that generates the local clocks increases area and power, but it offers significant performance advantages, particularly in providing a seamless interface between communicating modules running at different clock frequencies. Simulation results for the distributed FIFO communication channel in a modest 180 nm technology show locally generated clocks running at 2.22 GHz with the memory buffers placed 2 mm apart.
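As background for the classical repeater insertion mentioned above, the following is a minimal sketch of the textbook first-order delay model, not the analysis performed in the paper. It assumes a distributed RC wire whose 50% delay is 0.38·r·c·L², and it assumes each repeater adds a fixed delay `t_rep`; the function names and the numeric values in the usage note are hypothetical and chosen only for illustration.

```python
import math


def unrepeated_delay(r, c, length):
    """50% delay of a distributed RC line: 0.38 * r * c * length**2.

    r and c are per-unit-length resistance and capacitance.
    """
    return 0.38 * r * c * length ** 2


def repeated_delay(r, c, length, k, t_rep):
    """Delay when the wire is split into k equal segments.

    Assumption: every repeater contributes a fixed delay t_rep,
    independent of its load (a deliberate simplification).
    """
    segment = length / k
    return k * (unrepeated_delay(r, c, segment) + t_rep)


def best_segment_count(r, c, length, t_rep):
    """Integer segment count that minimizes repeated_delay.

    Minimizing 0.38*r*c*L**2 / k + k*t_rep over real k gives
    k = L * sqrt(0.38*r*c / t_rep); the two neighboring integers
    are then compared directly.
    """
    k_real = length * math.sqrt(0.38 * r * c / t_rep)
    candidates = {max(1, math.floor(k_real)), max(1, math.ceil(k_real))}
    return min(candidates, key=lambda k: repeated_delay(r, c, length, k, t_rep))
```

With hypothetical values of r = 100 Ω/mm, c = 0.2 pF/mm, a 10 mm wire, and t_rep = 50 ps, calling `best_segment_count(100.0, 0.2e-12, 10.0, 50e-12)` selects four segments, and `repeated_delay` with k = 4 is well below `unrepeated_delay` for the same wire. The unrepeated delay grows quadratically with length, while the repeated delay grows roughly linearly; that linear latency is what still limits a conventionally clocked link and what wave-pipelining aims to hide by keeping several data waves in flight.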
