Design of Cost-Efficient Interconnect Processing Units

Written by leading experts in the field, Design of Cost-Efficient Interconnect Processing Units: Spidergon STNoC comprehensively examines the current state of the art and future trends in multiprocessor system-on-chip (MPSoC) design, with a particular focus on network-on-chip (NoC) design. Using simple methods and easy-to-understand examples, the book covers a wealth of important theoretical and practical topics, such as deep sub-micron technology effects, generic NoC components, topological properties, embeddings of common communication patterns, and system-level design. A complementary CD-ROM features a practical NoC training approach based on the award-winning OCCN environment.
