Paper information - Workload and network-optimized computing systems

Workload and network-optimized computing systems

This paper describes a recent system-level trend toward the use of massive on-chip parallelism combined with efficient hardware accelerators and integrated networking to enable new classes of applications and computing-systems functionality. This system transition is driven by semiconductor physics and emerging network-application requirements. In contrast to general-purpose approaches, workload and network-optimized computing provides significant cost, performance, and power advantages relative to historical frequency-scaling approaches in a serial computational model. We highlight the advantages of on-chip network optimization that enables efficient computation and new services at the network edge of the data center. Software and application development challenges are presented, and a service-oriented architecture application example is shown that characterizes the power and performance advantages for these systems. We also discuss a roadmap for next-generation systems that proportionally scale with future networking bandwidth growth rates and employ 3-D chip integration methods for design flexibility and modularity.
