Paper information - Reconfigurable hybrid interconnection for static and dynamic scientific applications

As we enter the era of peta-scale computing, system architects must plan for machines composed of tens or even hundreds of thousands of processors. Although fully connected networks such as fat-tree configurations currently dominate HPC interconnect designs, such approaches are inadequate for ultra-scale concurrencies due to the superlinear growth of component costs. Traditional low-degree interconnect topologies, such as 3D tori, have reemerged as a competitive solution due to the linear scaling of system components relative to the node count; however, such networks are poorly suited for the requirements of many scientific applications at extreme concurrencies. To address these limitations, we propose HFAST, a hybrid switch architecture that uses circuit switches to dynamically reconfigure lower-degree interconnects to suit the topological requirements of a given scientific application. This work presents several new research contributions. We develop an optimization strategy for HFAST mappings and demonstrate that efficiency gains can be attained across a broad range of static numerical computations. Additionally, we conduct an extensive analysis of the communication characteristics of a dynamically adapting mesh calculation and show that the HFAST approach can achieve significant advantages, even when compared with traditional fat-tree configurations. Overall results point to the promising potential of utilizing hybrid reconfigurable networks to interconnect future peta-scale architectures, for both static and dynamically adapting applications.

[1] Michael Lang, et al. A Performance and Scalability Analysis of the BlueGene/L Architecture, 2004, Proceedings of the ACM/IEEE SC2004 Conference.

[2] R. Luijten, et al. An Optical Packet-Switched Interconnect for Supercomputer Applications, 2004.

[3] Z. Lin, et al. Size scaling of turbulent transport in magnetically confined plasmas, 2002, Physical Review Letters.

[4] Burkhard Monien. The complexity of embedding graphs into binary trees, 1985, FCT.

[5] Joel Mambretti, et al. Optical Switching Middleware for the OptIPuter, 2003.

[6] J. Qiang, et al. A parallel particle-in-cell model for beam-beam interaction in high energy ring colliders, 2004.

[7] Leonid Oliker, et al. Integrated performance monitoring of a cosmology application on leading HEC platforms, 2005, 2005 International Conference on Parallel Processing (ICPP'05).

[8] John B. Bell, et al. Parallelization of structured, hierarchical adaptive mesh refinement algorithms, 2000.

[9] Philip Heidelberger, et al. Optimizing task layout on the Blue Gene/L supercomputer, 2005, IBM J. Res. Dev.

[10] Leonid Oliker, et al. Performance Evaluation of Scientific Applications on Modern Parallel Vector Systems, 2006, VECPAR.

[11] Rami G. Melhem, et al. On the Feasibility of Optical Circuit Switching for High Performance Computing Systems, 2005, ACM/IEEE SC 2005 Conference (SC'05).

[12] Leonid Oliker, et al. Scientific Computations on Modern Parallel Vector Systems, 2004, Proceedings of the ACM/IEEE SC2004 Conference.

[13] Leonid Oliker, et al. Analyzing Ultra-Scale Application Communication Requirements for a Reconfigurable Hybrid Interconnect, 2005, ACM/IEEE SC 2005 Conference (SC'05).

[14] J. Haas, et al. Interaction of weak shock waves with cylindrical and spherical gas inhomogeneities, 1987, Journal of Fluid Mechanics.

[15] Vipul Gupta, et al. Performance analysis of a synchronous, circuit-switched interconnection cached network, 1994, ICS '94.

[16] Linda Vahala, et al. Lattice Boltzmann Model for Dissipative Incompressible MHD, 2001.