Multi-toroidal Interconnects: Using Additional Communication Links to Improve Utilization of Parallel Computers

The three-dimensional torus is a common topology for multicomputer network interconnects because of its simplicity and high scalability. A parallel job submitted to a three-dimensional toroidal machine typically requires an isolated, contiguous, rectangular partition connected as a mesh or a torus. Such partitioning leads to fragmentation and thus reduces the machine's resource utilization. In particular, toroidal partitions often require the allocation of additional communication links to close the torus. If these links are treated as dedicated resources (because of the partition isolation requirement), they may prevent the allocation of other partitions that could otherwise use them. Overall, on toroidal machines, the likelihood of successfully allocating a new partition decreases as the number of toroidal partitions increases. This paper presents a novel "multi-toroidal" interconnect topology that can accommodate multiple adjacent meshed and toroidal partitions at the same time. We prove that this topology allows every free partition of the machine to be connected as a torus without affecting existing partitions. We also show that, for toroidal jobs, this interconnect topology increases machine utilization by a factor of 2 to 4 (depending on the workload) compared with three-dimensional toroidal machines. This effect holds across different scheduling policies. The BlueGene/L supercomputer being developed by IBM Research is an example of a multi-toroidal interconnect architecture.
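The link-contention problem described in the abstract can be illustrated with a deliberately simplified, one-dimensional sketch. It is not the paper's allocator or topology: the `Ring` class, its methods, and the 8-node size are hypothetical, and the model assumes that the only way to close a torus over a sub-segment of a ring is to dedicate every remaining link of the ring to the wraparound path.

```python
class Ring:
    """Toy 1-D torus (ring) of n nodes; link i joins node i and node (i + 1) % n.

    Hypothetical model for illustration only: nodes and links are treated as
    dedicated resources, mirroring the partition isolation requirement.
    """

    def __init__(self, n):
        self.n = n
        self.busy_nodes = set()
        self.busy_links = set()

    def _nodes(self, start, length):
        return {(start + i) % self.n for i in range(length)}

    def _links(self, start, length, torus):
        # Links between consecutive nodes inside the segment (mesh links).
        links = {(start + i) % self.n for i in range(length - 1)}
        if torus and length > 1:
            # Assumption: the wraparound from the last node back to the first
            # must travel over all remaining links of the ring.
            links |= {(start + i) % self.n for i in range(length - 1, self.n)}
        return links

    def allocate(self, start, length, torus=False):
        """Reserve a contiguous segment; fail if any node or link is taken."""
        nodes = self._nodes(start, length)
        links = self._links(start, length, torus)
        if nodes & self.busy_nodes or links & self.busy_links:
            return False
        self.busy_nodes |= nodes
        self.busy_links |= links
        return True


# Two mesh partitions share the ring without conflict.
meshes = Ring(8)
print(meshes.allocate(0, 4), meshes.allocate(4, 4))

# One toroidal partition claims the wraparound links, so the free nodes
# 4..7 can no longer be connected even as a mesh.
tori = Ring(8)
print(tori.allocate(0, 4, torus=True), tori.allocate(4, 4))
```

In this toy model the second allocation on `tori` fails although half of the nodes are idle, which is the utilization loss that additional communication links are meant to avoid.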
