Beyond the Performance of Three-Tier Fat-Tree: Equality Topology with Low Diameter

We introduce a novel interconnect topology named Equality, which offers high performance and low diameter. Equality is based on chordal ring networks and advances previously discussed chordal ring topologies through a set of systematic linking strategies and routing rules. The Equality topology can be used to construct low-diameter networks with reasonably low router radices. Equality interconnects are highly symmetric, so the cabling rules and routing logic are simple. Compared with other networks, the Equality topology is flexible in the total number of routers: any even number is allowed. Equality can be applied in many settings, including supercomputing, data centers, cloud services, and enterprise cluster solutions. We evaluated Equality's performance using the open-source BookSim 2.0 package. Benchmarks of 10 traffic models for a system built from 40-port switches are presented to assess network performance in comparison with the popular three-tier fat-tree (3-T FT) structure. This case mimics the network architecture and switch parameters of the top-ranked supercomputer, Summit. The results show that Equality networks achieve lower latency than 3-T FTs in all five large-packet-size simulations (LPSSs) for three traffic models: uniform, neighbor, and transpose. The latency is lower than that of the 3-T FT in four of the five LPSSs for bitrev and randperm, and in three of the five LPSSs for bitrot and shuffle. In the third group of results, Equality has lower latency than the fat-tree under three hot-spot and three non-uniform traffic conditions in all five LPSSs with packets larger than 16 flits. The zero-load latency of Equality networks is lower than that of the 3-T FT under the same simulation constraints.
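To make the chordal ring idea concrete, the following minimal Python sketch builds a generic chordal ring (a ring plus fixed-offset chords) and measures its diameter by breadth-first search. It is an illustration only: the function names `chordal_ring` and `diameter` and the chord offsets are assumptions chosen for this example, and the sketch does not reproduce Equality's linking strategies or routing rules.

```python
from collections import deque


def chordal_ring(n, chords):
    """Adjacency sets for a ring of n nodes plus chords i <-> (i + c) mod n.

    Generic chordal ring for illustration; NOT the Equality linking rules.
    """
    adj = {i: set() for i in range(n)}
    for i in range(n):
        # Offset 1 gives the base ring; the remaining offsets are the chords.
        for step in (1, *chords):
            j = (i + step) % n
            if j != i:
                adj[i].add(j)
                adj[j].add(i)
    return adj


def diameter(adj):
    """Largest shortest-path hop count over all node pairs (BFS from each node)."""
    best = 0
    for src in adj:
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        if len(dist) != len(adj):
            raise ValueError("graph is not connected")
        best = max(best, max(dist.values()))
    return best


if __name__ == "__main__":
    plain = chordal_ring(16, ())        # ring only, degree 2
    chorded = chordal_ring(16, (4,))    # ring plus offset-4 chords, degree 4
    print(diameter(plain), diameter(chorded))
```

For 16 routers, the plain ring has a diameter of 8 hops, while adding a single chord offset of 4 reduces it to 3 at the cost of raising each router's degree from 2 to 4. This is the radix-versus-diameter trade-off that chordal ring designs exploit.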
