A Design of Pipeline Chain Algorithm Based on Circuit Switching for MPI Broadcast Communication System
[1] Henry M. Levy, et al. A comparison of message passing and shared memory architectures for data parallel programs, 1994, ISCA '94.
[2] Danyao Wang, et al. MPI as an abstraction for software-hardware interaction for HPRCs, 2008, 2008 Second International Workshop on High-Performance Reconfigurable Computing Technology and Applications.
[3] Paul Chow, et al. The challenges of using an embedded MPI for hardware-based processing nodes, 2009, 2009 International Conference on Field-Programmable Technology.
[4] Philip Heidelberger, et al. Optimization of MPI collective communication on BlueGene/L systems, 2005, ICS '05.
[5] A. Skjellum, et al. eMPI/eMPICH: embedding MPI, 1996, Proceedings. Second MPI Developer's Conference.
[6] Robert A. van de Geijn, et al. Building a high-performance collective communication library, 1994, Proceedings of Supercomputing '94.
[7] Veljko M. Milutinovic, et al. Hardware approaches to cache coherence in shared-memory multiprocessors, Part 1, 1994, IEEE Micro.
[8] R. Rabenseifner, et al. Automatic MPI Counter Profiling of All Users: First Results on a CRAY T3E 900-512, 2004.
[9] Paul Marchal, et al. Flexible hardware/software support for message passing on a distributed shared memory architecture, 2005, Design, Automation and Test in Europe.
[10] P. Stenstrom. A survey of cache coherence schemes for multiprocessors, 1990, Computer.
[11] Luca Benini, et al. Networks on Chips: A New SoC Paradigm, 2002.
[12] Paul Chow, et al. TMD-MPI: An MPI Implementation for Multiple Processors Across Multiple FPGAs, 2006, 2006 International Conference on Field Programmable Logic and Applications.
[13] Sathish S. Vadhiyar, et al. Automatically Tuned Collective Communications, 2000, ACM/IEEE SC 2000 Conference (SC'00).
[14] Veljko M. Milutinovic, et al. Hardware approaches to cache coherence in shared-memory multiprocessors, Part 2, 1994, IEEE Micro.
[15] William J. Dally, et al. Principles and Practices of Interconnection Networks, 2004.
[16] Robert A. van de Geijn, et al. A Pipelined Broadcast for Multidimensional Meshes, 1995, Parallel Process. Lett.
[17] Rajeev Thakur, et al. Optimization of Collective Communication Operations in MPICH, 2005, Int. J. High Perform. Comput. Appl.
[18] D. G. Payne, et al. Broadcasting on Meshes with Worm-hole Routing, 1996.
[19] Robert A. van de Geijn, et al. Broadcasting on Meshes with Wormhole Routing, 1996, J. Parallel Distributed Comput.
[20] Javier Castillo, et al. Cluster architecture based on low cost reconfigurable hardware, 2008, 2008 International Conference on Field Programmable Logic and Applications.