A pipelined schedule to minimize completion time for loop tiling with computation and communication overlapping
Nectarios Koziris | Georgios I. Goumas | Aristidis Sotiropoulos
[1] François Irigoin, et al. Supernode partitioning, 1988, POPL '88.
[2] Matthias A. Blumrich. Network interface for protected, user-level communication, 1996.
[3] Weijia Shang, et al. Independent Partitioning of Algorithms with Uniform Dependencies, 1992, IEEE Trans. Computers.
[4] Jingling Xue, et al. On Tiling as a Loop Transformation, 1997, Parallel Process. Lett.
[5] Dhabaleswar K. Panda, et al. Design Alternatives for Virtual Interface Architecture and an Implementation on IBM Netfinity NT Cluster, 2001, J. Parallel Distributed Comput.
[6] Hermann Hellwagner, et al. SISCI - Implementing a Standard Software Infrastructure on an SCI Cluster, 1997.
[7] Nectarios Koziris, et al. Chain Grouping: A Method for Partitioning Loops onto Mesh-Connected Processor Arrays, 2000, IEEE Trans. Parallel Distributed Syst.
[8] Erik H. D'Hollander, et al. Partitioning and Labeling of Loops by Unimodular Transformations, 1992, IEEE Trans. Parallel Distributed Syst.
[9] Nectarios Koziris, et al. Evaluation of loop grouping methods based on orthogonal projection spaces, 2000, Proceedings 2000 International Conference on Parallel Processing.
[10] Andrew A. Chien, et al. Software overhead in messaging layers: where does the time go?, 1994, ASPLOS VI.
[11] Hiroshi Tezuka, et al. The design and implementation of zero copy MPI using commodity hardware with a high performance network, 1998, ICS '98.
[12] Jingling Xue, et al. Communication-Minimal Tiling of Uniform Dependence Loops, 1996, J. Parallel Distributed Comput.
[13] Yves Robert, et al. (Pen)-ultimate tiling?, 1994, Integr.
[14] Knut Omang, et al. VIA over SCI - consequences of a zero copy implementation, and comparison with VIA over myrinet, 2001, Proceedings 15th International Parallel and Distributed Processing Symposium. IPDPS 2001.
[15] David A. Patterson, et al. Computer Organization & Design: The Hardware/Software Interface, 1993.
[16] Weijia Shang, et al. Time Optimal Linear Schedules for Algorithms with Uniform Dependencies, 1991, IEEE Trans. Computers.
[17] Nectarios Koziris, et al. Optimal Scheduling for UET/UET-UCT Generalized n-Dimensional Grid Task Graphs, 1999, J. Parallel Distributed Comput.
[18] Thorsten von Eicken, et al. U-Net: a user-level network interface for parallel and distributed computing, 1995, SOSP.
[19] Weijia Shang, et al. On Supernode Transformation with Minimized Total Running Time, 1998, IEEE Trans. Parallel Distributed Syst.
[20] Wolfgang Rehm, et al. Memory Management in a Combined VIA/SCI Hardware, 2000, IPDPS Workshops.