HMPI: towards a message-passing library for heterogeneous networks of computers
[1] Luc Bougé,et al. A Portable and Adaptative Multi-protocol Communication Library for Multithreaded Runtime Systems , 2000, IPDPS Workshops.
[2] F. Pellegrini,et al. Static mapping by dual recursive bipartitioning of process architecture graphs , 1994, Proceedings of IEEE Scalable High Performance Computing Conference.
[3] Ramesh Subramonian,et al. LogP: towards a realistic model of parallel computation , 1993, PPOPP '93.
[4] Sandhya Dwarkadas,et al. Dynamic adaptation to available resources for parallel computing in an autonomous network of workstations , 2001, PPoPP '01.
[5] Anthony A. Maciejewski,et al. Task Matching and Scheduling in Heterogeneous Computing Environments Using a Genetic-Algorithm-Based Approach , 1997, J. Parallel Distributed Comput..
[6] Jameela Al-Jaroodi,et al. Modeling parallel applications performance on heterogeneous systems , 2003, Proceedings International Parallel and Distributed Processing Symposium.
[7] C. D. Gelatt,et al. Optimization by Simulated Annealing , 1983, Science.
[8] Chris J. Scheiman,et al. LogGP: incorporating long messages into the LogP model—one step closer towards a realistic model for parallel computation , 1995, SPAA '95.
[9] Bruce Hendrickson,et al. A Multi-Level Algorithm For Partitioning Graphs , 1995, Proceedings of the IEEE/ACM SC95 Conference.
[10] Andrew S. Grimshaw,et al. The Legion vision of a worldwide virtual computer , 1997, Commun. ACM.
[11] David E. Culler,et al. A case for NOW (networks of workstations) , 1995, PODC '95.
[12] Ed Anderson,et al. LAPACK users' guide - [release 1.0] , 1992 .
[13] Howard Jay Siegel,et al. A dynamic matching and scheduling algorithm for heterogeneous computing systems , 1998, Proceedings Seventh Heterogeneous Computing Workshop (HCW'98).
[14] Alexey L. Lastovetsky,et al. An Approach to Assessment of Heterogeneous Parallel Algorithms , 2003, PaCT.
[15] Alexey Lastovetsky,et al. A language approach to high performance computing on heterogeneous networks , 2001 .
[16] Ming Wu,et al. Memory conscious task partition and scheduling in grid environments , 2004, Fifth IEEE/ACM International Workshop on Grid Computing.
[17] Michael W. Godfrey,et al. An overview of MSHN: the Management System for Heterogeneous Networks , 1999, Proceedings. Eighth Heterogeneous Computing Workshop (HCW'99).
[18] Harold S. Stone,et al. Critical Load Factors in Two-Processor Distributed Systems , 1978, IEEE Transactions on Software Engineering.
[19] Robert A. van de Geijn,et al. Scalability Issues Affecting the Design of a Dense Linear Algebra Library , 1994, J. Parallel Distributed Comput..
[20] Jack J. Dongarra,et al. A set of level 3 basic linear algebra subprograms , 1990, TOMS.
[21] Anthony Skjellum,et al. The Parallel Mathematical Libraries Project (PMLP): Overview, Design Innovations, and Preliminary Results , 1999, PaCT.
[22] Bruce Hendrickson,et al. The Chaco user's guide. Version 1.0 , 1993 .
[23] R. F. Freund,et al. Dynamic matching and scheduling of a class of independent tasks onto heterogeneous computing systems , 1999, Proceedings. Eighth Heterogeneous Computing Workshop (HCW'99).
[24] Katherine A. Yelick,et al. Portable Parallel Irregular Applications , 1995, PSLS.
[25] J. Ramanujam,et al. Memory-Constrained Communication Minimization for a Class of Array Computations , 2002, LCPC.
[26] Guy E. Blelloch,et al. A practical comparison of N-body algorithms , 1994, Parallel Algorithms.
[27] Francine Berman,et al. Application-Level Scheduling on Distributed Heterogeneous Networks , 1996, Proceedings of the 1996 ACM/IEEE Conference on Supercomputing.
[28] Mohammed J. Zaki,et al. Compile-Time Scheduling Algorithms for a Heterogeneous Network of Workstations , 1997, Comput. J..
[29] Ian T. Foster,et al. Managing Multiple Communication Methods in High-Performance Networked Computing Systems , 1997, J. Parallel Distributed Comput..
[30] Adrianos Lachanas,et al. MPI-FT: Portable Fault Tolerance Scheme for MPI , 2000, Parallel Process. Lett..
[31] Pat Morin. Coarse grained parallel computing on heterogeneous systems , 1998, SAC '98.
[32] Paolo Palazzari,et al. Real time pipelined system design through simulated annealing , 1996, J. Syst. Archit..
[33] Yong Yan,et al. Modeling and characterizing parallel computing performance on heterogeneous networks of workstations , 1995, Proceedings. Seventh IEEE Symposium on Parallel and Distributed Processing.
[34] William E. Johnston,et al. Grids as production computing environments: the engineering aspects of NASA's Information Power Grid , 1999, Proceedings. The Eighth International Symposium on High Performance Distributed Computing (Cat. No.99TH8469).
[35] Francine Berman,et al. Program Speedup in a Heterogeneous Computing Network , 1994, J. Parallel Distributed Comput..
[36] Harold S. Stone,et al. Multiprocessor Scheduling with the Aid of Network Flow Algorithms , 1977, IEEE Transactions on Software Engineering.
[37] Sathish S. Vadhiyar,et al. Towards an Accurate Model for Collective Communications , 2004, Int. J. High Perform. Comput. Appl..
[38] Jack J. Dongarra,et al. HARNESS and fault tolerant MPI , 2001, Parallel Comput..
[39] Alexey Lastovetsky,et al. AN OVERVIEW OF HETEROGENEOUS HIGH PERFORMANCE AND GRID COMPUTING , 2004 .
[40] Füsun Özgüner,et al. Dynamic, competitive scheduling of multiple DAGs in a distributed heterogeneous environment , 1998, Proceedings Seventh Heterogeneous Computing Workshop (HCW'98).
[41] Shuichi Ichikawa,et al. An execution-time estimation model for heterogeneous clusters , 2004, 18th International Parallel and Distributed Processing Symposium, 2004. Proceedings..
[42] Jorge G. Barbosa,et al. Simulation of data distribution strategies for LU factorization on heterogeneous machines , 2003, Proceedings International Parallel and Distributed Processing Symposium.
[43] Baruch Awerbuch,et al. An Opportunity Cost Approach for Job Assignment in a Scalable Computing Cluster , 2000, IEEE Trans. Parallel Distributed Syst..
[44] Xiaodong Zhang,et al. Erratum: "An Effective and Practical Performance Prediction Model for Parallel Computing on Nondedicated Heterogeneous NOW" , 1997, J. Parallel Distributed Comput..
[45] Li Xiao,et al. Dynamic Cluster Resource Allocations for Jobs with Known and Unknown Memory Demands , 2002, IEEE Trans. Parallel Distributed Syst..
[46] Andrea Clematis,et al. Modeling performance of heterogeneous parallel computing systems , 1999, Parallel Comput..
[47] Sajal K. Das,et al. Graph partitioning for parallel applications in heterogeneous Grid environments , 2002, Proceedings 16th International Parallel and Distributed Processing Symposium.
[48] Alexey L. Lastovetsky,et al. On performance analysis of heterogeneous parallel algorithms , 2004, Parallel Comput..
[49] Franck Cappello,et al. HiHCoHP-Toward a realistic communication model for hierarchical hyperclusters of heterogeneous processors , 2001, Proceedings 15th International Parallel and Distributed Processing Symposium. IPDPS 2001.
[50] Sathish S. Vadhiyar,et al. Automatically Tuned Collective Communications , 2000, ACM/IEEE SC 2000 Conference (SC'00).
[51] Philip J. Hatcher,et al. Data-Parallel Programming on MIMD Computers , 1991, IEEE Trans. Parallel Distributed Syst..
[52] Francine Berman,et al. Adaptive Computing on the Grid Using AppLeS , 2003, IEEE Trans. Parallel Distributed Syst..
[53] Alexey Lastovetsky,et al. Towards a Realistic Performance Model for Networks of Heterogeneous Computers , 2005 .
[54] Ladislau Bölöni,et al. A comparison study of static mapping heuristics for a class of meta-tasks on heterogeneous computing systems , 1999, Proceedings. Eighth Heterogeneous Computing Workshop (HCW'99).
[55] Gary L. Miller,et al. A unified geometric approach to graph separators , 1991, [1991] Proceedings 32nd Annual Symposium of Foundations of Computer Science.
[56] Anthony Skjellum,et al. A High-Performance, Portable Implementation of the MPI Message Passing Interface Standard , 1996, Parallel Comput..
[57] Jaeyoung Choi,et al. Design and Implementation of the ScaLAPACK LU, QR, and Cholesky Factorization Routines , 1994, Sci. Program..
[58] Massachusett Framingham,et al. The Common Object Request Broker: Architecture and Specification Version 3 , 2003 .
[59] Jack Dongarra,et al. PVM: Parallel virtual machine: a users' guide and tutorial for networked parallel computing , 1995 .
[60] R. M. Mattheyses,et al. A Linear-Time Heuristic for Improving Network Partitions , 1982, 19th Design Automation Conference.
[61] Leslie G. Valiant,et al. A bridging model for parallel computation , 1990, CACM.
[62] Chris Walshaw,et al. Mesh Partitioning: A Multilevel Balancing and Refinement Algorithm , 2000, SIAM J. Sci. Comput..
[63] Lalit M. Patnaik,et al. Genetic algorithms: a survey , 1994, Computer.
[64] Alexey Lastovetsky. Parallel computing on heterogeneous networks , 2003 .
[65] Tamara G. Kolda,et al. Partitioning Rectangular and Structurally Unsymmetric Sparse Matrices for Parallel Processing , 1999, SIAM J. Sci. Comput..
[66] Anthony Skjellum,et al. MPI/FT™: architecture and taxonomies for fault-tolerant, message-passing middleware for performance-portable parallel computing , 2001, Proceedings First IEEE/ACM International Symposium on Cluster Computing and the Grid.
[67] Yuefan Deng,et al. New trends in high performance computing , 2001, Parallel Computing.
[68] Alexey L. Lastovetsky,et al. Adaptive parallel computing on heterogeneous networks with mpC , 2002, Parallel Comput..
[70] Viktor K. Prasanna,et al. Efficient collective communication in distributed heterogeneous systems , 1999, Proceedings. 19th IEEE International Conference on Distributed Computing Systems (Cat. No.99CB37003).
[71] Ümit V. Çatalyürek,et al. Decomposing Irregularly Sparse Matrices for Parallel Matrix-Vector Multiplication , 1996, IRREGULAR.
[72] Vipin Kumar,et al. A New Algorithm for Multi-objective Graph Partitioning , 1999, Euro-Par.
[73] George Karypis,et al. Introduction to Parallel Computing , 1994 .
[74] Kees Verstoep,et al. Fast Measurement of LogP Parameters for Message Passing Platforms , 2000, IPDPS Workshops.
[75] Yan Alexander Li,et al. Minimizing the Application Execution Time Through Scheduling of Subtasks and Communication Traffic in a Heterogeneous Computing System , 1997, IEEE Trans. Parallel Distributed Syst..
[76] Greg Burns,et al. LAM: An Open Cluster Environment for MPI , 2002 .
[77] Laxmikant V. Kalé,et al. CHARM++: a portable concurrent object oriented system based on C++ , 1993, OOPSLA '93.
[78] P. Raghavan. Line and plane separators , 1993 .
[79] Vipin Kumar,et al. A Fast and High Quality Multilevel Scheme for Partitioning Irregular Graphs , 1998, SIAM J. Sci. Comput..
[80] Debasish Ghose,et al. Scheduling Divisible Loads in Parallel and Distributed Systems , 1996 .
[81] Chris Peterson,et al. Implementing a Performance Forecasting System for Metacomputing The Network Weather Service , 1997, ACM/IEEE SC 1997 Conference (SC'97).
[82] Carl Kesselman,et al. A Network Performance Tool for Grid Environments , 1999, ACM/IEEE SC 1999 Conference (SC'99).
[83] Vipin Kumar,et al. A Unified Algorithm for Load-balancing Adaptive Scientific Simulations , 2000, ACM/IEEE SC 2000 Conference (SC'00).
[84] Yves Robert,et al. Partitioning a Square into Rectangles: NP-Completeness and Approximation Algorithms , 2002, Algorithmica.
[85] Viktor K. Prasanna,et al. Block‐cyclic redistribution over heterogeneous networks , 2004, Cluster Computing.
[86] Rossen Dimitrov,et al. Overlapping of Communication and Computation and Early Binding: Fundamental Mechanisms for Improving , 2001 .
[87] Michael G. Norman,et al. Models of machines and computation for mapping in multicomputers , 1993, CSUR.
[88] John K. Antonio,et al. Software support for heterogeneous computing , 1996, CSUR.
[89] Brian W. Kernighan,et al. An efficient heuristic procedure for partitioning graphs , 1970, Bell Syst. Tech. J..
[90] Bruce Hendrickson,et al. An Improved Spectral Load Balancing Method , 1993, PPSC.
[91] Dhabaleswar K. Panda,et al. Efficient collective communication on heterogeneous networks of workstations , 1998, Proceedings. 1998 International Conference on Parallel Processing (Cat. No.98EX205).
[92] Nikolay N. Mirenkov,et al. Self-Explanatory Components: A New Programming Paradigm , 2001, Int. J. Softw. Eng. Knowl. Eng..
[93] Elizabeth A. Post,et al. Evaluating the parallel performance of a heterogeneous system , 2001 .
[94] Henri E. Bal,et al. Bandwidth-efficient collective communication for clustered wide area systems , 2000, Proceedings 14th International Parallel and Distributed Processing Symposium. IPDPS 2000.
[95] Francine Berman,et al. Modeling the effects of contention on the performance of heterogeneous applications , 1996, Proceedings of 5th IEEE International Symposium on High Performance Distributed Computing.
[96] Patrick Ciarlet,et al. On the validity of a front-oriented approach to partitioning large sparse graphs with a connectivity constraint , 2005, Numerical Algorithms.
[97] Sanjay Ranka,et al. Array Decompositions for Nonuniform Computational Environments , 1996, J. Parallel Distributed Comput..
[98] Bruce Hendrickson,et al. The Torus-Wrap Mapping for Dense Matrix Calculations on Massively Parallel Computers , 1994, SIAM J. Sci. Comput..
[99] Ming Wu,et al. Grid Harvest Service: a system for long-term, application-level task scheduling , 2003, Proceedings International Parallel and Distributed Processing Symposium.
[100] Bruce Lowekamp,et al. ECO: Efficient Collective Operations for communication on heterogeneous networks , 1996, Proceedings of International Conference on Parallel Processing.
[101] Yves Robert,et al. Heterogeneity Considered Harmful to Algorithm Designers , 2000, CLUSTER.
[102] Curt Jones,et al. A Heuristic for Reducing Fill-In in Sparse Matrix Factorization , 1993, PPSC.
[103] Alexey L. Lastovetsky,et al. Heterogeneous Distribution of Computations Solving Linear Algebra Problems on Networks of Heterogeneous Computers , 2001, J. Parallel Distributed Comput..
[104] Laxmikant V. Kale,et al. Object-Based Adaptive Load Balancing for MPI Programs , 2000 .
[105] Saman Amarasinghe,et al. The SUIF compiler for scalable parallel machines , 1995 .
[106] Ian T. Foster,et al. Globus: a Metacomputing Infrastructure Toolkit , 1997, Int. J. High Perform. Comput. Appl..
[107] Henri Casanova,et al. NetSolve: A Network Server for Solving Computational Science Problems , 1996, Proceedings of the 1996 ACM/IEEE Conference on Supercomputing.
[108] Sivan Toledo,et al. A survey of out-of-core algorithms in numerical linear algebra , 1999, External Memory Algorithms.
[109] Bruce Hendrickson,et al. Skewed Graph Partitioning , 1997, PP.
[110] Horst D. Simon,et al. Fast multilevel implementation of recursive spectral bisection for partitioning unstructured problems , 1994, Concurr. Pract. Exp..
[111] R. F. Freund,et al. Dynamic Mapping of a Class of Independent Tasks onto Heterogeneous Computing Systems , 1999, J. Parallel Distributed Comput..
[112] Ian Foster,et al. The Grid 2 - Blueprint for a New Computing Infrastructure, Second Edition , 1998, The Grid 2, 2nd Edition.
[113] Tiffani L. Williams,et al. A general-purpose model for heterogeneous computation , 2000 .
[114] Pawel Wolniewicz,et al. Out-of-Core Divisible Load Processing , 2003, IEEE Trans. Parallel Distributed Syst..
[115] João Gabriel Silva,et al. WMPI - Message Passing Interface for Win32 Clusters , 1998, PVM/MPI.
[116] Vipin Kumar,et al. Parallel Multilevel Algorithms for Multi-constraint Graph Partitioning (Distinguished Paper) , 2000, Euro-Par.
[117] Michael J. Quinn,et al. Block data decomposition for data-parallel programming on a heterogeneous workstation network , 1993, [1993] Proceedings The 2nd International Symposium on High Performance Distributed Computing.
[118] Tiffani L. Williams,et al. The Heterogeneous Bulk Synchronous Parallel Model , 2000, IPDPS Workshops.
[119] Andy C. Downton,et al. Development of a fine-grained parallel Karhunen Loève transform , 2004, J. Parallel Distributed Comput..
[120] R. F. Freund,et al. Scheduling resources in multi-user, heterogeneous, computing environments with SmartNet , 1998, Proceedings Seventh Heterogeneous Computing Workshop (HCW'98).
[121] Steven Fortune,et al. Parallelism in random access machines , 1978, STOC.
[122] Stephen R. Schach,et al. A Shifting Algorithm for Min-Max Tree Partitioning , 1980, JACM.
[123] David Fernández-Baca,et al. Allocating Modules to Processors in a Distributed System , 1989, IEEE Trans. Software Eng..
[124] Jack J. Dongarra,et al. Performance Analysis of MPI Collective Operations , 2005, IPDPS.
[125] J. Pasciak,et al. Computer solution of large sparse positive definite systems , 1982 .
[126] Csaba Andras Moritz,et al. LoGPC: modeling network contention in message-passing programs , 1998, SIGMETRICS '98/PERFORMANCE '98.
[127] Jack J. Dongarra,et al. Algorithmic Redistribution Methods for Block-Cyclic Decompositions , 1999, IEEE Trans. Parallel Distributed Syst..
[128] Min-You Wu,et al. A high-performance mapping algorithm for heterogeneous computing systems , 2001, Proceedings 15th International Parallel and Distributed Processing Symposium. IPDPS 2001.
[129] Miron Livny,et al. Condor-a hunter of idle workstations , 1988, [1988] Proceedings. The 8th International Conference on Distributed.
[130] Pawel Wolniewicz,et al. Divisible Load Scheduling in Systems with Limited Memory , 2004, Cluster Computing.
[131] Vipin Kumar,et al. Multilevel Algorithms for Multi-Constraint Graph Partitioning , 1998, Proceedings of the IEEE/ACM SC98 Conference.
[132] Vipin Kumar,et al. Multilevel k-way hypergraph partitioning , 1999, DAC '99.
[133] Vipin Kumar,et al. A Coarse-Grain Parallel Formulation of Multilevel k-way Graph Partitioning Algorithm , 1997, PP.
[134] Peter Norvig,et al. Artificial Intelligence: A Modern Approach , 1995 .
[135] T. von Eicken,et al. Parallel programming in Split-C , 1993, Supercomputing '93.
[136] Alex Pothen,et al. PARTITIONING SPARSE MATRICES WITH EIGENVECTORS OF GRAPHS , 1990 .
[137] James Demmel,et al. ScaLAPACK: A Portable Linear Algebra Library for Distributed Memory Computers - Design Issues and Performance , 1995, PARA.
[138] Jack Dongarra,et al. MPI: The Complete Reference , 1996 .
[139] Yves Robert,et al. Matrix Multiplication on Heterogeneous Platforms , 2001, IEEE Trans. Parallel Distributed Syst..
[140] Richard Wolski,et al. Predicting the CPU availability of time‐shared Unix systems on the computational grid , 2004, Cluster Computing.
[141] Jack Dongarra,et al. ScaLAPACK Users' Guide , 1987 .
[142] Jorge G. Barbosa,et al. Linear algebra algorithms in a heterogeneous cluster of personal computers , 2000, Proceedings 9th Heterogeneous Computing Workshop (HCW 2000) (Cat. No.PR00556).
[143] Alexey Lastovetsky,et al. A parallel language and its programming system for heterogeneous networks , 2000 .
[144] Xian-He Sun. Scalability versus Execution Time in Scalable Systems , 2002, J. Parallel Distributed Comput..
[145] Stephen R. Schach,et al. Max-Min Tree Partitioning , 1981, JACM.
[146] Yves Robert,et al. A Proposal for a Heterogeneous Cluster ScaLAPACK (Dense Linear Solvers) , 2001, IEEE Trans. Computers.
[147] Ümit V. Çatalyürek,et al. Decomposing Linear Programs for Parallel Solution , 1995, PARA.
[148] Cosimo Anglano,et al. Predicting parallel applications performance on non-dedicated cluster platforms , 1998, ICS '98.