Models and Scheduling Algorithms for Mixed Data and Task Parallel Programs
James Demmel | Katherine A. Yelick | Soumen Chakrabarti
[1] Geoffrey C. Fox,et al. Runtime array redistribution in HPF programs , 1994, Proceedings of IEEE Scalable High Performance Computing Conference.
[2] Ronald L. Graham,et al. Bounds for Multiprocessor Scheduling with Resource Constraints , 1975, SIAM J. Comput..
[3] Sachin S. Sapatnekar,et al. A Convex Programming Approach for Exploiting Data and Functional Parallelism on Distributed Memory Multicomputers , 1994, 1994 International Conference on Parallel Processing Vol. 2.
[4] Jaspal Subhlok,et al. Optimal mapping of sequences of data parallel tasks , 1995, PPOPP '95.
[5] Ian Foster,et al. A compilation system that integrates High Performance Fortran and Fortran M , 1994, Proceedings of IEEE Scalable High Performance Computing Conference.
[6] He Huang,et al. On the concurrency of C++ , 1993, Proceedings of ICCI'93: 5th International Conference on Computing and Information.
[7] R. Tarjan,et al. The analysis of a nested dissection algorithm , 1987 .
[8] R. C. Whaley,et al. LAPACK Working Note 73: Basic Linear Algebra Communication Subprograms: Analysis and Implementation Across Multiple Parallel Architectures , 1994 .
[9] Joseph W. H. Liu,et al. The Multifrontal Method for Sparse Matrix Solution: Theory and Practice , 1992, SIAM Rev..
[10] Jeffery D. Rutter. LAPACK Working Note 69: A Serial Implementation of Cuppen's Divide and Conquer Algorithm for the Symmetric Eigenvalue Problem , 1994 .
[11] Bernard Tourancheau,et al. Performance Complexity of LU Factorization with Efficient Pipelining and Overlap on a Multiprocessor , 1993 .
[12] W. Rudin. Real and complex analysis , 1968 .
[13] J. A. Spahr,et al. Parallelization and Distribution of a Coupled Atmosphere–Ocean General Circulation Model , 1993 .
[14] James Demmel,et al. ScaLAPACK: A Portable Linear Algebra Library for Distributed Memory Computers - Design Issues and Performance , 1995, Proceedings of the 1996 ACM/IEEE Conference on Supercomputing.
[15] Guy E. Blelloch,et al. Implementation of a portable nested data-parallel language , 1993, PPOPP '93.
[16] Jaspal Subhlok,et al. A new model for integrated nested task and data parallel programming , 1997, PPOPP '97.
[17] Ken Kennedy,et al. Compiling Fortran D for MIMD distributed-memory machines , 1992, CACM.
[18] Xiaobai Sun,et al. Parallel performance of a symmetric eigensolver based on the invariant subspace decomposition approach , 1994, Proceedings of IEEE Scalable High Performance Computing Conference.
[19] Ronald L. Graham,et al. Bounds on Multiprocessing Timing Anomalies , 1969, SIAM Journal on Applied Mathematics.
[20] J. Cuppen. A divide and conquer method for the symmetric tridiagonal eigenproblem , 1980 .
[21] Fikret Erçal,et al. Time-Efficient Maze Routing Algorithms on Reconfigurable Mesh Architectures , 1997, J. Parallel Distributed Comput..
[22] Rajeev Motwani,et al. Scheduling problems in parallel query optimization , 1995, PODS '95.
[23] Mihalis Yannakakis,et al. Towards an Architecture-Independent Analysis of Parallel Algorithms , 1990, SIAM J. Comput..
[24] Katherine Yelick,et al. Parallel timing simulation on a distributed memory multiprocessor , 1993, Proceedings of 1993 International Conference on Computer Aided Design (ICCAD).
[25] Prasoon Tiwari,et al. Scheduling malleable and nonmalleable parallel tasks , 1994, SODA '94.
[26] V. Sarkar,et al. Automatic partitioning of a program dependence graph into parallel tasks , 1991, IBM J. Res. Dev..
[27] Shang-Hua Teng,et al. High Performance FORTRAN for Highly Unstructured Problems. , 1997, ACM SIGPLAN Symposium on Principles & Practice of Parallel Programming.
[28] Jack Dongarra,et al. A Proposal for a User-Level, Message-Passing Interface in a Distributed Memory Environment , 1993 .
[29] Tao Yang,et al. On the Granularity and Clustering of Directed Acyclic Task Graphs , 1993, IEEE Trans. Parallel Distributed Syst..
[30] Remzi H. Arpaci-Dusseau,et al. Empirical evaluation of the CRAY-T3D: a compiler perspective , 1995, Proceedings 22nd Annual International Symposium on Computer Architecture.
[31] Rice University,et al. High performance Fortran language specification , 1993 .
[32] James Demmel,et al. The Performance of Finding Eigenvalues and Eigenvectors of Dense Symmetric Matrices on Distributed Memory Computers , 1995, PPSC.
[33] Jaeyoung Choi,et al. A Proposal for a Set of Parallel Basic Linear Algebra Subprograms , 1995, PARA.
[34] Thomas R. Gross,et al. Exploiting task and data parallelism on a multicomputer , 1993, PPOPP '93.
[35] Scott B. Baden,et al. Programming Abstractions for Dynamically Partitioning and Coordinating Localized Scientific Calculations Running on Multiprocessors , 1991, SIAM J. Sci. Comput..
[36] Robert D. Blumofe,et al. Scheduling multithreaded computations by work stealing , 1994, Proceedings 35th Annual Symposium on Foundations of Computer Science.
[37] S. Muthukrishnan,et al. Resource scheduling for parallel database and scientific applications , 1996, SPAA '96.
[38] R. van de Geijn,et al. A look at scalable dense linear algebra libraries , 1992, Proceedings Scalable High Performance Computing Conference SHPCC-92..
[39] J. Pasciak,et al. Computer solution of large sparse positive definite systems , 1982 .
[40] James Demmel,et al. Design of a Parallel Nonsymmetric Eigenroutine Toolbox, Part I , 1993, PPSC.