Study on Parallel Computing

In this paper, we present a general survey of parallel computing. The main contents cover parallel computer systems, which are the hardware platform of parallel computing; parallel algorithms, which are its theoretical basis; and parallel programming, which is its software support. We then introduce some parallel applications and enabling technologies. We argue that parallel computing research should form an integrated methodology of "architecture, algorithm, programming, application". Only in this way can parallel computing research develop continuously and become more realistic.
