A Risk Evaluation Model of PC Cluster with Seismic Data Processing

This study addresses the problem of evaluating the performance of a PC cluster on data-intensive seismic processing. The goal of the evaluation is to determine the relationship between the number of PC nodes and the scale of the tasks to be processed, and thereby to decide whether seismic data processing should be migrated from a mainframe computer to a PC cluster platform. In particular, this paper presents a performance evaluation model for PC clusters based on least-squares regression. The approach begins by collecting empirical data and comparing the computing performance of PC clusters, then establishes the performance evaluation model by least-squares regression and applies error analysis and a significance test. To assess the proposed model, a series of experiments on a PC cluster were designed and carried out. The results show that computational effort, communication traffic, and the relationship between them play an important role in task processing on a PC cluster. The study concludes that a PC cluster can replace a mainframe computer and thereby reduce the cost of processing seismic data.
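
The regression step described above can be illustrated with a minimal sketch. The abstract does not state the functional form of the model, so the form used here, runtime = b0 + b1 * (work / nodes) + b2 * nodes (a computation term plus a communication term), is an assumption made only for illustration, as are the function name `fit_cluster_model` and the synthetic measurements. The sketch fits the coefficients by least squares, then computes the residual error, the coefficient of determination, and the F statistic on which a significance test would be based.

```python
import numpy as np


def fit_cluster_model(nodes, work, runtime):
    """Least-squares fit of an ASSUMED model (not the paper's):
    runtime ~ b0 + b1 * (work / nodes) + b2 * nodes.
    """
    nodes = np.asarray(nodes, dtype=float)
    work = np.asarray(work, dtype=float)
    runtime = np.asarray(runtime, dtype=float)

    # Design matrix: intercept, per-node computation, node-count term.
    X = np.column_stack([np.ones_like(nodes), work / nodes, nodes])
    coef, *_ = np.linalg.lstsq(X, runtime, rcond=None)

    # Error analysis: residual and total sums of squares.
    resid = runtime - X @ coef
    ss_res = float(resid @ resid)
    ss_tot = float(((runtime - runtime.mean()) ** 2).sum())
    r2 = 1.0 - ss_res / ss_tot

    # Overall F statistic with (k - 1, n - k) degrees of freedom;
    # comparing it with a critical value gives the significance test.
    n, k = X.shape
    f_stat = ((ss_tot - ss_res) / (k - 1)) / (ss_res / (n - k))
    return coef, r2, f_stat


# Synthetic, made-up measurements (NOT data from the paper).
rng = np.random.default_rng(0)
node_grid, work_grid = np.meshgrid([2, 4, 8, 16, 32], [100, 200, 400])
nodes = node_grid.ravel().astype(float)
work = work_grid.ravel().astype(float)
runtime = 5.0 + 0.8 * work / nodes + 0.3 * nodes + rng.normal(0.0, 0.5, nodes.size)

coef, r2, f_stat = fit_cluster_model(nodes, work, runtime)
print("coefficients:", coef)
print("R^2:", r2, "F:", f_stat)
```

With real measurements, the fitted coefficients would indicate how runtime trades off between per-node computation and the cost of adding nodes, which is the relationship the evaluation is meant to expose.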
