A framework for scheduling parallel DBMS user-defined programs on an attached high-performance computer

We describe a software framework for deploying, scheduling, and executing parallel DBMS user-defined programs on an attached high-performance computer (HPC) platform. The framework benefits many DBMS workloads in two ways. First, long-running user-defined programs can be sped up by exploiting the greater hardware parallelism available on the attached HPC platform. Second, off-loading these long-running programs to the attached HPC platform improves the interactive response time of the remaining applications on the database server platform. Our framework provides a new approach for integrating high-performance computing into the workflow of query-oriented, computationally intensive applications.
