Managing Long Running Queries in Grid Environment

Exceptionally large amounts of distributed data and computational resources are becoming available through the Grid. This will enable efficient exchange and processing of very large amounts of data combined with CPU-intensive computations, as required by many scientific applications. We propose a customizable Grid-based query processor built on top of an established Grid infrastructure, NorduGrid. It allows users to submit queries that wrap user-defined long-running operations, which filter and transform distributed, customized data. Limitations imposed by the underlying Grid infrastructure influence the architecture. For example, resource requirements have to be specified before Grid jobs are started, and delays may occur depending on the availability of the resources a job requires. We are developing a fully functional prototype to investigate the viability of the approach and its applicability. Our first application area is particle physics, where scientists analyze huge amounts of data produced by a collider or simulators to identify particles.
