A Taxonomy of Data Management Models in Distributed and Grid Environments

Distributed environments vary widely in architecture, from tightly coupled clusters to loosely coupled Grids and completely uncoupled peer-to-peer networks, and therefore differ in their working conditions as well as their performance. To meet the specific needs of these environments for data organization, replication, transfer, scheduling, and related tasks, data management systems implement different data management models. In this paper, the major data management tasks in distributed environments are identified and a taxonomy of the data management models in these environments is presented. The taxonomy is used to highlight the specific data management requirements of each environment and the strengths and weaknesses of the implemented data management models. The taxonomy is followed by a survey of different distributed and Grid environments and the data management models they implement. The taxonomy and the survey results are then used to identify issues and challenges in data management for future exploration.
