Grid Architectural Issues: State-of-the-art and Future Trends

Grid architecture is one of the cornerstones for the successful development and proliferation of Grid computing. The scale, dynamism and openness of the Grid, together with demands on its reliability, security and manageability, pose unique challenges for software architecture. The main objective of the Architectural Issues Institute of the CoreGRID NoE is to significantly improve the architectural designs of future Grids by focusing on three key aspects: scalability of resources, adaptability, and dependability of Grid architectures and services. These research directions address the mandatory architectural properties of the Next Generation Grids as identified by the NGG reports: simplicity, resilience, scalability of services, and straightforward administration and configuration management. This paper presents the current state of the art in the research topics of the partners involved in the Architectural Issues Institute of the CoreGRID NoE, with special focus on scalable resource discovery, fault tolerance for Grid systems, and adaptability and performance prediction mechanisms for a self-manageable Grid infrastructure. Large-scale volunteer computing on desktop Grid platforms, a newly emerged field, is also an active area of research in this Institute. A significant problem there is making these platforms resilient enough to be opened up to commercial applications. Special focus is given to the future research trends on these topics as they emerge from the involvement of the CoreGRID partners.

1 This research work is carried out under the FP6 Network of Excellence CoreGRID funded by the European Commission (Contract IST-2002004265).

11 With the Department of Applied Informatics and Multimedia, Technological Educational Institute of Crete, Greece.
