CAMPUS GRIDS: A FRAMEWORK TO FACILITATE RESOURCE SHARING

It is common for research institutions to maintain multiple clusters. These clusters might fulfill different needs and policies, or represent different owners or generations of hardware. Many of these clusters are underutilized, while researchers in other departments may require those resources. This problem can be solved by linking clusters with grid middleware. This thesis describes a distributed high-throughput computing framework that links clusters without changing their security or execution environments. The framework initially keeps jobs local to the submitter, overflowing when necessary to the campus and then to the regional grid. The framework is implemented spanning two campuses at the Holland Computing Center. We evaluate the framework against five characteristics of campus grids. The framework is then further expanded to bridge campus grids into a regional grid and to overflow to national cyberinfrastructure.

ACKNOWLEDGMENTS

I would like to thank my advisor, David Swanson. I appreciated his guidance and patience while I implemented this thesis project. I also thank David for hiring me as a freshman with little computing experience, and for giving me the support to learn and excel at scientific computing. I want to thank Dan Fraser for bringing this interesting project to me and giving me the opportunity to create my own solution to this problem. I want to thank Brian Bockelman for excellent technical advice during my time at HCC. He worked diligently with me on my thesis. I also want to thank all the people at the Holland Computing Center. I have broken their systems, asked for advice, and caused an untold amount of extra work for them. I especially want to thank my colleagues for the assistance they've provided over the years:
