LHCb distributed data analysis on the computing grid

LHCb is one of the four Large Hadron Collider (LHC) experiments based at CERN, the European Organization for Nuclear Research. The LHC experiments will start taking an unprecedented amount of data when they come online in 2007. Since no single institute has the computing resources to handle this data, resources must be pooled to form the Grid. Where the Internet has made it possible to share information stored on computers across the world, Grid computing aims to provide access to computing power and storage capacity on geographically distributed systems. LHCb software applications must work seamlessly on the Grid, allowing users to access distributed computing resources efficiently. It is essential to the success of the LHCb experiment that physicists can access data from the detector, stored in many heterogeneous systems, to perform distributed data analysis. This thesis describes the work performed to enable distributed data analysis for the LHCb experiment on the LHC Computing Grid.
