Integration of Cloud resources in the LHCb Distributed Computing

This contribution describes how Cloud resources have been integrated in the LHCb Distributed Computing. LHCb uses its own DIRAC extension (LHCbDirac) as the interware for its Distributed Computing, which so far has seamlessly integrated Grid resources and computer clusters. The Cloud extension of DIRAC (VMDIRAC) adds Cloud computing infrastructures to this picture: it can interact with multiple types of commercial and institutional clouds through several interfaces (Amazon EC2, OpenNebula, OpenStack and CloudStack), and it instantiates, monitors and manages the Virtual Machines running on this aggregation of Cloud resources. Moreover, the specifications for institutional Cloud resources proposed by the Worldwide LHC Computing Grid (WLCG), mainly by the High Energy Physics Unix Information Exchange (HEPiX) group, have been taken into account. Several initiatives and computing resource providers in the eScience environment already deployed IaaS in production during 2013. With this in mind, the pros and cons of a cloud-based infrastructure have been studied in contrast with the current setup. As a result, this work addresses four different use cases that represent a major improvement at several levels of our infrastructure. We describe the solution implemented by LHCb for the contextualisation of the VMs, based on the idea of a Cloud Site. We report on the operational experience of using several institutional Cloud resources in production, which are thus becoming an integral part of the LHCb Distributed Computing resources. Finally, we describe the gradual migration of our Service Infrastructure towards a fully distributed architecture following the Service as a Service (SaaS) model.
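The multi-cloud management pattern described above can be summarised in a short sketch. All class, method and endpoint names below are hypothetical illustrations, not the actual VMDIRAC API: the point is that a single scheduler drives heterogeneous clouds (EC2, OpenNebula, OpenStack and CloudStack in the real system) through one common driver interface, while keeping a registry it can use to monitor and manage the resulting Virtual Machines.

    # Minimal sketch of the multi-cloud pattern; names are invented, not VMDIRAC's API.
    from abc import ABC, abstractmethod
    from dataclasses import dataclass
    from typing import Dict, List


    @dataclass
    class VMRecord:
        """Bookkeeping entry for one instantiated Virtual Machine."""
        vm_id: str
        endpoint: str
        status: str = "Submitted"


    class CloudEndpoint(ABC):
        """Common interface hiding the per-provider (EC2/OpenNebula/...) details."""

        def __init__(self, name: str):
            self.name = name

        @abstractmethod
        def start_vm(self, image: str, user_data: str) -> str:
            """Boot one VM from `image`, passing `user_data` for contextualisation;
            return the provider-side VM identifier."""

        @abstractmethod
        def vm_status(self, vm_id: str) -> str:
            """Return a normalised status string (e.g. 'Running', 'Stopped')."""

        @abstractmethod
        def stop_vm(self, vm_id: str) -> None:
            """Terminate the VM and release its resources."""


    class FakeEndpoint(CloudEndpoint):
        """Stand-in driver so the sketch runs without real cloud credentials;
        a real driver would translate these calls into the cloud's own API."""

        def __init__(self, name: str):
            super().__init__(name)
            self._vms: Dict[str, str] = {}
            self._counter = 0

        def start_vm(self, image: str, user_data: str) -> str:
            self._counter += 1
            vm_id = f"{self.name}-vm-{self._counter}"
            self._vms[vm_id] = "Running"
            return vm_id

        def vm_status(self, vm_id: str) -> str:
            return self._vms.get(vm_id, "Unknown")

        def stop_vm(self, vm_id: str) -> None:
            self._vms[vm_id] = "Stopped"


    class VMScheduler:
        """Instantiates, monitors and manages VMs across an aggregation of clouds."""

        def __init__(self, endpoints: List[CloudEndpoint]):
            self.endpoints = endpoints
            self.registry: List[VMRecord] = []

        def submit(self, n_vms: int, image: str, user_data: str) -> None:
            # Round-robin over endpoints; a production scheduler would rank
            # sites by free capacity and by the number of pending payloads.
            for i in range(n_vms):
                ep = self.endpoints[i % len(self.endpoints)]
                vm_id = ep.start_vm(image, user_data)
                self.registry.append(VMRecord(vm_id, ep.name, "Running"))

        def monitor(self) -> None:
            by_name = {ep.name: ep for ep in self.endpoints}
            for rec in self.registry:
                rec.status = by_name[rec.endpoint].vm_status(rec.vm_id)


    if __name__ == "__main__":
        clouds = [FakeEndpoint("openstack-site"), FakeEndpoint("cloudstack-site")]
        scheduler = VMScheduler(clouds)
        scheduler.submit(4, image="cernvm-batch-node", user_data="#!/bin/sh\n...")
        scheduler.monitor()
        for rec in scheduler.registry:
            print(rec.endpoint, rec.vm_id, rec.status)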

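Contextualisation is what turns a freshly booted image into a worker node able to join the LHCb infrastructure. The sketch below is a hypothetical illustration of the Cloud Site idea, not the actual LHCb implementation: the proxy host, pilot command and site name are invented placeholders. A Cloud Site object groups the per-site parameters and renders them into the boot-time user-data script, which here points CVMFS at the site squid proxy, mounts the LHCb software repository and starts a pilot that pulls real payloads.

    # Minimal contextualisation sketch; all concrete values are placeholders.
    from dataclasses import dataclass


    @dataclass
    class CloudSite:
        """Per-site contextualisation parameters (names are illustrative)."""
        site_name: str
        cvmfs_http_proxy: str
        pilot_command: str


    def build_user_data(site: CloudSite) -> str:
        """Render the boot script passed as user-data when the VM is started."""
        return "\n".join([
            "#!/bin/sh",
            # Configure CVMFS to fetch software through the site's squid proxy.
            f"echo 'CVMFS_HTTP_PROXY=\"{site.cvmfs_http_proxy}\"' >> /etc/cvmfs/default.local",
            "echo 'CVMFS_REPOSITORIES=lhcb.cern.ch' >> /etc/cvmfs/default.local",
            "cvmfs_config setup",
            # Tag the node with its logical site, then start the pilot.
            f"export DIRAC_SITE_NAME={site.site_name}",
            site.pilot_command,
            "shutdown -h now   # release the cloud slot once the pilot finishes",
        ])


    if __name__ == "__main__":
        site = CloudSite(
            site_name="CLOUD.Example.org",
            cvmfs_http_proxy="http://squid.example.org:3128",
            pilot_command="python dirac-pilot.py --debug",
        )
        print(build_user_data(site))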