Challenging data management in CMS computing with network-aware systems

After a successful first run at the LHC, and during the Long Shutdown (LS1) of the accelerator, the workload and data management sectors of the CMS Computing Model are entering an operational review phase in order to concretely assess areas of possible improvement and paths to exploit promising new technology trends. In particular, since the preparation activities for the LHC start-up, the networks have constantly been of paramount importance for the execution of CMS workflows, exceeding the original expectations of the MONARC model in terms of performance, stability and reliability. The low-latency transfer of petabytes of CMS data among dozens of WLCG Tiers worldwide using the PhEDEx dataset replication system is one example of the importance of reliable networks. Another example is the exploitation of WAN data access over data federations in CMS. An emerging area of work is the exploitation of Intelligent Network Services, including bandwidth-on-demand concepts. In this paper, we review the work done in CMS in these areas and outline the next steps.
