The CMS online cluster: IT for a large data acquisition and control cluster

The CMS online cluster consists of more than 2000 computers running about 10000 application instances. These applications implement the control of the experiment, the event building, the high level trigger, the online database, and the control of the buffering and transfer of data to the Central Data Recording at CERN. This paper reviews the IT solutions employed to fulfil the requirements of such a large cluster. Details are given on the chosen network structure, the configuration management system, the monitoring infrastructure, and the implementation of high availability for the services and infrastructure.
