Paper information - Performance benchmark of LHCb code on state-of-the-art x86 architectures

For Run 2 of the LHC, LHCb is replacing a significant part of its event filter farm with new compute nodes. To evaluate which solution performs best, we have developed a method to convert our high-level trigger application into a stand-alone, bootable benchmark image. With additional instrumentation, we turned it into a self-optimising benchmark that explores techniques such as late forking, NUMA balancing and the optimal number of threads; that is, it automatically optimises box-level performance. We have run this procedure on a wide range of Haswell-E CPUs and numerous other architectures from both Intel and AMD, including the latest Intel micro-blade servers. We present results in terms of performance, power consumption, overheads and relative cost.
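The idea of a benchmark that searches for its own best configuration can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`scan_worker_counts`, `toy_throughput`) and every number in the throughput model are hypothetical, chosen only to show how a scan over the number of worker threads or processes might pick the setting with the highest box-level throughput. In a real benchmark, the `measure` callable would launch the trigger application and return a measured event rate.

```python
from typing import Callable, Dict, Iterable, Tuple


def scan_worker_counts(
    measure: Callable[[int], float],
    candidates: Iterable[int],
) -> Tuple[int, Dict[int, float]]:
    """Measure throughput for each candidate worker count and return
    the best count together with all measurements."""
    results = {n: measure(n) for n in candidates}
    # On ties, max() keeps the first (smallest) count that reaches the peak.
    best = max(results, key=results.get)
    return best, results


def toy_throughput(n_workers: int, physical_cores: int = 8) -> float:
    """Hypothetical throughput model (events/s), invented for illustration:
    full gain per physical core, a smaller gain per hyper-thread,
    and a small penalty once the box is oversubscribed."""
    logical_cores = 2 * physical_cores
    if n_workers <= physical_cores:
        return 100.0 * n_workers
    if n_workers <= logical_cores:
        return 100.0 * physical_cores + 30.0 * (n_workers - physical_cores)
    peak = 100.0 * physical_cores + 30.0 * physical_cores
    return peak - 5.0 * (n_workers - logical_cores)


# Scan 1..24 workers on the hypothetical 8-core, 16-thread box.
best, results = scan_worker_counts(toy_throughput, range(1, 25))
print(f"best worker count: {best}, throughput: {results[best]} events/s")
```

With this toy model the scan settles on one worker per logical core. The same loop could be extended to other dimensions the abstract mentions, such as NUMA placement, by making each candidate a tuple of settings rather than a single integer.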

N. Neufeld | Rainer Schwemmer | D. Campora Perez
