Optimizing distributed file storage and processing engines for CERN's Large Hadron Collider using multi-criteria partitioned replication

Over the last decades, distributed file systems and processing engines have been the primary choice for applications requiring access to large amounts of data. Since the introduction of the MapReduce paradigm, relational databases have been increasingly replaced by more efficient and scalable architectures, particularly in environments where a single query is expected to process terabytes or even petabytes of data. That is the situation at CERN, where the data storage systems critical for the safe operation, exploitation and optimization of the particle accelerator complex are based on traditional database or file system solutions, which are already operating well beyond their initially provisioned capacity. Despite the efficiency of modern distributed data storage and processing engines in handling large amounts of data, they are not optimized for heterogeneous workloads such as those arising in the dynamic environment of one of the world's largest scientific facilities. This contribution presents a Mixed Partitioning Scheme Replication (MPSR) solution that outperforms the conventional distributed processing configurations used at CERN for virtually the entire parameter space of the accelerator monitoring systems' workload variations. Our main strategy is to replicate the data using a different partitioning scheme for each replica, where the individual partitioning criteria are dynamically derived from the observed workload. To assess the efficiency of this approach in a wide range of scenarios, a behavioral simulator has been developed to compare and analyze the performance of MPSR against the current solution. Furthermore, we present the first results of a Hadoop-based prototype running on a relatively small cluster, which not only validate the simulation predictions but also confirm the higher efficiency of the proposed technique.
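As a rough illustration of the replication strategy described above, the sketch below models replicas that hold the same data but are partitioned on different attributes, routes a query to the replica whose partitioning key matches the query's predicates, and periodically re-derives the keys from observed predicate frequencies. This is a minimal, hypothetical model in Python; the class names, attribute names, and routing heuristic are assumptions made for illustration and do not describe the actual Hadoop-based prototype.

```python
# Minimal sketch of the Mixed Partitioning Scheme Replication (MPSR) idea:
# each replica keeps the same data, but partitioned on a different attribute,
# and a query is routed to the replica whose scheme prunes the most partitions.
# All names (Query, Replica, the attribute keys) are hypothetical.

from dataclasses import dataclass


@dataclass
class Query:
    # Attributes the query filters on, e.g. {"time", "device"}.
    predicates: set


@dataclass
class Replica:
    # The attribute this replica's partitioning scheme is keyed on.
    partition_key: str
    # Share of recent workload queries that filtered on that key.
    workload_hit_rate: float = 0.0


def choose_replica(query: Query, replicas: list) -> Replica:
    """Route the query to a replica whose partitioning key matches one of
    its predicates; fall back to the historically best-matching replica."""
    matching = [r for r in replicas if r.partition_key in query.predicates]
    candidates = matching or replicas
    return max(candidates, key=lambda r: r.workload_hit_rate)


def rebalance_keys(replicas: list, observed_predicate_freq: dict) -> None:
    """Periodically re-derive each replica's partitioning key from the
    observed workload by assigning the most frequent predicates first."""
    ranked = sorted(observed_predicate_freq, key=observed_predicate_freq.get,
                    reverse=True)
    for replica, key in zip(replicas, ranked):
        replica.partition_key = key
        replica.workload_hit_rate = observed_predicate_freq[key]


# Example: three replicas, workload dominated by time-range queries.
replicas = [Replica("time"), Replica("device"), Replica("datatype")]
rebalance_keys(replicas, {"time": 0.6, "device": 0.3, "datatype": 0.1})
print(choose_replica(Query({"device"}), replicas).partition_key)  # -> "device"
```

In this toy model, a query filtering on "device" avoids the full scan it would incur on a time-partitioned replica, which is the effect the workload-driven partitioning criteria aim for; the real system derives and applies these criteria within the distributed storage layer rather than in a standalone router.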
