Preparations for the public release of high-level CMS data

Abstract The CMS Collaboration, in accordance with its commitment to open access and data preservation, is preparing for the public release of up to half of the reconstructed collision data collected in 2010. Current efforts focus on making the data usable in education. The data will be accompanied by example applications tailored for different levels of access, including ready-to-use web-based applications for histogramming and for visualising individual collision events, and a virtual machine image of the CMS software environment that is compatible with these data. The virtual machine image will contain instructions for using the data with the online applications as well as examples of simple analyses. The novelty of this initiative is two-fold: in terms of open science, it lies in releasing the data in a format suitable for analysis; from an outreach perspective, it lies in enabling people outside CMS to build educational applications using our public data. CMS will rely on services for data preservation and open access that are being prototyped at CERN with input from CMS and the other LHC experiments.