About CMS


The Compact Muon Solenoid (CMS) Experiment is one of the large particle detectors at CERN's Large Hadron Collider. The CMS Collaboration consists of more than 3000 scientists, engineers, technicians and students from more than 180 institutes and universities in more than 40 countries. More information about the design of the CMS detector, together with an overview of the experiment, is available on the official CMS website.

Usage instructions and suggestions for CMS Open Data are given in two detailed guides:

  • Guide to education use of CMS Open Data
  • Guide to research use of CMS Open Data

This page gives a brief overview of CMS Open Data contents:

  1. CMS Data and analysis tools
  2. Primary and simulated datasets
  3. Disclaimer
  4. Other CMS open data
  5. Policies

CMS Data and analysis tools

The following are provided through this portal:

  • Downloadable datasets
    • Primary datasets: fully reconstructed collision data with no other selections. These are referred to as "reconstructed data": fragmented data from the various sub-detectors are processed, or "reconstructed", to provide coherent information about individual physics objects such as electrons or particle jets.
    • Simulation data
    • Examples of simplified datasets derived from the primary ones for use in different applications and analyses
  • Tools
    • A downloadable Virtual Machine (VM) image with the CMS software environment through which the datasets can be accessed
    • An analysis example chain, reading the primary dataset and producing intermediate derived data for the final analysis
    • Downloadable Docker images with the CMS software environment
    • Ready-to-use online applications, such as an event display and simple histogramming software
    • Source code for the various examples and applications, available in the CMS software collection

Primary and simulated datasets

  • Collision data in the primary datasets are in a format known as AOD or Analysis Object Data, while simulated data are in a format called AODSIM.
  • AOD/AODSIM files contain the information that is needed for analysis:
    • all the high-level physics objects (such as muons, electrons, etc.);
    • tracks with associated hits, calorimetric clusters with associated hits, vertices; and
    • information about event selection (triggers), data needed for further selection and identification criteria for the physics objects.
  • A file is not a final event interpretation consisting of a simple list of particles.
    • It contains several instances of the same physics object (e.g. a jet reconstructed with different algorithms).
    • It may contain double-counting (e.g. a physics object may appear as a single object of its own type and also as part of a jet).
    • Additional knowledge is needed to define a "good" physics object.
    • The definition of the same object can differ from one analysis to another.
  • Some datasets, such as those containing heavy-ion data, are provided in a format called RECO, which contains more information than the AOD format. This is done when the original analyses by the CMS Collaboration were performed using this particular format.
  • Some simulated datasets are provided in the MiniAODSIM format, which is the format used in physics analysis starting from Run 2 (2015):
    • MiniAOD/MiniAODSIM is approximately one tenth of the size of AOD/AODSIM.
    • The reduction is obtained by defining light-weight physics-object candidate representations, increasing the transverse momentum thresholds for storing physics-object candidates, and reducing numerical precision where it is not required at the analysis level.
    • More information on the MiniAOD format:
      • "Mini-AOD: A New Analysis Data Format for CMS", G. Petrucciani, A. Rizzi, C. Vuosalo on behalf of the CMS Collaboration
      • MiniAOD analysis documentation
  • The files can be read in ROOT, but they cannot be opened (and understood) as simple data tables.
  • Only the runs that are validated by data quality monitoring should be used in any analysis. The list of the validated runs is provided.
  • A small sample of raw data is also provided.
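
The restriction to validated runs mentioned above can be applied as a simple lookup before any further event selection. The following is a minimal sketch, assuming the list of validated runs is a JSON document that maps each run number to a list of [first, last] luminosity-section ranges; this layout, the helper names and the run numbers are illustrative assumptions, not taken from a real validated-runs file.

```python
import json


def load_validated_runs(text):
    """Parse a validated-runs list.

    Assumed layout (an assumption of this sketch):
    {"<run number>": [[first_lumi, last_lumi], ...], ...}
    """
    raw = json.loads(text)
    return {int(run): [tuple(r) for r in ranges] for run, ranges in raw.items()}


def is_validated(validated, run, lumi):
    """Return True if (run, luminosity section) lies inside a validated range."""
    return any(first <= lumi <= last for first, last in validated.get(run, []))


# Hypothetical example content, not real data-quality information.
example = '{"160431": [[19, 218]], "160577": [[254, 306], [310, 320]]}'
validated = load_validated_runs(example)

# Hypothetical (run, luminosity section) pairs standing in for events.
events = [(160431, 20), (160431, 300), (160577, 308), (160577, 315), (999999, 1)]
selected = [(run, lumi) for run, lumi in events if is_validated(validated, run, lumi)]
print(selected)  # [(160431, 20), (160577, 315)]
```

In a real analysis this check would normally be done by the CMS software environment itself; the sketch only illustrates the logic of keeping events that fall inside validated ranges and discarding everything else.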

Disclaimer

  • The open data are released under the Creative Commons CC0 waiver. Neither CMS nor CERN endorses any works, scientific or otherwise, produced using these data, even if available on, or linked from, this portal.
  • Each dataset has a unique DOI, which you are requested to cite in any applications or publications.
  • Despite being processed, the high-level primary datasets remain complex and selection criteria need to be applied in order to analyse them, requiring some understanding of particle physics and detector functioning. The data cannot be viewed in simple data tables for spreadsheet-based analyses.
  • No further development is foreseen for either the data released or the software version needed to analyse them.
    • The methods have evolved since the released data were recorded.
    • More advanced techniques are used with recent data but the software is not compatible out-of-the-box with older data samples.
  • The release of 2010 data is accompanied by a small set of simulated data, and the release of 2011 data includes some simulated data. However, these are not a full set of simulations, but only those datasets that have been reprocessed with a software release compatible with the respective collision data. The release of 2012 data includes a larger sample of simulated data. Some simulated datasets are released with bibliographic information only; their content will be made available online on demand.
  • If you are interested in joining the CMS Collaboration, please contact the nearest CMS university or institute.

Other CMS open data

  • All CMS publications are open access.
  • Some of the papers also include open data in the form of additional tables, plots, graphs and Rivet packages.

Policies

  • Data preservation and open access policy
  • Papers by CMS members using public data [internal]

© CERN, 2014–2021