The CERN Open Data portal is the access point to a growing range of data produced through the research performed at CERN. It disseminates the preserved output from various research activities, including the accompanying software and documentation needed to understand and analyse the shared data.
The portal adheres to established global standards in data preservation and Open Science: the products are shared under open licenses and are issued with a digital object identifier (DOI), which makes them citable objects in the scientific discourse (see below for details on how to cite them).
Data produced by the LHC experiments are usually categorised into four different levels (DPHEP Study Group (2009)). The Open Data portal focuses on the release of data from levels 2 and 3.
All four LHC experiments have approved data preservation and access policies which state that they will make their data (except level 4 data) available. New data will enter the portal once their embargo periods are over. For detailed information regarding embargo periods, accessibility and preservation of LHC data, please refer to the experiments' data policies.
In support of these data policies, this portal publishes and preserves data from levels 2 and 3, such as simplified formats and fully reconstructed events, together with the associated software and documentation needed to access and use the data.
All datasets and other material available in this portal are minted with a persistent identifier, a so-called DOI (Digital Object Identifier), that allows permanent linking to the records. The CERN Open Data Portal endorses the FORCE11 Joint Declaration of Data Citation Principles. We therefore ask you to cite the data provided in the portal when you re-use them. To make this easier, we provide a citation recommendation for every dataset as well as output formats (e.g. BibTeX) for common reference programs. Citing datasets in the reference list of your paper allows other platforms such as INSPIRE to track citations to these datasets and measure their impact.
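As an illustration of what a BibTeX citation for a DOI-identified dataset can look like, here is a minimal sketch in Python. It is not the portal's own export code: the function name `bibtex_entry`, the record fields, and the DOI `10.1234/EXAMPLE.DOI` are all hypothetical placeholders. For real citations, use the citation recommendation shown on the dataset's record page.

```python
def bibtex_entry(key, record):
    """Format a (hypothetical) dataset record as a BibTeX @misc entry."""
    fields = [
        ("author", record["author"]),
        # Extra braces keep BibTeX styles from changing the title's capitalisation.
        ("title", "{" + record["title"] + "}"),
        ("year", str(record["year"])),
        ("publisher", record["publisher"]),
        ("doi", record["doi"]),
        # A DOI resolves to its record through the https://doi.org/ resolver.
        ("url", "https://doi.org/" + record["doi"]),
    ]
    body = ",\n".join(f"  {name} = {{{value}}}" for name, value in fields)
    return f"@misc{{{key},\n{body}\n}}"


# Placeholder metadata: none of these values refer to a real dataset.
example = {
    "author": "Example Collaboration",
    "title": "Example dataset title",
    "year": 2014,
    "publisher": "CERN Open Data Portal",
    "doi": "10.1234/EXAMPLE.DOI",
}

print(bibtex_entry("example_dataset", example))
```

Keeping the DOI in its own field, in addition to the resolver URL, makes it easier for citation-tracking services to match the reference to the dataset record.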
This portal is built around the following technologies:
Invenio is library management software that lets you run your own digital library, institutional repository, multimedia archive, or research data repository on the web. The technology offered by the software covers all aspects of digital library management, from document ingestion through classification, indexing, and curation up to document dissemination. The flexible data model uses JSON Schema to describe articles and various other types of media, and supports popular master metadata formats, such as MARC21, as its underlying bibliographic format. Invenio features a powerful search engine that handles repositories of up to several million records with ease. Being composed of a multitude of independent pluggable packages, the digital library framework provides the flexibility needed for a wide range of applications. Invenio is a strong advocate of open standards and open access and makes use of DOI, Memento, OAI-PMH, ORCID, OpenAIRE and many others.
CernVM is a baseline Virtual Software Appliance for the participants of CERN LHC experiments. The Appliance represents a complete, portable and easy-to-configure user environment for developing and running LHC data analysis locally and on institutional and commercial clouds (OpenStack, Amazon EC2, Google Compute Engine), independently of the operating system and hardware platform (Linux, Windows, macOS). The goal is to remove the need to install the experiment software and to minimise the number of platforms (compiler-OS combinations) on which experiment software needs to be supported and tested.
EOS is a disk-based service providing a low-latency storage infrastructure for physics users. EOS provides a highly scalable hierarchical namespace implementation. Data access is provided by the XROOT protocol.
The main target area for the service is physics data analysis use cases, often characterised by many concurrent users, a significant fraction of random data access and a large file open rate.
The portal is a joint effort of the CERN groups IT-CIS and RCS-SIS, in collaboration with many researchers in the High-Energy Physics community. For any request or submission, please email us at firstname.lastname@example.org