Q & A on Zenodo and EC Open Research Data

This blogpost is based on the write-up of the Q&A session after the webinar “Open Research Data in H2020 and Zenodo repository” on 26 October 2016. 

At OpenAIRE, we get a lot of questions about the Open Research Data Pilot that is run by the European Commission (this pilot will be extended to all H2020 projects starting in 2017!).

Based on the questions we received after our last training session on Research Data Management during Open Access Week 2016, we have compiled a Q&A document where we deal with all questions asked by the audience after the webinar. Many thanks to Marjan Grootveld (DANS) and Krzysztof Nowak (CERN) for hosting the webinar and for their detailed responses to the questions.

We will continue to organise trainings, in collaboration with other relevant projects and organisations such as EUDAT, DCC, Zenodo and OpenMinTed. Stay tuned for more updates!

This blogpost is a summary of that document – consider it a first aid kit where you will find answers to some of your more urgent questions about the ORD Pilot, Data Management Plans and Zenodo!   

Questions about the Open Research Data Pilot in Horizon 2020!


  • What kind of data should we be depositing?
    • You need to deposit at least the data that underlie your publications. The requirements for projects in the ORD Pilot are set out in article 29.3 of the Model Grant Agreement. In summary, projects must
      • 1) deposit in a research data repository and take measures to make it possible for third parties to access, mine, exploit, reproduce and disseminate — free of charge for any user — the following:
        • a. the data, including associated metadata, needed to validate the results presented in scientific publications as soon as possible;
        • b. other data, including associated metadata, as specified and within the deadlines laid down in the data management plan;
      • 2) provide information — via the repository — about tools and instruments at the disposal of the beneficiaries and necessary for validating the results (and — where possible — provide the tools and instruments themselves).
  • Can I opt out?
    • Participating in the ORD Pilot does not necessarily mean opening up all your research data. Rather, the ORD pilot follows the principle “as open as possible, as closed as necessary” and focuses on encouraging sound data management as an essential part of research best practice.  
    • According to the Guidelines for FAIR data management in Horizon 2020, the Commission provides robust opt-out possibilities: during the application phase, during the grant agreement preparation (GAP) phase, and after the signature of the grant agreement.
  • Who will pay for data management and accessibility after the project?
    • Costs associated with open access to research data can be claimed as eligible costs of any Horizon 2020 grant, according to the Guidelines.
    • Note however that:
      • 1) the budget must be granted and
      • 2) the reimbursement must be claimed during the duration of the project
    • For long-term availability of the data this implies that ideally you should contact a suitable data archive already in the proposal stage to find out what they may charge.

Furthermore, a rule of thumb – ignoring all differences between projects and disciplines – is to budget 5% of the project budget for data management activities.
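
As a quick worked example of this rule of thumb, the sketch below computes the data management share for a hypothetical project budget. The function name, the example budget and the default 5% share are illustrative assumptions; your own project may well need a different share.

```python
from decimal import Decimal

def data_management_budget(total_budget_eur, share=Decimal("0.05")):
    """Return the amount to reserve for data management.

    `share` defaults to the 5% rule of thumb; it is not an official rate.
    """
    return Decimal(total_budget_eur) * share

# Hypothetical project with a total budget of EUR 2,000,000.
reserved = data_management_budget(2_000_000)
print(f"Reserve about EUR {reserved:,.0f} for data management")
```

Remember that this amount has to be part of the granted budget and claimed while the project is still running.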

Questions about Data Management Plans!

  • A single DMP is expected, but what if I manage a large action with several research projects involving different data?
    • You can be as specific as you want in your DMP – it is possible to describe several projects and types of data in one DMP. You can describe what the projects have in common (e.g. a domain metadata schema) and where there are specific issues for individual projects and their datasets (e.g. regarding openness), you should clearly spell this out. All sorts of information can be addressed in a DMP!
  • Does the European Commission provide a template that we can use for our DMP?
  • Does the Commission plan to organise DMP trainings? Without having in-house expertise, it is difficult to have a proper DMP without guidance and support!


Questions about Zenodo!

  • How is Zenodo funded? Who pays? What does it offer? Is it sustainable?
    • Zenodo staff time is funded by OpenAIRE and CERN. The infrastructure is funded by CERN. OpenAIRE funding currently runs out in 2017. CERN has a five-year rolling plan; in the longer term we are looking at cloud-credit schemes with funders.
    • Zenodo provides long-term bit-level preservation. We do not provide format migration. Zenodo's internal workflows follow the Open Archival Information System reference model. Data in Zenodo is stored in CERN’s Data Center, in the same storage system as CERN's High Energy Physics data. Zenodo is in the process of preparing the Data Seal of Approval application. In addition, CERN is working towards ISO certification of the organisational and technical infrastructure which Zenodo relies on for the purpose of long-term preservation of High Energy Physics data.
  • Does the European Commission oblige projects to use Zenodo for research data, or can alternatives be used?
  • Why can’t I just use OpenAIRE to store all my research output?
    • H2020 allows you to deposit publications and data in any suitable repository. This can be an institutional, a subject-based or a generic repository like Zenodo. Zenodo is provided as a catch-all repository in case you do not have access to an existing repository. OpenAIRE, on the other hand, is not a repository but an aggregator. OpenAIRE harvests more than 6000 repositories and makes their metadata available, but the underlying digital files are still hosted in the original repository.
  • Let’s say I have uploaded a paper in Zenodo and then I upload a dataset relevant to this paper. Is there a way to link those two in Zenodo? What metadata is required?
    • Yes, you can link Zenodo records with any resource that has an identifier (this includes content outside of Zenodo) using the metadata field “Related/alternate identifiers”. The links are also available in our exported metadata in a machine-readable form. Taking a paper record and a data record that are both stored on Zenodo as an example, the paper record would specify the data record by its DOI with the relationship “is a supplement to this upload”. Similarly, the data record would specify the DOI of the paper with the relationship “is supplemented by this upload”.
    • In addition to the grant, you must provide basic bibliographic metadata such as title, authors and publication date, as well as the embargo period if applicable. However, it is essential to provide as much accurate metadata as possible, as rich metadata significantly improves the data’s findability and re-usability.
  • What levels of accessibility does Zenodo offer?
    • You cannot apply different levels of accessibility in one record.
    • Metadata for open, closed, embargoed and restricted records are always publicly available in Zenodo. Data files for restricted access records are visible only to their owners and to those to whom the owner grants access.
    • Restricted access allows a researcher to upload a dataset and state the conditions under which they grant access to the data. Researchers wishing to request access must provide a justification for how they fulfil these conditions. The owner of the dataset is notified of each new request, and can decide to either accept or reject it. If the request is accepted, the requestor receives a secret link which usually expires within 1-12 months.
  • What kind of data does Zenodo accept? I generate terabytes of data every week!
    • Bear in mind that storing data (during a project) is not the same as archiving data (near the end of a project): e.g. access rights, file formats and the persons or organisations which are responsible for data management may be different in these stages. “Created every week” sounds as if you might need a good storage solution such as EUDAT’s B2SAFE, plus selection criteria to help decide if all data (raw, processed and final) should be archived.
  • What about evolving research databases? We are aiming at creating an online database to be updated throughout the project.
    • You can upload snapshots of the live database to Zenodo. This is primarily useful in cases where you publish an article and need an exact copy of the database at a given point in time so you can cite it. Without the snapshot, your article will reference a database which might no longer support the findings of your article, as the data may have evolved.
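
To make the answer on linking a paper and its dataset more concrete, here is a minimal sketch of what the two metadata records could look like when prepared for Zenodo's REST API. It rests on several assumptions: the field names (`upload_type`, `related_identifiers`, `access_right`, `grants`) and relation values (`isSupplementTo`, `isSupplementedBy`) follow Zenodo's deposit metadata as we understand it, so please check them against the current API documentation. The DOIs, titles, names and grant number are placeholders, and nothing is sent to Zenodo.

```python
import json

# Placeholder DOIs: replace with the DOIs Zenodo assigns to your records.
PAPER_DOI = "10.5281/zenodo.1111111"
DATASET_DOI = "10.5281/zenodo.2222222"

def related(identifier, relation):
    """Build one entry for the 'Related/alternate identifiers' field."""
    return {"identifier": identifier, "relation": relation}

# The paper record points at the dataset that supplements it.
paper_metadata = {
    "upload_type": "publication",
    "publication_type": "article",
    "title": "Example paper",                 # placeholder
    "creators": [{"name": "Doe, Jane"}],      # placeholder
    "publication_date": "2016-10-26",
    "access_right": "open",
    "grants": [{"id": "000000"}],             # placeholder grant number
    "related_identifiers": [related(DATASET_DOI, "isSupplementedBy")],
}

# The dataset record points back at the paper it is a supplement to.
dataset_metadata = {
    "upload_type": "dataset",
    "title": "Data underlying the example paper",  # placeholder
    "creators": [{"name": "Doe, Jane"}],
    "publication_date": "2016-10-26",
    "access_right": "open",
    "grants": [{"id": "000000"}],
    "related_identifiers": [related(PAPER_DOI, "isSupplementTo")],
}

# The deposit API expects the metadata wrapped in a "metadata" key.
payload = json.dumps({"metadata": dataset_metadata}, indent=2)
print(payload)
```

Note that the API relation is read from the point of view of the record being described, whereas the labels in the upload form are phrased from the point of view of the related identifier.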

Gwen Franck

Open Access Programme Coordinator at EIFL - Electronic Information for Libraries / Open Access Project Officer at LIBER - Association of European Research Libraries
