Data management checklist



Creative Commons CC0

Numerous aspects of research data management should be considered and addressed at the beginning of the research. Use this checklist to follow best practices for your data management plan.



  • Who is responsible for which aspects of data management?
  • Are new skills required for certain activities?
  • Will you need additional resources to manage data such as staff, time or hardware?
  • Have you accounted for the cost of long-term data preservation and access?



  • Will others be able to understand, re-use and benefit from your data?
  • Are the abbreviations, codes and variables of your structured data self-explanatory?
  • Which contextual documentation explains what your data means, how it was collected, and the methods used to create it?
  • How will you label and organise data, records and files?
  • How will you ensure that your data is consistently catalogued?
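
One way to keep labels consistent is to generate file names from a fixed set of parts rather than typing them by hand. The sketch below is only an illustration: the function name build_filename, the order of the parts (project, description, date, version) and the separator characters are all assumptions, not a convention prescribed by this checklist.

```python
import re
from datetime import date


def build_filename(project: str, description: str, created: date,
                   version: int, extension: str) -> str:
    """Assemble a file name from fixed, ordered parts (hypothetical convention)."""

    def slug(text: str) -> str:
        # Lowercase the text and collapse every run of characters that is
        # not a letter or digit into a single hyphen.
        return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")

    # ISO 8601 dates (YYYY-MM-DD) sort chronologically as plain text,
    # and a zero-padded version number keeps v02 ahead of v10.
    return (
        f"{slug(project)}_{slug(description)}_"
        f"{created.isoformat()}_v{version:02d}."
        f"{extension.lstrip('.').lower()}"
    )


name = build_filename("Soil Survey", "Interview transcripts",
                      date(2024, 3, 1), 2, ".CSV")
print(name)
```

Because every name comes from the same function, files sort predictably and the version is always visible in the name itself.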



  • Do you use standardized and consistent procedures to collect, process, transcribe, check, validate and verify data, such as standard protocols, templates or input forms?
  • Which data formats will you use? Do the formats and software support sharing and long-term sustainability of the data, for example by being non-proprietary or based on open standards?
  • How do you ensure that no data, annotations, or internal metadata will be lost or changed when converting data between different formats?
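
The conversion question above can be partly answered with an automated comparison of the data before and after conversion. The following is a minimal sketch, assuming a small table converted from CSV to JSON. The helper names load_csv and conversion_report are hypothetical, and comparing values as text is a deliberate simplification that also catches changes in number formatting.

```python
import csv
import io
import json


def load_csv(text: str) -> list:
    """Read CSV text into a list of row dictionaries (all values are strings)."""
    return list(csv.DictReader(io.StringIO(text)))


def conversion_report(original: list, converted: list) -> list:
    """Return human-readable differences between two tables of rows."""
    problems = []
    if len(original) != len(converted):
        problems.append(f"row count changed: {len(original)} -> {len(converted)}")
    for index, (before, after) in enumerate(zip(original, converted)):
        if set(before) != set(after):
            problems.append(f"row {index}: column names differ")
        # Compare shared columns as text so that a silent change such as
        # "0.50" becoming 0.5 is reported rather than ignored.
        for key in sorted(before.keys() & after.keys()):
            if str(before[key]) != str(after[key]):
                problems.append(
                    f"row {index}, column {key!r}: {before[key]!r} -> {after[key]!r}"
                )
    return problems


original = load_csv("id,depth\n1,0.50\n2,1.25\n")
# Hypothetical converted file in which depth was stored as a number.
converted = json.loads('[{"id": "1", "depth": 0.5}, {"id": "2", "depth": 1.25}]')
for line in conversion_report(original, converted):
    print(line)
```

A check like this only covers the values themselves. Annotations and embedded metadata usually need a separate, format-specific comparison.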



  • Are your digital and non-digital data stored in multiple secure locations?
  • If data is stored in multiple locations, how will you track versions?
  • Do you need to ensure the secure storage of personal or sensitive information? How will you protect your data?
  • When collecting data with mobile devices, how will you transfer and store the data?
  • Are your files backed up regularly and are backups stored safely?
  • How will you know which version of your data files is the master?
  • Who has access to which data during and after the research? Is there a need for access restrictions? How will they be managed in the long term?
  • How long will you keep your data? Will you need to select which data should be kept and which should be destroyed?
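
When data is held in several locations, checksums offer a simple way to confirm that a backup still matches the master copy. The sketch below assumes that one directory is designated as the master and another is a copy of it. The function names and the returned dictionary layout are illustrative assumptions, not part of any particular backup tool.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 checksum of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict:
    """Map each file's path (relative to root) to its checksum."""
    return {
        path.relative_to(root).as_posix(): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def compare_to_master(master: Path, copy: Path) -> dict:
    """Report files that are missing from, changed in, or extra in the copy."""
    master_files = build_manifest(master)
    copy_files = build_manifest(copy)
    shared = master_files.keys() & copy_files.keys()
    return {
        "missing": sorted(set(master_files) - set(copy_files)),
        "changed": sorted(n for n in shared if master_files[n] != copy_files[n]),
        "extra": sorted(set(copy_files) - set(master_files)),
    }
```

Calling compare_to_master(Path("master"), Path("backup")) with two real directories returns three lists of relative paths. All three are empty when the backup matches the master exactly.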


Confidentiality, ethics and consent

  • Does your data contain sensitive or confidential information? If so, have you discussed data sharing with the respondents from whom you collected the data?
  • Do you have written consent from respondents to share data beyond your research?
  • Do you need to anonymize data, such as removing personal information, during the research process or in preparation to share data?
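
As a rough illustration of the anonymization question above, the sketch below removes direct identifiers and replaces a respondent ID with a keyed hash. The field names, the function pseudonymise and the 16-character token length are assumptions made for the example. Note that this is pseudonymization rather than full anonymization: indirect identifiers left in the data may still allow re-identification, and the secret key must be stored separately from the shared data.

```python
import hashlib
import hmac


def pseudonymise(records, secret_key: bytes, id_field: str, drop_fields):
    """Drop direct identifiers and replace the ID with a keyed-hash token."""
    cleaned = []
    for record in records:
        # HMAC with a secret key gives the same token for the same ID,
        # so records stay linkable, while the token cannot be recomputed
        # from the ID without the key.
        token = hmac.new(
            secret_key, str(record[id_field]).encode("utf-8"), hashlib.sha256
        ).hexdigest()[:16]
        kept = {
            key: value
            for key, value in record.items()
            if key not in drop_fields and key != id_field
        }
        kept["pseudonym"] = token
        cleaned.append(kept)
    return cleaned


# Hypothetical survey rows, used only to show the shape of the output.
rows = [
    {"respondent_id": "R001", "name": "Ada", "answer": "yes"},
    {"respondent_id": "R001", "name": "Ada", "answer": "no"},
    {"respondent_id": "R002", "name": "Bo", "answer": "yes"},
]
shared_rows = pseudonymise(rows, b"keep-this-key-private", "respondent_id", {"name"})
```

In shared_rows, both answers from the first respondent carry the same pseudonym, so analyses across records remain possible without exposing the name or the original ID.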



  • Who owns the copyright to your data? Could there be joint copyright?
  • What type of license is appropriate for sharing your data, and what restrictions may apply to reuse?
  • If you’re reusing other researchers’ data sources, have you considered how to share that data, such as negotiating a new license?


Data sharing

  • How and where will you preserve your research data for the long term?
  • Do you intend to make your data available for sharing? How will you choose which data should be shared and which should not?
  • When will you make your data available for re-use? Will you require an embargo period?
  • Are data quality assurance processes described?


Making data accessible

  • How will you make your data accessible to future users (e.g. by deposition in a repository)?
  • Have you explored appropriate arrangements with the identified repository?
  • Will the data produced and/or used in the project be discoverable through metadata, and identifiable and locatable by means of a standard identification mechanism (e.g. persistent and unique identifiers such as Digital Object Identifiers)?


Making data interoperable

  • Are the data produced in the project interoperable, that is, do they allow exchange and re-use between researchers, institutions, organisations and countries? In practice this means adhering to format standards, staying as compatible as possible with available (open) software applications, and in particular making it easy to combine the data with datasets from other origins.



Adapted from the UK Data Service and the ERC’s Horizon 2020 guidelines.