Data repositories are dedicated tools for depositing research data so that it can be easily accessed and cited by others via a dedicated DOI (digital object identifier). Researchers are strongly encouraged to always deposit data in a dedicated repository rather than as “supplementary materials” alongside publications, for two very important reasons:
Fortunately, it is typically free and very easy to deposit into a data repository. There are essentially two different types of repository:
General purpose repositories accept a wide range of file formats and typically have only a small number of metadata fields, so that deposit descriptors stay as general purpose as possible. OxShef: Publishers maintains an overview of the most popular general purpose repositories below.
General purpose repositories often have fairly large file size limits, are free to use and provide an API for programmatic access to data.
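
Programmatic access usually starts from the DOI itself, so a minimal Python sketch of that first step may be useful. It assumes the public `https://doi.org/` resolver and, for the metadata request, that the DOI's registration agency supports content negotiation for the CSL JSON media type. The function names are illustrative and are not part of any repository's API. The request is only constructed here, not sent.

```python
from urllib.parse import quote
from urllib.request import Request

# Assumption: the public DOI resolver; nothing here is specific to one repository.
DOI_RESOLVER = "https://doi.org/"


def doi_url(doi: str) -> str:
    """Return the resolver URL for a DOI, percent-encoding unsafe characters."""
    return DOI_RESOLVER + quote(doi.strip(), safe="/")


def metadata_request(doi: str) -> Request:
    """Build (but do not send) a request asking the resolver for citation metadata.

    Assumption: the registration agency behind the DOI honours content
    negotiation for the CSL JSON media type.
    """
    headers = {"Accept": "application/vnd.citationstyles.csl+json"}
    return Request(doi_url(doi), headers=headers)


# Placeholder DOI in the same form used in the table on this page.
request = metadata_request("10.6084/m9.figshare/xxx")
print(request.full_url)
```

Passing the resulting object to `urllib.request.urlopen` would perform the actual lookup. Each repository's own documented API remains the place to look for uploads, searches and file downloads.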
Specialist repositories are designed for specific types of deposit, for instance GIS, genomic, proteomic, meteorological or archaeological data. Datasets require highly detailed metadata to be added before they can be published, as users of these repositories expect to be able to filter or access datasets based on these fields.
Scientific Data maintains an excellent guide to specialist repositories, but you may also wish to refer to re3data.org, which is widely regarded as the most exhaustive collection of repositories (over 2,000).
Note that your home institution may require you to update its institutional repository with a “metadata record” containing the DOIs for your research outputs, including deposits with data repositories. Contact your local research support teams for advice.
Repository | Description | DOI Versioning Info | API info |
---|---|---|---|
Figshare | Figshare is a very popular general purpose data repository owned by Digital Science. Many publishers have partnered with Figshare so that publications and supplementary materials are automatically deposited there; examples include Scientific Data, PLoS and the American Chemical Society. OxShef: dataviz templates typically include instructions on how to connect to Figshare programmatically. | Individual DOIs support versioning: 10.6084/m9.figshare/xxx.v1, 10.6084/m9.figshare/xxx.v2. The canonical dataset is available at: 10.6084/m9.figshare/xxx | Fully documented API. Dedicated R package. Connects to GitHub repositories. |
OSF | OSF (Open Science Framework) is a fairly new general purpose data repository that offers functionality somewhat similar to an electronic lab notebook. OSF provides a variety of collaboration features and connections to services like Mendeley, Dropbox and AWS that are not offered by the other repositories in this table. | OSF does not currently support versioning (last updated February 2018). | Fully documented API. R package currently under development. Connects to GitHub repositories. |
Zenodo | Zenodo is a popular general purpose data repository housed at CERN. Zenodo has put a lot of effort into supporting a wide range of metadata fields, and is particularly good for depositing code. | Supports versioning as of late 2017. Versioning creates new DOIs, which are semantically linked. | Fully documented API. No maintained R package. Connects to GitHub repositories. |
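
The Figshare versioning pattern shown in the table (a `.v1`, `.v2` suffix on a canonical DOI) can be handled in code along the following lines. This is a sketch that assumes the suffix convention exactly as written in the table; the helper name `split_versioned_doi` is hypothetical, and the approach does not apply to Zenodo, where each version receives its own DOI.

```python
import re
from typing import Optional, Tuple

# Assumption: a versioned DOI is the canonical DOI followed by ".v<number>",
# as in the Figshare examples in the table above.
_VERSION_SUFFIX = re.compile(r"(?P<base>.+?)\.v(?P<version>\d+)")


def split_versioned_doi(doi: str) -> Tuple[str, Optional[int]]:
    """Split a DOI into (canonical DOI, version number or None)."""
    match = _VERSION_SUFFIX.fullmatch(doi.strip())
    if match is None:
        # No version suffix: the DOI is already the canonical one.
        return doi.strip(), None
    return match.group("base"), int(match.group("version"))


print(split_versioned_doi("10.6084/m9.figshare/xxx.v2"))
print(split_versioned_doi("10.6084/m9.figshare/xxx"))
```

Citing the versioned DOI pins readers to the exact files used in an analysis, whereas the canonical DOI points at the dataset as a whole.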