
Data Publications

last updated at 2022-05-19

Publishing research data allows others to access and use your data. Writing a manuscript can be time-consuming, and some researchers find the process tedious when they only want to publish selected data that they considered interesting or impactful during or after collection. Data publishing is an integral part of the open science movement. Its main goal, driven by a number of initiatives, is to establish data as first-class research outputs. This allows datasets to be cited like other research publication types, such as articles or books, so that producers of datasets gain academic credit for their work.

Benefits of data publications

The motivations for publishing data range from a desire to make research more accessible or to make datasets citable, to requirements for open data publishing set by research funders or publishers. Some scientists may feel uncomfortable about publishing their dataset, as it could allow others to take their work from the web and extract novel findings from it. However, most print-based science journals are available online nowadays, so the potential for such exploitation is already present. Additionally, solutions to preserve privacy in data publishing have been proposed, including privacy protection algorithms, data “masking” methods, and regional privacy level calculation algorithms. In general, the advantages of data publications prevail. Here is a list of some potential benefits you might get from publishing your dataset:

Criteria for data publications

There are several criteria to consider when publishing your dataset:

  1. Your data needs to be hosted in a repository to make it available to everyone. Various repositories have been developed to support data publication, including general-purpose repositories, e.g. Zenodo, and domain-specific data repositories.
  2. Your dataset needs to be well annotated, allowing other researchers to understand and reuse your data (see also metadata).
  3. Your dataset needs to be assigned a persistent identifier (PID), such as a DOI. This can be assigned directly by the repository or with the help of a publication service, such as Invenio. The identifier allows others to cite your dataset.
  4. If the publisher validates your data, your metadata annotation is reviewed to ensure comprehensibility. It is also possible to publish a data paper about the dataset, either as a preprint, in a journal, or in a data journal that is dedicated to supporting data papers. The data may be hosted by the journal or separately in a data repository.
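
The first three criteria above can be sketched as a small pre-publication checklist. This is only an illustration, not a DataPLANT tool: the `DatasetRecord` class, its field names, the helper functions, and the set of required metadata keys are all hypothetical, and the DOI check only tests the general shape of a DOI (a prefix starting with `10.`, a slash, and a suffix), not whether the identifier is actually registered.

```python
import re
from dataclasses import dataclass, field

# Loose shape check (assumption): "10.", a 4-9 digit registrant code, "/", a suffix.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")


@dataclass
class DatasetRecord:
    """Hypothetical description of a dataset; field names are illustrative only."""
    title: str
    repository: str = ""                          # criterion 1: where the data is hosted
    metadata: dict = field(default_factory=dict)  # criterion 2: annotation
    doi: str = ""                                 # criterion 3: persistent identifier


def publication_gaps(record, required_metadata=("creators", "description", "license")):
    """Return human-readable gaps; an empty list means all three checks passed."""
    gaps = []
    if not record.repository:
        gaps.append("no repository")
    missing = [key for key in required_metadata if not record.metadata.get(key)]
    if missing:
        gaps.append("missing metadata: " + ", ".join(missing))
    if not DOI_PATTERN.match(record.doi):
        gaps.append("no valid DOI")
    return gaps


def doi_link(doi):
    """Build the resolver link that others can use when citing the dataset."""
    return "https://doi.org/" + doi


if __name__ == "__main__":
    record = DatasetRecord(
        title="Example dataset",
        repository="Zenodo",
        metadata={"creators": ["A. Researcher"], "description": "Demo", "license": "CC-BY-4.0"},
        doi="10.1234/example.5678",  # made-up DOI, for illustration only
    )
    print(publication_gaps(record))
    print(doi_link(record.doi))
```

A record that passes these checks is not necessarily ready for publication; the fourth criterion, review by the publisher, remains a manual step.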

Data publishing

Figure 1: During publication, datasets are typically deposited in a repository to make them available, documented to support reproduction and reuse, and assigned an identifier to facilitate citation. Some, but not all, publishers review datasets to validate them.

Data papers & data journals

Data papers or data articles are “scholarly publications of a searchable metadata document describing a particular on-line accessible dataset, or a group of datasets, published in accordance to the standard academic practices”. The intent of a data paper is to offer descriptive information on the related dataset(s), focusing on data collection and distinguishing features rather than on data processing and analysis. Data papers therefore aim to answer questions like “What data was published?”, “How was the data collected?”, or “Who collected the data?”. As data papers are considered academic publications, just like other types of papers, they allow scientists who share data to receive credit, which raises the value of data sharing. This not only provides an additional incentive to share data, but also increases the metadata quality and reusability of the shared data.

Data papers are supported by a variety of journals. Some of these are “true” data journals, i.e. they are dedicated to publishing data papers only, while the majority are mixed journals that publish a number of article types, including data papers. A comprehensive list of data journals for different domains can be found here.

How does DataPLANT support me in Data publication?

The following table gives an overview of DataPLANT tools and services supporting you in data publishing. Follow the link in the first column for details.

| Name | Type | Tasks on metadata |
| --- | --- | --- |
| DataHUB | Service | Share: federated system to share ARCs; manage who can view or access your ARC |
| Invenio | Service (under construction) | Share: assign a DOI to an ARC |
| Metadata registry | Service (under construction) | Share: find ARC (meta)data |
| Converters | Tool (under construction) | Curate: harmonize and migrate between metadata schema |

Sources and further information

DataPLANT Support

Besides these technical solutions, DataPLANT supports you with community-engaged data stewardship. For further assistance, feel free to reach out via our helpdesk or by contacting us directly.