
1 Basic Information:

1.1 What is the project name or acronym?

1.2 Who is most likely to benefit from the data?

1.3 Other DMP Metadata

1.4 Please select from the following options

2. What kind of data will you handle?

2.1 Which endpoint repositories will you submit your data to?

3. How much data are you likely to generate?

Raw data (GB):

Derived data (GB):






4. Are any of the following standards relevant to your project?

4.1 Will you adhere to any high level metadata submission standards?

4.2 Project data will be published:

4.4 Will you follow national standards or archive data in national infrastructures?

5. Do you intend to use data visualization in your project?

























The project aim should be written as part of a sentence.

Example 1: aims at creating a computational model of carbon and water flow within a whole plant architecture


Example 2: aims at generating a data management plan with minimal effort and making the data as open as possible

The project object is the target (object) of the study.

Example 1: carbon and water flow in plants


Example 2: data management plan

Here is space for an additional sentence.

Example 1: Industry, politicians and students can also use the data for different purposes.


Example 2: The data acquired in the project can be used by a wide range of people with different purposes.

Information in this section is only used in the DMP metadata and does not appear in the document text.

Data officers are also known as data stewards or data curators.

Proprietary software: software that legally remains the property of the organization, group, or individual that created it.

User-defined template

You can click the dotted box to start editing.
Click the grey buttons to reuse templates.
Click Submit when you have finished.



Data Management Plan of the H2020 Project $_PROJECT


Action Number:

$_FUNDINGPROGRAMME

Action Acronym:

$_PROJECT

Action Title:

$_PROJECT

Creation Date:

$_CREATIONDATE

Modification Date:

$_MODIFICATIONDATE

DMP version:

$_DMPVERSION


1    Introduction

#if$_EU The $_PROJECT is part of the Open Data Initiative (ODI) of the EU. #endif$_EU To best profit from open data, it is necessary not only to store the data but to make it Findable, Accessible, Interoperable, and Reusable (FAIR).#if$_PROTECT We support open and FAIR data; however, we also consider the need to protect individual data sets. #endif$_PROTECT

The aim of this document is to provide guidelines on the principles of data management in the $_PROJECT and to specify which types of data will be stored. This is achieved by using the responses to the EU questionnaire on the Data Management Plan (DMP) as the DMP document.

The detailed DMP states how data will be handled during and after the project. The $_PROJECT DMP is prepared according to the Horizon 2020 and Horizon Europe online manuals. #if$_UPDATE It will be updated, and its validity checked, several times during the $_PROJECT project. At the very least, this will happen at month $_UPDATEMONTH. #endif$_UPDATE

2    Data Management Plan EU Template

2.1    Data Summary

What is the purpose of the data collection/generation and its relation to the objectives of the project?

The $_PROJECT has the following aim: $_PROJECTAIM. Therefore, data collection#if!$_VVISUALIZATION and integration #endif!$_VVISUALIZATION#if$_VVISUALIZATION, integration and visualization #endif$_VVISUALIZATION #if$_DATAPLANT using the DataPLANT ARC structure are absolutely necessary #endif$_DATAPLANT #if!$_DATAPLANT through a standardized data management process is absolutely necessary #endif!$_DATAPLANT because the data are used not only to understand principles, but also to inform stakeholders about the provenance of the data and of the analyses performed on them. It is therefore necessary to ensure that the data are well generated and well annotated with metadata using open standards, as laid out in the next section.

What types and formats of data will the project generate/collect?

The $_PROJECT will collect and/or generate the following types of raw data: $_GENETIC, $_GENOMIC, $_TRANSCRIPTOMIC, $_RNASEQ, $_METABOLOMIC, $_PROTEOMIC, $_PHENOTYPIC, $_TARGETED, $_IMAGE, $_MODELS, $_CODE, $_EXCEL, $_CLONED-DNA data, which are related to $_STUDYOBJECT. In addition, the raw data will also be processed and modified using analytical pipelines, which may yield different results or include ad hoc data analysis parts. #if$_DATAPLANT These pipelines will be tracked in the DataPLANT ARC.#endif$_DATAPLANT Therefore, care will be taken to document and archive these resources (including the analytical pipelines) as well#if$_DATAPLANT, relying on the expertise of the DataPLANT consortium#endif$_DATAPLANT.

Will you re-use any existing data and how?

The project builds on existing data sets and relies on them. #if$_RNASEQ|$_GENOMIC For example, without a proper genomic reference it is very difficult to analyze next-generation sequencing (NGS) data sets.#endif$_RNASEQ|$_GENOMIC It is also important to include existing data sets on the expression and metabolic behavior of the $_STUDYOBJECT, and on existing background knowledge#if$_PARTNERS of the partners#endif$_PARTNERS. Genomic references can be gathered from reference databases for genomes and sequences, such as the US National Center for Biotechnology Information (NCBI), the European Bioinformatics Institute (EBI), and the DNA Data Bank of Japan (DDBJ). Furthermore, prior 'unstructured' data in the form of publications and data contained therein will be used for decision making.

What is the origin of the data?

Public data will be extracted as described in the previous paragraph. For the $_PROJECT, specific data sets will be generated by the consortium partners.

Data of different types or representing different domains will be generated using unique approaches. For example:

#if$_PREVIOUSPROJECTS

Data from previous projects such as $_PREVIOUSPROJECTS will be considered.

#endif$_PREVIOUSPROJECTS

What is the expected size of the data?

We expect to generate $_RAWDATA GB of raw data and up to $_DERIVEDDATA GB of processed data.

To whom might it be useful ('data utility')?

The data will initially benefit the $_PROJECT partners, but will also be made available to selected stakeholders closely involved in the project, and then the scientific community working on $_STUDYOBJECT. $_DATAUTILITY In addition, the general public interested in $_STUDYOBJECT can also use the data after publication. The data will be disseminated according to the $_PROJECT's dissemination and communication plan#if$_DATAPLANT, which aligns with the DataPLANT platform and other means#endif$_DATAPLANT.

2.2    FAIR data

Making data findable, including provisions for metadata

Are the data produced and/or used in the project discoverable with metadata, identifiable and locatable by means of a standard identification mechanism (e.g. persistent and unique identifiers such as Digital Object Identifiers)?

All datasets will be associated with unique identifiers and will be annotated with metadata. We will use the Investigation, Study, Assay (ISA) specification for metadata creation. The $_PROJECT will rely on community standards plus additional recommendations applicable in plant science, such as the #if$_PHENOTYPIC #if$_MIAPPE MIAPPE (Minimum Information About a Plant Phenotyping Experiment),#endif$_MIAPPE #endif$_PHENOTYPIC #if$_GENOMIC|$_GENETIC #if$_MIXS MIxS (Minimum Information about any (X) Sequence),#endif$_MIXS #if$_MIGSEU MigsEu (Minimum Information about a Genome Sequence: Eukaryote),#endif$_MIGSEU #if$_MIGSORG MigsOrg (Minimum Information about a Genome Sequence: Organelle),#endif$_MIGSORG #if$_MIMS MIMS (Minimum Information about a Metagenome Sequence (environmental)),#endif$_MIMS #if$_MIMARKSSPECIMEN MIMARKSSpecimen (Minimum Information about a Marker Gene Sequence: Specimen),#endif$_MIMARKSSPECIMEN #if$_MIMARKSSURVEY MIMARKSSurvey (Minimum Information about a Marker Gene Sequence: Survey),#endif$_MIMARKSSURVEY #if$_MISAG MISAG (Minimum Information about a Single Amplified Genome),#endif$_MISAG #if$_MIMAG MIMAG (Minimum Information about a Metagenome-Assembled Genome),#endif$_MIMAG #endif$_GENOMIC|$_GENETIC #if$_TRANSCRIPTOMIC #if$_MINSEQE MINSEQE (Minimum Information about a high-throughput SEQuencing Experiment),#endif$_MINSEQE #endif$_TRANSCRIPTOMIC #if$_TRANSCRIPTOMIC #if$_MIAME MIAME (Minimum Information About a Microarray Experiment),#endif$_MIAME #endif$_TRANSCRIPTOMIC #if$_IMAGE #if$_REMBI REMBI (Recommended Metadata for Biological Images),#endif$_REMBI #endif$_IMAGE #if$_PROTEOMIC #if$_MIAPE MIAPE (Minimum Information About a Proteomics Experiment),#endif$_MIAPE #if$_MIMIX MIMIx (Minimum Information about a Molecular Interaction eXperiment),#endif$_MIMIX #endif$_PROTEOMIC These specific standards, unlike cross-domain minimal sets such as Dublin Core (which mostly define the submitter and the general type of data), allow reuse by other researchers by defining properties of the plant material (see the preceding section). However, minimal cross-domain annotations #if$_DUBLINCORE Dublin Core,#endif$_DUBLINCORE #if$_MARC21 MARC 21,#endif$_MARC21 also remain part of the $_PROJECT. #if$_DATAPLANT The core integration with DataPLANT will also allow individual releases to be tagged with a Digital Object Identifier (DOI). #endif$_DATAPLANT #if$_OTHERSTANDARDS Other standards such as $_OTHERSTANDARDINPUT are also adhered to. #endif$_OTHERSTANDARDS
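To illustrate how ISA-style metadata can be captured alongside a dataset, the following minimal sketch (Python, standard library only) builds an Investigation/Study/Assay record and writes it to a JSON file. All identifiers, titles, and file paths are hypothetical placeholders and the field names are illustrative only, not the normative ISA-Tab/ISA-JSON schema or a DataPLANT template.

import json

# Minimal ISA-style record (sketch only; field names are illustrative).
investigation = {
    "identifier": "INV-EXAMPLE-001",           # hypothetical identifier
    "title": "Carbon and water flow in plants",
    "studies": [
        {
            "identifier": "STUDY-001",
            "description": "Drought stress time course",
            "assays": [
                {
                    "identifier": "ASSAY-RNASEQ-001",
                    "measurement_type": "transcription profiling",
                    "technology_type": "RNA-Seq",
                    "data_files": ["rawdata/sample_01_R1.fastq.gz"],  # hypothetical path
                }
            ],
        }
    ],
}

# Write the metadata record next to the data it describes.
with open("isa_metadata.json", "w", encoding="utf-8") as handle:
    json.dump(investigation, handle, indent=2)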

What naming conventions do you follow?

Data variables will be allocated standard names. For example, genes, proteins and metabolites will be named according to approved nomenclature and conventions. These will also be linked to functional ontologies where possible. Datasets will also be named in a meaningful way to ensure readability by humans. Plant names will include traditional names, binomials, and all strain/cultivar/subspecies/variety identifiers.
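As an illustration of such a convention, the short sketch below assembles human-readable dataset names from the organism, cultivar, assay type, date, and version. The exact pattern is a hypothetical example, not a convention mandated by the $_PROJECT or by any repository.

from datetime import date

def dataset_name(genus: str, species: str, cultivar: str,
                 assay: str, version: int, when: date) -> str:
    """Build a human-readable dataset name, e.g.
    'Hordeum_vulgare_cv-Barke_RNASeq_2024-05-01_v1' (hypothetical pattern)."""
    parts = [genus, species, f"cv-{cultivar}", assay,
             when.isoformat(), f"v{version}"]
    # Replace spaces so the name stays file-system friendly.
    return "_".join(p.replace(" ", "-") for p in parts)

print(dataset_name("Hordeum", "vulgare", "Barke", "RNASeq", 1, date(2024, 5, 1)))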

Will search keywords be provided that optimize possibilities for re-use?

Keywords about the experiment and the consortium will be included, as well as an abstract about the data, where useful. In addition, certain keywords can be auto-generated from dense metadata and its underlying ontologies. #if$_DATAPLANT Here, DataPLANT strives to complement existing ontologies with standardized DataPLANT ontology terms where an ontology does not yet include the required variables. #endif$_DATAPLANT

Do you provide clear version numbers?

To maintain data integrity and facilitate reanalysis, data sets will be allocated version numbers where this is useful (e.g. raw data is considered immutable, must not be changed, and therefore does not receive a version number). #if$_DATAPLANT This is automatically supported by the Git-based ARC infrastructure of DataPLANT. #endif$_DATAPLANT

What metadata will be created? In case metadata standards do not exist in your discipline, please outline what type of metadata will be created and how.

We will use the Investigation, Study, Assay (ISA) specification for metadata creation. #if$_RNASEQ|$_GENOMIC For specific data (e.g., RNASeq or genomic data), we use metadata templates from the end-point repositories. #if$_MINSEQE The Minimum Information about a high-throughput SEQuencing Experiment (MINSEQE) will also be used. #endif$_MINSEQE #endif$_RNASEQ|$_GENOMIC The following metadata/minimum information standards will be used to collect metadata: #if$_GENOMIC|$_GENETIC #if$_MIXS MIxS (Minimum Information about any (X) Sequence),#endif$_MIXS #if$_MIGSEU MigsEu (Minimum Information about a Genome Sequence: Eukaryote),#endif$_MIGSEU #if$_MIGSORG MigsOrg (Minimum Information about a Genome Sequence: Organelle),#endif$_MIGSORG #if$_MIMS MIMS (Minimum Information about a Metagenome Sequence (environmental)),#endif$_MIMS #if$_MIMARKSSPECIMEN MIMARKSSpecimen (Minimum Information about a Marker Gene Sequence: Specimen),#endif$_MIMARKSSPECIMEN #if$_MIMARKSSURVEY MIMARKSSurvey (Minimum Information about a Marker Gene Sequence: Survey),#endif$_MIMARKSSURVEY #if$_MISAG MISAG (Minimum Information about a Single Amplified Genome),#endif$_MISAG #if$_MIMAG MIMAG (Minimum Information about a Metagenome-Assembled Genome),#endif$_MIMAG #endif$_GENOMIC|$_GENETIC #if$_TRANSCRIPTOMIC #if$_MINSEQE MINSEQE (Minimum Information about a high-throughput SEQuencing Experiment),#endif$_MINSEQE #endif$_TRANSCRIPTOMIC #if$_TRANSCRIPTOMIC #if$_MIAME MIAME (Minimum Information About a Microarray Experiment),#endif$_MIAME #endif$_TRANSCRIPTOMIC #if$_IMAGE #if$_REMBI REMBI (Recommended Metadata for Biological Images),#endif$_REMBI #endif$_IMAGE #if$_PROTEOMIC #if$_MIAPE MIAPE (Minimum Information About a Proteomics Experiment),#endif$_MIAPE #if$_MIMIX MIMIx (Minimum Information about a Molecular Interaction eXperiment),#endif$_MIMIX #endif$_PROTEOMIC #if$_METABOLOMIC #if$_METABOLIGHTS MetaboLights submission-compliant standards will be used for metabolomic data where this is accepted by the consortium partners.#issuewarning Some metabolomics partners consider MetaboLights not an accepted standard.#endissuewarning #endif$_METABOLIGHTS #endif$_METABOLOMIC As part of the plant research community, we use #if$_MIAPPE MIAPPE for phenotyping data in the broadest sense, but we will also rely on #endif$_MIAPPE specific SOPs for additional annotations#if$_DATAPLANT that consider advanced DataPLANT annotation and ontologies#endif$_DATAPLANT.

Making data openly accessible

Which data produced and/or used in the project will be made openly available as the default? If certain datasets cannot be shared (or need to be shared under restrictions), explain why, clearly separating legal and contractual reasons from voluntary restrictions.

By default, all data sets from the $_PROJECT will be shared with the community and made openly available. However, before the data are released, all partners will be given the opportunity to check for potential intellectual property (IP) issues (according to the consortium agreement and background IP rights). #if$_INDUSTRY This applies in particular to data pertaining to the industry partners. #endif$_INDUSTRY IP protection will be prioritized for datasets that offer the potential for exploitation.

Note that in multi-beneficiary projects it is also possible for specific beneficiaries to keep their data closed if relevant provisions are made in the consortium agreement and are in line with the reasons for opting out.

How will the data be made accessible (e.g. by deposition in a repository)?

Data will be made available via the $_PROJECT platform using a user-friendly front end that allows data visualization. In addition, it will be ensured that data that can be stored in international, discipline-specific repositories are submitted to these repositories, which use specialized technologies:

#if$_GENETIC For genetic data: #if$_GENBANK NCBI-GenBank,#endif$_GENBANK #if$_SRA NCBI-SRA,#endif$_SRA #if$_ENA EBI-ENA,#endif$_ENA #if$_ARRAYEXPRESS EBI-ArrayExpress,#endif$_ARRAYEXPRESS #if$_GEO NCBI-GEO,#endif$_GEO #endif$_GENETIC

#if$_TRANSCRIPTOMIC For Transcriptomic data: #if$_SRA NCBI-SRA,#endif$_SRA #if$_GEO NCBI-GEO,#endif$_GEO #if$_ARRAYEXPRESS EBI-ArrayExpress,#endif$_ARRAYEXPRESS #endif$_TRANSCRIPTOMIC

#if$_IMAGE For image data: #if$_BIOIMAGE EBI-BioImage Archive,#endif$_BIOIMAGE #if$_IDR IDR (Image Data Resource),#endif$_IDR #endif$_IMAGE

#if$_METABOLOMIC For metabolomic data: #if$_METABOLIGHTS EBI-MetaboLights,#endif$_METABOLIGHTS #if$_METAWORKBENCH Metabolomics Workbench,#endif$_METAWORKBENCH #if$_INTACT IntAct (molecular interactions),#endif$_INTACT #endif$_METABOLOMIC

#if$_PROTEOMIC For proteomics data: #if$_PRIDE EBI-PRIDE,#endif$_PRIDE #if$_PDB PDB (Protein Data Bank archive),#endif$_PDB #if$_CHEBI ChEBI (Chemical Entities of Biological Interest),#endif$_CHEBI #endif$_PROTEOMIC

#if$_PHENOTYPIC For phenotypic data: #if$_edal e!DAL-PGP (Plant Genomics & Phenomics Research Data Repository) #endif$_edal #endif$_PHENOTYPIC

#if$_OTHEREP and $_OTHEREP will also be used to store data and the data will be processed there as well.#endif$_OTHEREP

Unstructured and less standardized data (e.g., experimental phenotypic measurements) will be annotated with metadata and, if complete, allocated a digital object identifier (DOI). #if$_DATAPLANT Whole datasets will also be wrapped into an ARC with allocated DOIs. The ARC and the converters provided by DataPLANT will ensure that the upload into the endpoint repositories is fast and easy. #endif$_DATAPLANT

What methods or software tools are needed to access the data?

#if$_PROPRIETARY The $_PROJECT relies on the tool(s) $_PROPRIETARY. #endif$_PROPRIETARY

#if!$_PROPRIETARY No specialized software will be needed to access the data, just a modern browser. Access will be possible through web interfaces. For data processing after obtaining raw data, typical open-source software can be used. #endif!$_PROPRIETARY

#if$_DATAPLANT DataPLANT offers tools such as the open-source SWATE plugin for Excel, the ARC Commander, and DataPLAN. #endif$_DATAPLANT

Is documentation about the software needed to access the data included?

#if$_DATAPLANT DataPLANT resources are well described, and their setup is documented on the GitHub project pages. #endif$_DATAPLANT All external software documentation will be duplicated locally and stored near the software.

Is it possible to include the relevant software (e.g. in open-source code)?

As stated above, the $_PROJECT will use publicly available open-source and well-documented certified software #if$_PROPRIETARY except for $_PROPRIETARY #endif$_PROPRIETARY.

Where will the data and associated metadata, documentation and code be deposited? Preference should be given to certified repositories that support open access, where possible.

As noted above, specialized repositories will be used for common data types. Unstructured and less standardized data (e.g., experimental phenotypic measurements) will be annotated with metadata and, if complete, allocated a digital object identifier (DOI). #if$_DATAPLANT Whole datasets will also be wrapped into an ARC with allocated DOIs. #endif$_DATAPLANT

Have you explored appropriate arrangements with the identified repository?

Submission is free of charge, and it is the goal (at least of ENA) to obtain as much data as possible. Therefore, special arrangements are neither necessary nor useful, and catch-all repositories are not required. #if$_DATAPLANT This has been confirmed for data associated with DataPLANT. #endif$_DATAPLANT #issuewarning If no data management platform such as DataPLANT is used, you need to find an appropriate repository to store or archive your data after publication. #endissuewarning

If there are restrictions on use, how will access be provided?

There are no restrictions beyond the IP screening described above, which is in line with European open data policies.

Is there a need for a data access committee?

There is no need for a data access committee.

Are there well described conditions for access (i.e. a machine-readable license)?

Yes, where possible; e.g., the Creative Commons Rights Expression Language (CC REL) will be used for data not submitted to specialized repositories such as ENA.
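As a sketch of what such a machine-readable licence statement could look like, the snippet below emits a small JSON-LD record using ccREL terms (Python, standard library only). The dataset URL, title, and attribution details are hypothetical placeholders and the exact record shape is illustrative, not a form prescribed by the $_PROJECT or by Creative Commons.

import json

# ccREL-style licence metadata serialized as JSON-LD (illustrative sketch only).
license_record = {
    "@context": {"cc": "http://creativecommons.org/ns#",
                 "dct": "http://purl.org/dc/terms/"},
    "@id": "https://example.org/datasets/phenotyping-2024",  # hypothetical dataset URL
    "dct:title": "Phenotyping time course 2024",             # hypothetical title
    "cc:license": "https://creativecommons.org/licenses/by/4.0/",
    "cc:attributionName": "$_PROJECT consortium",
    "cc:attributionURL": "https://example.org",               # hypothetical URL
}

with open("license.jsonld", "w", encoding="utf-8") as handle:
    json.dump(license_record, handle, indent=2)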

How will the identity of the person accessing the data be ascertained?

Where data are shared only within the consortium, because the datasets are not yet finished or are undergoing IP checks, the data will be hosted internally and a username and password will be required for access (see GDPR rules). When the data are made public in EU or US repositories, completely anonymous access is normally allowed. This is the case for ENA as well, and both approaches are in line with GDPR requirements.

#if$_DATAPLANT Currently, data management relies on the annotated research context (ARC). It is password protected, so before any data or samples can be obtained, user authentication is required. #endif$_DATAPLANT

Making data interoperable

Are the data produced in the project interoperable, that is allowing data exchange and re-use between researchers, institutions, organizations, countries, etc. (i.e. adhering to standards for formats, as much as possible compliant with available (open) software applications, and in particular facilitating re-combinations with different datasets from different origins)?

Whenever possible, data will be stored in common and openly defined formats including all the necessary metadata to interpret and analyze data in a biological context. By default, no proprietary formats will be used. However, Microsoft Excel files (according to ISO/IEC 29500-1:2016) might be used as intermediates by the consortium#if$_DATAPLANT and by some ARC components#endif$_DATAPLANT. In addition, text documents might be edited in word-processor formats, but will be shared as PDF.
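Where Excel files are used as intermediates, they can be exported to an open, text-based format before sharing. The sketch below illustrates this, assuming pandas (with the openpyxl engine) is available and using a hypothetical workbook name; it converts every sheet of the workbook to a separate CSV file.

import pandas as pd  # requires pandas and openpyxl to be installed

# Read all sheets of a (hypothetical) Excel intermediate into a dict of DataFrames.
sheets = pd.read_excel("intermediate_measurements.xlsx", sheet_name=None)

# Write each sheet to an open CSV file for sharing.
for sheet_name, frame in sheets.items():
    frame.to_csv(f"intermediate_measurements_{sheet_name}.csv", index=False)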

What data and metadata vocabularies, standards or methodologies will you follow to make your data interoperable?

As noted above, we foresee using minimal standards such as #if$_RNASEQ|$_GENOMIC #if$_MINSEQE MINSEQE for sequencing data and #endif$_MINSEQE #endif$_RNASEQ|$_GENOMIC MetaboLights-compatible forms for metabolites #if$_MIAPPE and MIAPPE for phenotyping-like data #endif$_MIAPPE. The minimal information standards will allow the integration of data across projects and their reuse according to established and tested protocols. We will also use ontological terms to enrich the data sets, relying on free and open ontologies where possible. Additional ontology terms might be created and canonized during the $_PROJECT.

Will you be using standard vocabularies for all data types present in your data set, to allow inter-disciplinary interoperability?

Open ontologies will be used where they are mature. As stated above, some ontologies and controlled vocabularies might need to be extended. #if$_DATAPLANT Here, the $_PROJECT will build on the advanced ontologies developed in DataPLANT. #endif$_DATAPLANT

In case it is unavoidable that you use uncommon or generate project specific ontologies or vocabularies, will you provide mappings to more commonly used ontologies?

Common and open ontologies will be used, so this question does not apply.

Increase data reuse (by clarifying licences)

How will the data be licensed to permit the widest re-use possible?

Open licenses, such as Creative Commons (CC), will be used whenever possible.

When will the data be made available for re-use? If an embargo is sought to give time to publish or seek patents, specify why and how long this will apply, bearing in mind that research data should be made available as soon as possible.

#if$_early Some raw data will be made public as soon as it is collected and processed.#endif$_early #if$_beforepublication Relevant processed datasets will be made public when the research findings are published.#endif$_beforepublication #if$_endofproject At the end of the project, all data without an embargo period will be published.#endif$_endofproject #if$_embargo Data that is subject to an embargo period will not be publicly accessible until the end of the embargo period.#endif$_embargo #if$_request Data will be made available upon request, allowing controlled sharing while ensuring responsible use.#endif$_request #if$_ipissue IP issues will be checked before publication. #endif$_ipissue All consortium partners will be encouraged to make data available before publication, openly and/or under pre-publication agreements#if$_GENOMIC, such as those started in Fort Lauderdale and set forth by the Toronto International Data Release Workshop#endif$_GENOMIC. This will be implemented as soon as IP-related checks are complete.

Are the data produced and/or used in the project usable by third parties, in particular after the end of the project? If the re-use of some data is restricted, explain why.

There will be no restrictions once the data are made public.

How long is it intended that the data remains re-usable?

The data will be made available for many years#if$_DATAPLANT and ideally indefinitely after the end of the project#endif$_DATAPLANT.

Data submitted to repositories (as detailed above), e.g. ENA or PRIDE, will be subject to the data storage regulations of those repositories.

Are data quality assurance processes described?

The data will be checked and curated. #if$_DATAPLANT Furthermore, data will be quality controlled (QC) using automatic procedures as well as manual curation #endif$_DATAPLANT.

2.3    Allocation of resources

What are the costs for making data FAIR in your project?

The $_PROJECT will bear the costs of data curation, #if$_DATAPLANT ARC consistency checks, #endif$_DATAPLANT and data maintenance/security before transfer to public repositories. Subsequent costs are then borne by the operators of these repositories.

Additionally, costs for post-publication storage are incurred by the end-point repositories (e.g. ENA); these costs are not charged to the $_PROJECT or its members but are covered by the operating budgets of these repositories.

How will these be covered? Note that costs related to open access to research data are eligible as part of the Horizon 2020 or Horizon Europe grant (if compliant with the Grant Agreement conditions).

The costs borne by the $_PROJECT are covered by the project funding. Pre-existing structures #if$_DATAPLANT such as structures, tools, and knowledge laid down in the DataPLANT consortium#endif$_DATAPLANT will also be used.

Who will be responsible for data management in your project?

The responsible person will be $_DATAOFFICER of the $_PROJECT.

Are the resources for long term preservation discussed (costs and potential value, who decides and how/what data will be kept and for how long)?

The data officer #if$_PARTNERS or $_PARTNERS #endif$_PARTNERS will ultimately decide on the strategy to preserve data that are not submitted to end-point subject-area repositories #if$_DATAPLANT or to ARCs in DataPLANT #endif$_DATAPLANT when the project ends. This will be in line with EU guidelines, institute policies, and data sharing based on EU and international standards.

2.4    Data security

What provisions are in place for data security (including data recovery as well as secure storage and transfer of sensitive data)?

Online platforms will be protected by vulnerability scanning, two-factor authentication and daily automatic backups allowing immediate recovery. All partners holding confidential project data are required to use secure platforms with automatic backups and offsite secure copies. #if$_DATAPLANT Where the DataHUB and ARCs of DataPLANT are used, data security is enforced as well. This comprises secure storage; usernames and passwords are generally transferred via separate, secure media.#endif$_DATAPLANT

Is the data safely stored in certified repositories for long term preservation and curation?

Data will be made available via the $_PROJECT platform using a user-friendly front end that allows data visualization. In addition, it will be ensured that data that can be stored in international, discipline-specific repositories are submitted to these repositories, which use specialized technologies:

#if$_GENETIC For genetic data: #if$_GENBANK NCBI-GenBank,#endif$_GENBANK #if$_SRA NCBI-SRA,#endif$_SRA #if$_ENA EBI-ENA,#endif$_ENA #if$_ARRAYEXPRESS EBI-ArrayExpress,#endif$_ARRAYEXPRESS #if$_GEO NCBI-GEO,#endif$_GEO #endif$_GENETIC

#if$_TRANSCRIPTOMIC For Transcriptomic data: #if$_SRA NCBI-SRA,#endif$_SRA #if$_GEO NCBI-GEO,#endif$_GEO #if$_ARRAYEXPRESS EBI-ArrayExpress,#endif$_ARRAYEXPRESS #endif$_TRANSCRIPTOMIC

#if$_IMAGE For image data: #if$_BIOIMAGE EBI-BioImage Archive,#endif$_BIOIMAGE #if$_IDR IDR (Image Data Resource),#endif$_IDR #endif$_IMAGE

#if$_METABOLOMIC For metabolomic data: #if$_METABOLIGHTS EBI-MetaboLights,#endif$_METABOLIGHTS #if$_METAWORKBENCH Metabolomics Workbench,#endif$_METAWORKBENCH #if$_INTACT IntAct (molecular interactions),#endif$_INTACT #endif$_METABOLOMIC

#if$_PROTEOMIC For proteomics data: #if$_PRIDE EBI-PRIDE,#endif$_PRIDE #if$_PDB PDB (Protein Data Bank archive),#endif$_PDB #if$_CHEBI ChEBI (Chemical Entities of Biological Interest),#endif$_CHEBI #endif$_PROTEOMIC

#if$_PHENOTYPIC For phenotypic data: #if$_edal e!DAL-PGP (Plant Genomics & Phenomics Research Data Repository) #endif$_edal #endif$_PHENOTYPIC

#if$_OTHEREP and $_OTHEREP will also be used to store data and the data will be processed there as well.#endif$_OTHEREP

2.5    Ethical aspects

Are there any ethical or legal issues that can have an impact on data sharing? These can also be discussed in the context of an ethics review. If relevant, include references to ethics deliverables and ethics chapter in the Description of the Action (DoA).

At the moment, we do not anticipate ethical or legal issues with data sharing. In terms of ethics, since this is plant data, there is no need for an ethics committee to deal with data from plants, although we will diligently follow the Nagoya Protocol on access and benefit sharing. #issuewarning You have to check and document any due diligence here. At the moment it is still open whether the Nagoya Protocol will also cover sequence information. In any case, if you use material that does not come from your (partner) country and characterize it physically or biochemically (e.g., metabolites, proteome, RNASeq), this might represent a Nagoya-relevant action, unless the material comes from, e.g., the US (not a party) or Ireland (not signed; still contact them), although other laws might apply. #endissuewarning

Is informed consent for data sharing and long term preservation included in questionnaires dealing with personal data?

The only personal data that will potentially be stored is the submitter name and affiliation in the metadata for the data. In addition, personal data will be collected for dissemination and communication activities using specific methods and procedures developed by the $_PROJECT partners to adhere to data protection rules. #issuewarning You need to inform the persons concerned, and preferably obtain WRITTEN consent, that you store e-mail addresses and names or even pseudonyms such as Twitter handles. We are sorry about these issues; we did not invent them. #endissuewarning

2.6    Other issues

Do you make use of other national/funder/sectorial/departmental procedures for data management? If yes, which ones?

Yes, the $_PROJECT will use common Research Data Management (RDM) tools such as #if$_DATAPLANT|$_NFDI resources developed by the NFDI of Germany,#endif$_DATAPLANT|$_NFDI #if$_FRENCH infrastructure developed by INRAE in France, #endif$_FRENCH #if$_EOSC and cloud services developed by the EOSC (European Open Science Cloud)#endif$_EOSC.

3     Annexes

3.1     Abbreviations

#if$_DATAPLANT

ARC Annotated Research Context

#endif$_DATAPLANT

CC Creative Commons

CC REL Creative Commons Rights Expression Language

DDBJ DNA Data Bank of Japan

DMP Data Management Plan

DoA Description of Action

DOI Digital Object Identifier

EBI European Bioinformatics Institute

ENA European Nucleotide Archive

EU European Union

FAIR Findable, Accessible, Interoperable, Reusable

GDPR General Data Protection Regulation (of the EU)

IP Intellectual Property

ISO International Organization for Standardization

MIAMET Minimum Information About a Metabolite Experiment

MIAPPE Minimum Information About a Plant Phenotyping Experiment

MinSEQe Minimum Information about a high-throughput Sequencing Experiment

NCBI National Center for Biotechnology Information

NFDI National Research Data Infrastructure (of Germany)

NGS Next Generation Sequencing

RDM Research Data Management

RNASeq RNA Sequencing

SOP Standard Operating Procedures

SRA Sequence Read Archive

#if$_DATAPLANT

SWATE Swate Workflow Annotation Tool for Excel

#endif$_DATAPLANT

ONP Oxford Nanopore

qRTPCR quantitative real time polymerase chain reaction

WP Work Package




Data Management Plan of the Horizon Europe Project $_PROJECT

Action Number:

$_FUNDINGPROGRAMME

Action Acronym:

$_PROJECT

Action Title:

$_PROJECT

Creation Date:

$_CREATIONDATE

Modification Date:

$_MODIFICATIONDATE


Introduction

#if$_EU The $_PROJECT is part of the Open Data Initiative (ODI) of the EU. #endif$_EU To best profit from open data, it is necessary not only to store the data but to make it Findable, Accessible, Interoperable, and Reusable (FAIR).#if$_PROTECT We support open and FAIR data; however, we also consider the need to protect individual data sets. #endif$_PROTECT

The aim of this document is to provide guidelines on the principles guiding data management in the $_PROJECT and to specify which data will be stored. This is achieved by using the responses to the EU questionnaire on the Data Management Plan (DMP) as the DMP document.

The detailed DMP states how data will be handled during and after the project. The $_PROJECT DMP is prepared according to the Horizon Europe online manual. #if$_UPDATE It will be updated, and its validity checked, several times during the $_PROJECT project. At the very least, this will happen at month $_UPDATEMONTH. #endif$_UPDATE

1.    Data Summary

Will you re-use any existing data and what will you re-use it for? State the reasons if re-use of any existing data has been considered but discarded.

The project builds on existing data sets and relies on them. #if$_RNASEQ For instance, without a proper genomic reference it is very difficult to analyze NGS data sets.#endif$_RNASEQ It is also important to include existing data sets on the expression and metabolic behaviour of $_STUDYOBJECT, as well as existing characterization and background knowledge#if$_PARTNERS of the partners#endif$_PARTNERS. Genomic references can simply be gathered from reference databases for genomes and sequences, such as the National Center for Biotechnology Information (NCBI, US), the European Bioinformatics Institute (EBI, EU), and the DNA Data Bank of Japan (DDBJ, JP). Furthermore, prior 'unstructured' data in the form of publications and data contained therein will be used for decision making.

What types and formats of data will the project generate or re-use?

The $_PROJECT will collect and/or generate the following types of raw data: $_PHENOTYPIC, $_GENETIC, $_IMAGE, $_RNASEQ, $_GENOMIC, $_METABOLOMIC, $_PROTEOMIC, $_TARGETED, $_MODELS, $_CODE, $_EXCEL, $_CLONED-DNA data, which are related to $_STUDYOBJECT. In addition, the raw data will also be processed and modified using analytical pipelines, which may yield different results or include ad hoc data analysis parts. #if$_DATAPLANT These pipelines will be tracked in the DataPLANT ARC.#endif$_DATAPLANT Therefore, care will be taken to document and archive these resources (including the analytical pipelines) as well#if$_DATAPLANT, relying on the expertise of the DataPLANT consortium#endif$_DATAPLANT.

What is the purpose of the data generation or re-use and its relation to the objectives of the project?

The $_PROJECT has the following aim: $_PROJECTAIM. Therefore, data collection#if!$_VVISUALIZATION and integration #endif!$_VVISUALIZATION#if$_VVISUALIZATION, integration and visualization #endif$_VVISUALIZATION #if$_DATAPLANT using the DataPLANT ARC structure are absolutely necessary #endif$_DATAPLANT #if!$_DATAPLANT through a standardized data management process is absolutely necessary #endif!$_DATAPLANT because the data are used not only to understand principles, but also to inform stakeholders about the provenance of the data and of the analyses performed on them. It is therefore necessary to ensure that the data are well generated and well annotated with metadata using open standards, as laid out in the next section.

What is the expected size of the data that you intend to generate or re-use?

We expect to generate about $_RAWDATA GB of raw data. The size of the derived data will be about $_DERIVEDDATA GB.

What is the origin/provenance of the data, either generated or re-used?

Public data will be extracted as described in the previous paragraph. For the $_PROJECT, specific data sets will be generated by the consortium partners.

Data of different types or representing different domains will be generated using unique approaches. For example:

#if$_PREVIOUSPROJECTS

Data from previous projects such as $_PREVIOUSPROJECTS will be considered.

#endif$_PREVIOUSPROJECTS

To whom might it be useful ('data utility'), outside your project?

The data will initially benefit the $_PROJECT partners, but will also be made available to selected stakeholders closely involved in the project, and then the scientific community working on $_STUDYOBJECT. $_DATAUTILITY In addition, the general public interested in $_STUDYOBJECT can also use the data after publication. The data will be disseminated according to the $_PROJECT's dissemination and communication plan#if$_DATAPLANT, which aligns with the DataPLANT platform and other means#endif$_DATAPLANT.

$_DATAUTILITY

2    FAIR data

2.1. Making data findable, including provisions for metadata

Will data be identified by a persistent identifier?

All data sets will receive unique identifiers, and they will be annotated with metadata.

Will rich metadata be provided to allow discovery? What metadata will be created? What disciplinary or general standards will be followed? In case metadata standards do not exist in your discipline, please outline what type of metadata will be created and how.

All datasets will be associated with unique identifiers and will be annotated with metadata. We will use the Investigation, Study, Assay (ISA) specification for metadata creation. The $_PROJECT will rely on community standards plus additional recommendations applicable in plant science, such as the #if$_PHENOTYPIC #if$_MIAPPE MIAPPE (Minimum Information About a Plant Phenotyping Experiment),#endif$_MIAPPE #endif$_PHENOTYPIC #if$_GENOMIC|$_GENETIC #if$_MIXS MIxS (Minimum Information about any (X) Sequence),#endif$_MIXS #if$_MIGSEU MigsEu (Minimum Information about a Genome Sequence: Eukaryote),#endif$_MIGSEU #if$_MIGSORG MigsOrg (Minimum Information about a Genome Sequence: Organelle),#endif$_MIGSORG #if$_MIMS MIMS (Minimum Information about a Metagenome Sequence (environmental)),#endif$_MIMS #if$_MIMARKSSPECIMEN MIMARKSSpecimen (Minimum Information about a Marker Gene Sequence: Specimen),#endif$_MIMARKSSPECIMEN #if$_MIMARKSSURVEY MIMARKSSurvey (Minimum Information about a Marker Gene Sequence: Survey),#endif$_MIMARKSSURVEY #if$_MISAG MISAG (Minimum Information about a Single Amplified Genome),#endif$_MISAG #if$_MIMAG MIMAG (Minimum Information about a Metagenome-Assembled Genome),#endif$_MIMAG #endif$_GENOMIC|$_GENETIC #if$_TRANSCRIPTOMIC #if$_MINSEQE MINSEQE (Minimum Information about a high-throughput SEQuencing Experiment),#endif$_MINSEQE #endif$_TRANSCRIPTOMIC #if$_TRANSCRIPTOMIC #if$_MIAME MIAME (Minimum Information About a Microarray Experiment),#endif$_MIAME #endif$_TRANSCRIPTOMIC #if$_IMAGE #if$_REMBI REMBI (Recommended Metadata for Biological Images),#endif$_REMBI #endif$_IMAGE #if$_PROTEOMIC #if$_MIAPE MIAPE (Minimum Information About a Proteomics Experiment),#endif$_MIAPE #if$_MIMIX MIMIx (Minimum Information about a Molecular Interaction eXperiment),#endif$_MIMIX #endif$_PROTEOMIC These specific standards, unlike cross-domain minimal sets such as Dublin Core (which mostly define the submitter and the general type of data), allow reuse by other researchers by defining properties of the plant material (see the preceding section). However, minimal cross-domain annotations #if$_DUBLINCORE Dublin Core,#endif$_DUBLINCORE #if$_MARC21 MARC 21,#endif$_MARC21 also remain part of the $_PROJECT. #if$_DATAPLANT The core integration with DataPLANT will also allow individual releases to be tagged with a Digital Object Identifier (DOI). #endif$_DATAPLANT #if$_OTHERSTANDARDS Other standards such as $_OTHERSTANDARDINPUT are also adhered to. #endif$_OTHERSTANDARDS

Will search keywords be provided in the metadata to optimize the possibility for discovery and then potential re-use?

Keywords about the experiment and the general consortium will be included, as well as an abstract about the data, where useful. In addition, certain keywords can be auto-generated from dense metadata and its underlying ontologies. #if$_DATAPLANT Here, DataPLANT strives to complement existing ontologies with standardized DataPLANT ontology terms where an ontology does not yet include the required variables. #endif$_DATAPLANT

Will metadata be offered in such a way that it can be harvested and indexed?

To maintain data integrity and to be able to re-analyze data, data sets will be given version numbers where this is useful (e.g. raw data is considered immutable, must not be changed, and therefore does not receive a version number). #if$_DATAPLANT This is automatically supported by the Git-based ARC infrastructure of DataPLANT. #endif$_DATAPLANT Data variables will be allocated standard names. For example, genes, proteins and metabolites will be named according to approved nomenclature and conventions. These will also be linked to functional ontologies where possible. Datasets will also be named in a meaningful way to ensure readability by humans. Plant names will include traditional names, binomials, and all strain/cultivar/subspecies/variety identifiers.

2.2.    Making data accessible

Repository

Will the data be deposited in a trusted repository?

Data will be made available via the $_PROJECT platform using a user-friendly front end that allows data visualization. In addition, it will be ensured that data that can be stored in international, discipline-specific repositories are submitted to these repositories, which use specialized technologies:

#if$_GENETIC For genetic data: #if$_GENBANK NCBI-GenBank,#endif$_GENBANK #if$_SRA NCBI-SRA,#endif$_SRA #if$_ENA EBI-ENA,#endif$_ENA #if$_ARRAYEXPRESS EBI-ArrayExpress,#endif$_ARRAYEXPRESS #if$_GEO NCBI-GEO,#endif$_GEO #endif$_GENETIC

#if$_TRANSCRIPTOMIC For Transcriptomic data: #if$_SRA NCBI-SRA,#endif$_SRA #if$_GEO NCBI-GEO,#endif$_GEO #if$_ARRAYEXPRESS EBI-ArrayExpress,#endif$_ARRAYEXPRESS #endif$_TRANSCRIPTOMIC

#if$_IMAGE For image data: #if$_BIOIMAGE EBI-BioImage Archive,#endif$_BIOIMAGE #if$_IDR IDR (Image Data Resource),#endif$_IDR #endif$_IMAGE

#if$_METABOLOMIC For metabolomic data: #if$_METABOLIGHTS EBI-MetaboLights,#endif$_METABOLIGHTS #if$_METAWORKBENCH Metabolomics Workbench,#endif$_METAWORKBENCH #if$_INTACT IntAct (molecular interactions),#endif$_INTACT #endif$_METABOLOMIC

#if$_PROTEOMIC For proteomics data: #if$_PRIDE EBI-PRIDE,#endif$_PRIDE #if$_PDB PDB (Protein Data Bank archive),#endif$_PDB #if$_CHEBI ChEBI (Chemical Entities of Biological Interest),#endif$_CHEBI #endif$_PROTEOMIC

#if$_PHENOTYPIC For phenotypic data: #if$_edal e!DAL-PGP (Plant Genomics & Phenomics Research Data Repository) #endif$_edal #endif$_PHENOTYPIC

#if$_OTHEREP and $_OTHEREP will also be used to store data and the data will be processed there as well.#endif$_OTHEREP

Have you explored appropriate arrangements with the identified repository where your data will be deposited?

Submission is free of charge, and it is the goal (at least of ENA) to obtain as much data as possible. Therefore, special arrangements are neither necessary nor useful, and catch-all repositories are not required. #if$_DATAPLANT For DataPLANT, this has been agreed upon, as all the omics repositories of the International Nucleotide Sequence Database Collaboration (INSDC) will be used. #endif$_DATAPLANT #issuewarning If no data management platform such as DataPLANT is used, you need to find an appropriate repository to store or archive your data after publication. #endissuewarning

Does the repository ensure that the data is assigned an identifier? Will the repository resolve the identifier to a digital object?

Data will be stored in the following repositories:

#if$_GENETIC For genetic data: #if$_GENBANK NCBI-GenBank,#endif$_GENBANK #if$_SRA NCBI-SRA,#endif$_SRA #if$_ENA EBI-ENA,#endif$_ENA #if$_ARRAYEXPRESS EBI-ArrayExpress,#endif$_ARRAYEXPRESS #if$_GEO NCBI-GEO,#endif$_GEO #endif$_GENETIC

#if$_TRANSCRIPTOMIC For Transcriptomic data: #if$_SRA NCBI-SRA,#endif$_SRA #if$_GEO NCBI-GEO,#endif$_GEO #if$_ARRAYEXPRESS EBI-ArrayExpress,#endif$_ARRAYEXPRESS #endif$_TRANSCRIPTOMIC

#if$_IMAGE For image data: #if$_BIOIMAGE EBI-BioImage Archive,#endif$_BIOIMAGE #if$_IDR IDR (Image Data Resource),#endif$_IDR #endif$_IMAGE

#if$_METABOLOMIC For metabolomic data: #if$_METABOLIGHTS EBI-MetaboLights,#endif$_METABOLIGHTS #if$_METAWORKBENCH Metabolomics Workbench,#endif$_METAWORKBENCH #if$_INTACT IntAct (molecular interactions),#endif$_INTACT #endif$_METABOLOMIC

#if$_PROTEOMIC For proteomics data: #if$_PRIDE EBI-PRIDE,#endif$_PRIDE #if$_PDB PDB (Protein Data Bank archive),#endif$_PDB #if$_CHEBI ChEBI (Chemical Entities of Biological Interest),#endif$_CHEBI #endif$_PROTEOMIC

#if$_PHENOTYPIC For phenotypic data: #if$_edal e!DAL-PGP (Plant Genomics & Phenomics Research Data Repository) #endif$_edal #endif$_PHENOTYPIC

#if$_OTHEREP and $_OTHEREP will also be used to store data and the data will be processed there as well.#endif$_OTHEREP Unstructured, less standardized data (e.g. experimental phenotypic measurements) will be annotated with metadata and, if complete, given a digital object identifier (DOI). #if$_DATAPLANT Whole data sets wrapped into an ARC will receive DOIs as well. #endif$_DATAPLANT

Those repositories are the most appropriate ones.

Data:

Will all data be made openly available? If certain datasets cannot be shared (or need to be shared under restricted access conditions), explain why, clearly separating legal and contractual reasons from intentional restrictions. Note that in multi-beneficiary projects it is also possible for specific beneficiaries to keep their data closed if opening their data goes against their legitimate interests or other constraints as per the Grant Agreement.

By default, all data sets from the $_PROJECT will be shared with the community and made openly available. However, before the data are released, all partners will be given the opportunity to check for potential intellectual property (IP) issues (according to the consortium agreement and background IP rights). #if$_INDUSTRY This applies in particular to data pertaining to the industry partners. #endif$_INDUSTRY IP protection will be prioritized for datasets that offer the potential for exploitation.

Note that in multi-beneficiary projects it is also possible for specific beneficiaries to keep their data closed if relevant provisions are made in the consortium agreement and are in line with the reasons for opting out.

If an embargo is applied to give time to publish or seek protection of the intellectual property (e.g. patents), specify why and how long this will apply, bearing in mind that research data should be made available as soon as possible.

#if$_early Some raw data will be made public as soon as it is collected and processed.#endif$_early #if$_beforepublication Relevant processed datasets will be made public when the research findings are published.#endif$_beforepublication #if$_endofproject At the end of the project, all data without an embargo period will be published.#endif$_endofproject #if$_embargo Data that is subject to an embargo period will not be publicly accessible until the end of the embargo period.#endif$_embargo #if$_request Data will be made available upon request, allowing controlled sharing while ensuring responsible use.#endif$_request #if$_ipissue IP issues will be checked before publication. #endif$_ipissue All consortium partners will be encouraged to make data available before publication, openly and/or under pre-publication agreements#if$_GENOMIC, such as those started in Fort Lauderdale and set forth by the Toronto International Data Release Workshop#endif$_GENOMIC. This will be implemented as soon as IP-related checks are complete.

Will the data be accessible through a free and standardized access protocol?

#if$_DATAPLANT DataPLANT stores data in the ARC, which is a Git repository. The DataHUB shares data and metadata as a GitLab instance. The Git and web protocols are open source and freely accessible. In addition, #endif$_DATAPLANT Zenodo and the endpoint repositories will also be used for access. In general, web-based access protocols are free and standardized.
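As an illustration of this free, standardized access path, the sketch below retrieves a public dataset either by cloning a Git repository or by a plain HTTPS download, using only the Python standard library. Both URLs are hypothetical placeholders, not actual $_PROJECT or DataHUB locations.

import subprocess
import urllib.request

# Clone a (hypothetical) public ARC repository over the Git protocol ...
subprocess.run(
    ["git", "clone", "https://example.org/myproject/example-arc.git"],
    check=True,
)

# ... or fetch a single (hypothetical) data file over plain HTTPS.
urllib.request.urlretrieve(
    "https://example.org/data/sample_01.csv",
    "sample_01.csv",
)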

If there are restrictions on use, how will access be provided to the data, both during and after the end of the project?

There are no restrictions, beyond the aforementioned IP checks, which are in line with e.g. European open data policies.

How will the identity of the person accessing the data be ascertained?

Where data are shared only within the consortium, because the datasets are not yet finished or are undergoing IP checks, the data will be hosted internally and a username and password will be required (see also our GDPR rules). When the data are made public in EU or US repositories, completely anonymous access is normally allowed. This is the case for ENA as well, and both approaches are in line with GDPR requirements. #if$_DATAPLANT Currently, data management relies on the Annotated Research Context (ARC). It is password protected, so user authentication is required before any data or samples can be obtained. #endif$_DATAPLANT

Is there a need for a data access committee (e.g. to evaluate/approve access requests to personal/sensitive data)?

Consequently, there is no need for a data access committee.

Metadata:

Will metadata be made openly available and licenced under a public domain dedication CC0, as per the Grant Agreement? If not, please clarify why. Will metadata contain information to enable the user to access the data?

Yes, where possible; e.g., the Creative Commons Rights Expression Language (CC REL) will be used for data not submitted to specialized repositories such as ENA.

How long will the data remain available and findable? Will metadata be guaranteed to remain available after data is no longer available?

The data will be made available for many years#if$_DATAPLANT and ideally indefinitely after the end of the project#endif$_DATAPLANT. In any case, data submitted to repositories (as detailed above), e.g. ENA or PRIDE, will be subject to the data storage regulations of those repositories.

Will documentation or reference about any software be needed to access or read the data be included? Will it be possible to include the relevant software (e.g. in open source code)?

#if$_PROPRIETARY The $_PROJECT relies on the tool(s) $_PROPRIETARY. #endif$_PROPRIETARY #if!$_PROPRIETARY No specialized software will be needed to access the data, usually just a modern browser. Access will be possible through web interfaces. For data processing after obtaining raw data, typical open-source software can be used. #endif!$_PROPRIETARY #if$_DATAPLANT DataPLANT offers tools such as the open-source SWATE plugin for Excel, the ARC Commander, and the DMP tool, which are not required to access the data but make interaction with it more convenient. #endif$_DATAPLANT #if$_DATAPLANT DataPLANT resources are well described, and their setup is documented on their GitHub project pages. #endif$_DATAPLANT As stated above, we use publicly available open-source and well-documented certified software #if$_PROPRIETARY except for $_PROPRIETARY#endif$_PROPRIETARY.

2.3. Making data interoperable

What data and metadata vocabularies, standards, formats or methodologies will you follow to make your data interoperable to allow data exchange and re-use within and across disciplines? Will you follow community-endorsed interoperability best practices? Which ones?

As noted above, we foresee using minimal standards such as the #if$_PHENOTYPIC #if$_MIAPPE MIAPPE (Minimum Information About a Plant Phenotyping Experiment),#endif$_MIAPPE #endif$_PHENOTYPIC #if$_GENOMIC|$_GENETIC #if$_MIXS MIxS (Minimum Information about any (X) Sequence),#endif$_MIXS #if$_MIGSEU MigsEu (Minimum Information about a Genome Sequence: Eukaryote),#endif$_MIGSEU #if$_MIGSORG MigsOrg (Minimum Information about a Genome Sequence: Organelle),#endif$_MIGSORG #if$_MIMS MIMS (Minimum Information about a Metagenome Sequence (environmental)),#endif$_MIMS #if$_MIMARKSSPECIMEN MIMARKSSpecimen (Minimum Information about a Marker Gene Sequence: Specimen),#endif$_MIMARKSSPECIMEN #if$_MIMARKSSURVEY MIMARKSSurvey (Minimum Information about a Marker Gene Sequence: Survey),#endif$_MIMARKSSURVEY #if$_MISAG MISAG (Minimum Information about a Single Amplified Genome),#endif$_MISAG #if$_MIMAG MIMAG (Minimum Information about a Metagenome-Assembled Genome),#endif$_MIMAG #endif$_GENOMIC|$_GENETIC #if$_TRANSCRIPTOMIC #if$_MINSEQE MINSEQE (Minimum Information about a high-throughput SEQuencing Experiment),#endif$_MINSEQE #endif$_TRANSCRIPTOMIC #if$_TRANSCRIPTOMIC #if$_MIAME MIAME (Minimum Information About a Microarray Experiment),#endif$_MIAME #endif$_TRANSCRIPTOMIC #if$_IMAGE #if$_REMBI REMBI (Recommended Metadata for Biological Images),#endif$_REMBI #endif$_IMAGE #if$_PROTEOMIC #if$_MIAPE MIAPE (Minimum Information About a Proteomics Experiment),#endif$_MIAPE #if$_MIMIX MIMIx (Minimum Information about a Molecular Interaction eXperiment),#endif$_MIMIX #endif$_PROTEOMIC These specific standards, unlike cross-domain minimal sets such as Dublin Core (which mostly define the submitter and the general type of data), allow reuse by other researchers by defining properties of the plant material (see the preceding section). However, minimal cross-domain annotations #if$_DUBLINCORE Dublin Core,#endif$_DUBLINCORE #if$_MARC21 MARC 21,#endif$_MARC21 also remain part of the $_PROJECT. #if$_DATAPLANT The core integration with DataPLANT will also allow individual releases to be tagged with a Digital Object Identifier (DOI). #endif$_DATAPLANT #if$_OTHERSTANDARDS Other standards such as $_OTHERSTANDARDINPUT are also adhered to. #endif$_OTHERSTANDARDS

Whenever possible, data will be stored in common and openly defined formats including all the necessary metadata to interpret and analyze data in a biological context. By default, no proprietary formats will be used. However, Microsoft Excel files (according to ISO/IEC 29500-1:2016) might be used as intermediates by the consortium#if$_DATAPLANT and by some ARC components#endif$_DATAPLANT. In addition, text documents might be edited in word-processor formats, but will be shared as PDF. Open ontologies will be used where they are mature. As stated above, some ontologies and controlled vocabularies might need to be extended. #if$_DATAPLANT Here, the $_PROJECT will build on the advanced ontologies developed in DataPLANT. #endif$_DATAPLANT

In case it is unavoidable that you use uncommon or generate project specific ontologies or vocabularies, will you provide mappings to more commonly used ontologies? Will you openly publish the generated ontologies or vocabularies to allow reusing, refining or extending them?

Common and open ontologies will be used. In fact, open biomedical ontologies will be used where they are mature. As stated in the previous question, some ontologies and controlled vocabularies might have to be extended. #if$_DATAPLANT Here, the $_PROJECT will build on the DataPLANT biology ontology (DPBO) developed in DataPLANT. #endif$_DATAPLANT Ontology databases such as the OBO Foundry will be used to publish ontologies. #if$_DATAPLANT The DPBO is also published on GitHub: https://github.com/nfdi4plants/nfdi4plants_ontology #endif$_DATAPLANT

Will your data include qualified references to other data (e.g. other data from your project, or datasets from previous research)?

References to other data will be made in the form of DOIs and ontology terms.
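A qualified reference of this kind can be expressed as a small machine-readable record. The sketch below pairs a DOI with ontology terms; the DOI, relation type, and the second ontology identifier are placeholders used for illustration only, not values prescribed by the $_PROJECT.

# A minimal, illustrative "qualified reference" linking a dataset to related
# data via a DOI and to concepts via ontology term identifiers.
qualified_reference = {
    "relates_to_doi": "10.1234/example.doi",         # hypothetical DOI
    "relation_type": "IsDerivedFrom",                 # illustrative relation label
    "ontology_terms": [
        {"id": "GO:0008150", "label": "biological_process"},
        {"id": "ENVO:XXXXXXX", "label": "placeholder environment term"},
    ],
}

print(qualified_reference)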

2.4. Increase data re-use

How will you provide documentation needed to validate data analysis and facilitate data re-use (e.g. readme files with information on methodology, codebooks, data cleaning, analyses, variable definitions, units of measurement, etc.)?

The documentation will be provided in the form of ISA (Investigation, Study, Assay) and CWL (Common Workflow Language) files. #if$_DATAPLANT Here, the $_PROJECT will build on the ARC container, which includes all the data, metadata, and documentation. #endif$_DATAPLANT

Will your data be made freely available in the public domain to permit the widest re-use possible? Will your data be licensed using standard reuse licenses, in line with the obligations set out in the Grant Agreement?

Yes, our data will be made freely available in the public domain to permit the widest re-use possible. Open licenses, such as Creative Commons (CC), will be used whenever possible.

Will the data produced in the project be useable by third parties, in particular after the end of the project?

There will be no restrictions once the data is made public.

Will the provenance of the data be thoroughly documented using the appropriate standards? Describe all relevant data quality assurance processes.

The $_PROJECT has the following aim: $_PROJECTAIM. Therefore, data collection#if!$_VVISUALIZATION and integration #endif!$_VVISUALIZATION#if$_VVISUALIZATION, integration and visualization #endif$_VVISUALIZATION #if$_DATAPLANT using the DataPLANT ARC structure are absolutely necessary #endif$_DATAPLANT #if!$_DATAPLANT through a standardized data management process is absolutely necessary #endif!$_DATAPLANT because the data are used not only to understand principles, but also to inform stakeholders about the provenance of the data and of the analyses performed on them. It is therefore necessary to ensure that the data are well generated and well annotated with metadata using open standards, as laid out in the preceding sections.

Describe all relevant data quality assurance processes. Further to the FAIR principles, DMPs should also address research outputs other than data, and should carefully consider aspects related to the allocation of resources, data security and ethical aspects.

The data will be checked and curated using data collection protocols, personnel training, data cleaning, data analysis, and quality control. #if$_DATAPLANT Furthermore, data will be analyzed for quality control (QC) problems using automatic procedures as well as by manual curation. #endif$_DATAPLANT All data quality assurance processes, including the data collection protocol, data cleaning procedures, data analysis techniques, and quality control measures, will be documented. This documentation will be kept for future reference and made available to stakeholders upon request.

3    Other research outputs

In addition to the management of data, beneficiaries should also consider and plan for the management of other research outputs that may be generated or re-used throughout their projects. Such outputs can be either digital (e.g. software, workflows, protocols, models, etc.) or physical (e.g. new materials, antibodies, reagents, samples, etc.).

In the current data management plan, any digital output, including but not limited to software, workflows, protocols, models, documents, templates, and notebooks, is treated as data. Therefore, all aforementioned digital objects are already described in detail. For non-digital objects, the data management plan will be closely connected to the digitalisation of the physical objects. #if$_DATAPLANT $_PROJECT will build a workflow which connects the ARC with an electronic lab notebook in order to also manage the physical objects. #endif$_DATAPLANT

Beneficiaries should consider which of the questions pertaining to FAIR data above, can apply to the management of other research outputs, and should strive to provide sufficient detail on how their research outputs will be managed and shared, or made available for re-use, in line with the FAIR principles.

Open licenses, such as Creative Commons (CC), will be used whenever possible for these other digital objects as well.

4.    Allocation of resources

What will the costs be for making data or other research outputs FAIR in your project (e.g. direct and indirect costs related to storage, archiving, re-use, security, etc.)?

The $_PROJECT will bear the costs of data curation, #if$_DATAPLANT ARC consistency checks, #endif$_DATAPLANT and data maintenance/security before transfer to public repositories. Subsequent costs are then borne by the operators of these repositories.

Additionally, costs for post-publication storage are incurred by the end-point repositories (e.g. ENA); they are not charged against the $_PROJECT or its members but covered by the operating budgets of these repositories.

How will these be covered? Note that costs related to research data/output management are eligible as part of the Horizon Europe grant (if compliant with the Grant Agreement conditions)

The costs borne by the $_PROJECT are covered by the project funding. Pre-existing structures #if$_DATAPLANT such as the structures, tools, and knowledge laid down in the DataPLANT consortium#endif$_DATAPLANT will also be used.

Who will be responsible for data management in your project?

The responsible person will be $_DATAOFFICER of the $_PROJECT.

How will long term preservation be ensured? Discuss the necessary resources to accomplish this (costs and potential value, who decides and how, what data will be kept and for how long)?

The data officer #if$_PARTNERS or $_PARTNERS #endif$_PARTNERS will ultimately decide on the strategy to preserve data that are not submitted to end-point subject-area repositories #if$_DATAPLANT or ARCs in DataPLANT #endif$_DATAPLANT when the project ends. This will be in line with EU guidelines, institute policies, and data sharing based on EU and international standards.

5.    Data security

What provisions are or will be in place for data security (including data recovery as well as secure storage/archiving and transfer of sensitive data)?

Online platforms will be protected by vulnerability scanning, two-factor authentication and daily automatic backups allowing immediate recovery. All partners holding confidential project data will use secure platforms with automatic backups and offsite secure copies. #if$_DATAPLANT Once DataHUB repositories and ARCs have been generated in DataPLANT, data security will be enforced. This comprises secure storage; passwords and usernames are generally transferred via separate secure media.#endif$_DATAPLANT

Will the data be safely stored in trusted repositories for long term preservation and curation?

Data will be made available via the $_PROJECT platform using a user-friendly front end that allows data visualization. Besides this, it will be ensured that data which can be stored in international, discipline-related repositories using specialized technologies are deposited there:

#if$_GENETIC For genetic data: #if$_GENBANK NCBI-GenBank,#endif$_GENBANK #if$_SRA NCBI-SRA,#endif$_SRA #if$_ENA EBI-ENA,#endif$_ENA #if$_ARRAYEXPRESS EBI-ArrayExpress,#endif$_ARRAYEXPRESS #if$_GEO NCBI-GEO,#endif$_GEO #endif$_GENETIC

#if$_TRANSCRIPTOMIC For Transcriptomic data: #if$_SRA NCBI-SRA,#endif$_SRA #if$_GEO NCBI-GEO,#endif$_GEO #if$_ARRAYEXPRESS EBI-ArrayExpress,#endif$_ARRAYEXPRESS #endif$_TRANSCRIPTOMIC

#if$_IMAGE For image data: #if$_BIOIMAGE EBI-BioImage Archive,#endif$_BIOIMAGE #if$_IDR IDR (Image Data Resource),#endif$_IDR #endif$_IMAGE

#if$_METABOLOMIC For metabolomic data: #if$_METABOLIGHTS EBI-MetaboLights,#endif$_METABOLIGHTS #if$_METAWORKBENCH Metabolomics Workbench,#endif$_METAWORKBENCH #if$_INTACT Intact (Molecular interactions),#endif$_INTACT #endif$_METABOLOMIC

#if$_PROTEOMIC For proteomics data: #if$_PRIDE EBI-PRIDE,#endif$_PRIDE #if$_PDB PDB (Protein Data Bank archive),#endif$_PDB #if$_CHEBI Chebi (Chemical Entities of Biological Interest),#endif$_CHEBI #endif$_PROTEOMIC

#if$_PHENOTYPIC For phenotypic data: #if$_edal e!DAL-PGP (Plant Genomics & Phenomics Research Data Repository) #endif$_edal #endif$_PHENOTYPIC

#if$_OTHEREP and $_OTHEREP will also be used to store data and the data will be processed there as well.#endif$_OTHEREP

6.    Ethics

Are there, or could there be, any ethics or legal issues that can have an impact on data sharing? These can also be discussed in the context of the ethics review. If relevant, include references to ethics deliverables and ethics chapter in the Description of the Action (DoA).

At the moment, we do not anticipate ethical or legal issues with data sharing. In terms of ethics, since this is plant data, there is no need for an ethics committee; however, due diligence for plant resource benefit sharing is considered. #issuewarning You have to check and enter any due diligence here; at the moment it is unclear whether the Nagoya Protocol (see Nagoya Protocol) will also cover sequence information. In any case, if you use material that does not originate from your (partner) country and characterize it physically or biochemically (e.g., metabolites, proteome, RNASeq), this might represent a Nagoya-relevant action unless the material comes from, e.g., the US (not a party) or Ireland (not signed; still contact them), but other laws might apply. #endissuewarning

Will informed consent for data sharing and long term preservation be included in questionnaires dealing with personal data?

The only personal data that will potentially be stored are the submitter name and affiliation in the metadata. In addition, personal data will be collected for dissemination and communication activities using specific methods and procedures developed by the $_PROJECT partners to adhere to data protection. #issuewarning You need to inform the persons concerned and preferably obtain WRITTEN consent before storing emails, names, or pseudonyms such as Twitter handles; we are very sorry about these issues, we did not invent them. #endissuewarning

7.    Other issues

Do you, or will you, make use of other national/funder/sectorial/departmental procedures for data management? If yes, which ones (please list and briefly describe them)?

Yes, the $_PROJECT will use common Research Data Management (RDM) tools such as #if$_DATAPLANT|$_NFDI resources developed by the NFDI of Germany,#endif$_DATAPLANT|$_NFDI #if$_FRENCH infrastructure developed by INRAe from France, #endif$_FRENCH #if$_EOSC and cloud service developed by EOSC (European Open Science Cloud)#endif$_EOSC .

8     Annexes

8.1     Abbreviations

#if$_DATAPLANT

ARC Annotated Research Context

#endif$_DATAPLANT

CC Creative Commons

CC CEL Creative Commons Rights Expression Language

DDBJ DNA Data Bank of Japan

DMP Data Management Plan

DoA Description of Action

DOI Digital Object Identifier

EBI European Bioinformatics Institute

ENA European Nucleotide Archive

EU European Union

FAIR Findable Accessible Interoperable Reusable

GDPR General data protection regulation (of the EU)

IP Intellectual Property

ISO International Organization for Standardization

MIAMET Minimal Information about Metabolite experiment

MIAPPE Minimal Information about Plant Phenotyping Experiment

MinSEQe Minimum Information about a high-throughput Sequencing Experiment

NCBI National Center for Biotechnology Information

NFDI National Research Data Infrastructure (of Germany)

NGS Next Generation Sequencing

RDM Research Data Management

RNASeq RNA Sequencing

SOP Standard Operating Procedures

SRA Short Read Archive

#if$_DATAPLANT

SWATE Swate Workflow Annotation Tool for Excel

#endif$_DATAPLANT

ONP Oxford Nanopore

qRTPCR quantitative real time polymerase chain reaction

WP Work Package


Data Management Plan of the DFG Project $_PROJECT

1.    Data description

1.1    Introduction

#if$_EU

The $_PROJECT is part of the Open Data Initiative (ODI) of the EU. #endif$_EU To best profit from open data, it is necessary not only to store data but to make data Findable, Accessible, Interoperable and Reusable (FAIR). #if$_PROTECT We support open and FAIR data; however, we also consider the need to protect individual data sets. #endif$_PROTECT

The aim of this document is to provide guidelines on the principles of data management in the $_PROJECT and to specify which data will be stored; this is achieved by using the responses to the DFG Data Management Plan (DMP) checklist to generate a DMP document.

The detailed DMP states how data will be handled during and after the project. The $_PROJECT DMP is prepared according to the DFG data management checklist. #if$_UPDATE It will be updated/its validity checked during the $_PROJECT project several times. At the very least, this will happen at month $_UPDATEMONTH. #endif$_UPDATE

1.2    How does your project generate new data?

Data of different types or of different domains will be generated differently. For example:

The $_PROJECT has the following aim: $_PROJECTAIM. Therefore, data collection#if!$_VVISUALIZATION and integration #endif!$_VVISUALIZATION#if$_VVISUALIZATION, integration and visualization #endif$_VVISUALIZATION #if$_DATAPLANT using the DataPLANT ARC structure are absolutely necessary #endif$_DATAPLANT #if!$_DATAPLANT through a standardized data management process are absolutely necessary #endif!$_DATAPLANT because the data are used not only to understand principles but also to trace the provenance of the data and of its analysis. Stakeholders must likewise be informed about the provenance of the data. It is therefore necessary to ensure that the data are well generated and well annotated with metadata using open standards, as laid out in the next section.

Public data will be extracted as described in paragraph 1.3. For the $_PROJECT, specific data sets will be generated by the consortium partners.

1.3    Is existing data reused?

The project builds on existing data sets and relies on them. #if$_RNASEQ For instance, without a proper genomic reference it is very difficult to analyze NGS data sets.#endif$_RNASEQ It is also important to include existing data sets on the expression and metabolic behaviour of $_STUDYOBJECT, but of course also existing characterization and background knowledge#if$_PARTNERS of the partners#endif$_PARTNERS. Genomic references can simply be gathered from reference databases for genomes/sequences, like the National Center for Biotechnology Information: NCBI (US); European Bioinformatics Institute: EBI (EU); DNA Data Bank of Japan: DDBJ (JP). Furthermore, prior 'unstructured' data in the form of publications and data contained therein will be used for decision making.

1.4    Which data types (in terms of data formats like image data, text data or measurement data) arise in your project and in what way are they further processed?

We foresee that the following data about $_STUDYOBJECT will be collected and generated at the very least: $_PHENOTYPIC, $_GENETIC, $_GENOMIC, $_METABOLOMIC, $_RNASEQ, $_IMAGE, $_PROTEOMIC, $_TARGETED, $_MODELS, $_CODE, $_EXCEL, $_CLONED-DNA and result data. Furthermore, data derived from the original raw data sets will also be collected. This is important, as different analytical pipelines might yield different results or include ad-hoc data analysis parts#if$_DATAPLANT and these pipelines will be tracked in the DataPLANT ARC#endif$_DATAPLANT. Therefore, specific care will be taken to document and archive these resources (including the analytic pipelines) as well#if$_DATAPLANT, relying on the vast expertise in the DataPLANT consortium#endif$_DATAPLANT.

1.5    To what extent do these arise or what is the anticipated data volume?

We expect to generate raw data in the range of $_RAWDATA GB of data. The size of the derived data will be about $_DERIVEDDATA GB.

2.    Documentation and data quality

2.1.    What approaches are being taken to describe the data in a comprehensible manner (such as the use of available metadata, documentation standards or ontologies)?

All datasets will be associated with unique identifiers and will be annotated with metadata. We will use the Investigation, Study, Assay (ISA) specification for metadata creation. The $_PROJECT will rely on community standards plus additional recommendations applicable in the plant sciences, such as the #if$_PHENOTYPIC #if$_MIAPPE MIAPPE (Minimum Information About a Plant Phenotyping Experiment),#endif$_MIAPPE #endif$_PHENOTYPIC #if$_GENOMIC|$_GENETIC #if$_MIXS MIxS (Minimum Information about any (X) Sequence),#endif$_MIXS #if$_MIGSEU MigsEu (Minimum Information about a Genome Sequence: Eucaryote),#endif$_MIGSEU #if$_MIGSORG MigsOrg (Minimum Information about a Genome Sequence: Organelle),#endif$_MIGSORG #if$_MIMS MIMS (Minimum Information about Metagenome or Environmental),#endif$_MIMS #if$_MIMARKSSPECIMEN MIMARKSSpecimen (Minimal Information about a Marker Specimen: Specimen),#endif$_MIMARKSSPECIMEN #if$_MIMARKSSURVEY MIMARKSSurvey (Minimal Information about a Marker Specimen: Survey),#endif$_MIMARKSSURVEY #if$_MISAG MISAG (Minimum Information about a Single Amplified Genome),#endif$_MISAG #if$_MIMAG MIMAG (Minimum Information about Metagenome-Assembled Genome),#endif$_MIMAG #endif$_GENOMIC|$_GENETIC #if$_TRANSCRIPTOMIC #if$_MINSEQE MINSEQE (Minimum Information about a high-throughput SEQuencing Experiment),#endif$_MINSEQE #endif$_TRANSCRIPTOMIC #if$_TRANSCRIPTOMIC #if$_MIAME MIAME (Minimum Information About a Microarray Experiment),#endif$_MIAME #endif$_TRANSCRIPTOMIC #if$_IMAGE #if$_REMBI REMBI (Recommended Metadata for Biological Images),#endif$_REMBI #endif$_IMAGE #if$_PROTEOMIC #if$_MIAPE MIAPE (Minimum Information About a Proteomics Experiment),#endif$_MIAPE #if$_MIMIX MIMix (Minimum Information about a Molecular Interaction eXperiment),#endif$_MIMIX #endif$_PROTEOMIC These specific standards, unlike cross-domain minimal sets such as Dublin Core (which mostly define the submitter and the general type of data), allow reusability by other researchers by defining properties of the plant (see the preceding section). However, minimal cross-domain annotations such as #if$_DUBLINCORE Dublin Core,#endif$_DUBLINCORE #if$_MARC21 MARC 21,#endif$_MARC21 also remain part of the $_PROJECT. #if$_DATAPLANT The core integration with DataPLANT will also allow individual releases to be tagged with a Digital Object Identifier (DOI). #endif$_DATAPLANT #if$_OTHERSTANDARDS Other standards such as $_OTHERSTANDARDINPUT are also adhered to. #endif$_OTHERSTANDARDS

Open ontologies will be used where they are mature. As stated above, some ontologies and controlled vocabularies might need to be extended. #if$_DATAPLANT Here, the $_PROJECT will build on the advanced ontologies developed in DataPLANT. #endif$_DATAPLANT Keywords about the experiment and the general consortium will be included, as well as an abstract about the data, where useful. In addition, certain keywords can be auto-generated from dense metadata and its underlying ontologies. #if$_DATAPLANT Here, DataPLANT strives to complement these with standardized DataPLANT ontologies that are supplemented where the ontology does not yet include the variables. #endif$_DATAPLANT

In fact, open biomedical ontologies will be used where they are mature. As stated in the previous question, sometimes ontologies and controlled vocabularies might have to be extended. #if$_DATAPLANT Here, the $_PROJECT will build on the advanced ontologies developed in DataPLANT. #endif$_DATAPLANT

2.2    What measures are being adopted to ensure high data quality?

The $_PROJECT has the following aim: $_PROJECTAIM. Therefore, data collection#if!$_VVISUALIZATION and integration #endif!$_VVISUALIZATION#if$_VVISUALIZATION, integration and visualization #endif$_VVISUALIZATION #if$_DATAPLANT using the DataPLANT ARC structure are absolutely necessary #endif$_DATAPLANT #if!$_DATAPLANT through a standardized data management process are absolutely necessary #endif!$_DATAPLANT because the data are used not only to understand principles but also to trace the provenance of the data and of its analysis. Stakeholders must likewise be informed about the provenance of the data. It is therefore necessary to ensure that the data are well generated and well annotated with metadata using open standards. Data variables will be allocated standard names. For example, genes, proteins and metabolites will be named according to approved nomenclature and conventions. These will also be linked to functional ontologies where possible. Datasets will also be named in a meaningful way to ensure readability by humans. Plant names will include traditional names, binomials, and all strain/cultivar/subspecies/variety identifiers.
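
To make the naming convention tangible, a small helper of the following kind could be used; the field order (project, organism, assay, date, version) is an illustrative assumption for this plan, not a prescribed standard.

```python
# Minimal sketch of a human-readable, machine-parsable dataset naming convention.
# The chosen fields and their order are an illustrative assumption for this guide.
import re
from datetime import date

def dataset_name(project: str, organism: str, assay: str, version: int) -> str:
    """Build a name such as 'myproject_arabidopsis-thaliana_rnaseq_<date>_v2'."""
    def slug(text: str) -> str:
        # lower-case, replace whitespace with hyphens, drop remaining punctuation
        return re.sub(r"[^a-z0-9-]", "", text.lower().replace(" ", "-"))
    return f"{slug(project)}_{slug(organism)}_{slug(assay)}_{date.today().isoformat()}_v{version}"

print(dataset_name("MyProject", "Arabidopsis thaliana", "RNASeq", 2))
```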

To maintain data integrity and to be able to re-analyze data, data sets will get version numbers where this is useful (e.g. raw data must not be changed, will not get a version number, and is considered immutable). #if$_DATAPLANT This is automatically supported by the ARC Git DataPLANT infrastructure. #endif$_DATAPLANT
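
Outside the ARC/DataHUB infrastructure, derived-data releases could be versioned, for example, with Git tags; the sketch below assumes a local Git repository and uses hypothetical repository and tag names.

```python
# Minimal sketch: tag a derived-data release in a local Git repository so that the
# exact state of the dataset can be referenced later. Paths and tag names are
# hypothetical; DataPLANT users obtain equivalent versioning via the ARC DataHUB.
import subprocess

REPO = "derived-data"   # hypothetical local repository containing derived data
TAG = "derived-v1.0"    # version tag for this data release

subprocess.run(["git", "-C", REPO, "add", "-A"], check=True)
subprocess.run(["git", "-C", REPO, "commit", "-m", "Derived data release 1.0"], check=True)
subprocess.run(["git", "-C", REPO, "tag", "-a", TAG, "-m", "Derived data release 1.0"], check=True)
```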

As mentioned above, we foresee using e.g. #if$_RNASEQ|$_GENOMIC #if$_MINSEQE MinSEQe for sequencing data and #endif$_MINSEQE #endif$_RNASEQ|$_GENOMIC Metabolights-compatible forms for metabolites#if$_MIAPPE as well as MIAPPE for phenotyping-like data#endif$_MIAPPE. The latter will allow the integration of data across projects and safeguards the reuse of established and tested protocols. Additionally, we will use ontology terms to enrich the data sets, relying on free and open ontologies. Further ontology terms might be created and canonized during the $_PROJECT.

2.3    Are quality controls in place and if so, how do they operate?

The data will be checked and curated throughout the project period. #if$_DATAPLANT Furthermore, data will be analyzed for quality control (QC) problems using automatic procedures as well as by manual curation. #endif$_DATAPLANT PhD students and lab professionals will be responsible for first-hand quality control. Afterwards, the data will be checked and annotated by $_DATAOFFICER. #if$_RNASEQ|$_GENOMIC FastQC will be run on the base-called reads. #endif$_RNASEQ|$_GENOMIC Before publication, the data will be controlled again.
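
Where sequencing data are produced, the automated part of this QC could, for example, run FastQC over all read files as sketched below; the reads/ and qc_reports/ directories are hypothetical, and FastQC must be installed separately.

```python
# Minimal sketch: run FastQC on all FASTQ files in a (hypothetical) reads/ directory
# and collect the reports in qc_reports/ for later manual curation.
import glob
import pathlib
import subprocess

pathlib.Path("qc_reports").mkdir(exist_ok=True)
fastq_files = sorted(glob.glob("reads/*.fastq.gz"))

for fastq in fastq_files:
    # 'fastqc <file> -o <output dir>' writes one report per input file
    subprocess.run(["fastqc", fastq, "-o", "qc_reports"], check=True)
```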

2.4    Which digital methods and tools (e.g. software) are required to use the data?

The $_PROJECT will use common Research Data Management (RDM) tools such as #if$_DATAPLANT|$_NFDI resources developed by the NFDI of Germany,#endif$_DATAPLANT|$_NFDI #if$_FRENCH infrastructure developed by INRAe from France, #endif$_FRENCH #if$_EOSC and cloud service developed by EOSC (European Open Science Cloud)#endif$_EOSC .

#if$_PROPRIETARY The $_PROJECT relies on the tool(s) $_PROPRIETARY. #endif$_PROPRIETARY

#if!$_PROPRIETARY No specialized software will be needed to access the data, usually just a modern browser. Access will be possible through web interfaces. For data processing after obtaining raw data, typical open-source software can be used. As no proprietary software is needed, no documentation needs to be provided. #endif!$_PROPRIETARY

#if$_DATAPLANT However, DataPLANT resources are well described, and their setup is documented on their GitHub project pages. #endif$_DATAPLANT

#if$_DATAPLANT DataPLANT offers tools such as the open-source SWATE plugin for Excel, the ARC Commander, and the DMP generator, which make the interaction with data more convenient. #endif$_DATAPLANT

As stated above, here we use publicly available open-source and well-documented certified software #if$_PROPRIETARY except for $_PROPRIETARY #endif$_PROPRIETARY.

3.    Storage and technical archiving during the project

3.1    How is the data to be stored and archived throughout the project duration?

Data will be made available via the $_PROJECT platform using a user-friendly front end that allows data visualization. Besides this, it will be ensured that data which can be stored in international, discipline-related repositories using specialized technologies are deposited there: #if$_GENETIC #if$_GENBANK NCBI-GenBank,#endif$_GENBANK #if$_ENA EBI-ENA,#endif$_ENA #if$_ARRAYEXPRESS EBI-ArrayExpress,#endif$_ARRAYEXPRESS #endif$_GENETIC #if$_TRANSCRIPTOMIC|$_GENETIC #if$_SRA NCBI-SRA,#endif$_SRA #if$_GEO NCBI-GEO,#endif$_GEO #endif$_TRANSCRIPTOMIC|$_GENETIC #if$_TRANSCRIPTOMIC #if$_ARRAYEXPRESS EBI-ArrayExpress,#endif$_ARRAYEXPRESS #endif$_TRANSCRIPTOMIC #if$_IMAGE #if$_BIOIMAGE EBI-BioImage Archive,#endif$_BIOIMAGE #if$_IDR IDR (Image Data Resource),#endif$_IDR #endif$_IMAGE #if$_METABOLOMIC #if$_METABOLIGHTS EBI-MetaboLights,#endif$_METABOLIGHTS #if$_METAWORKBENCH Metabolomics Workbench,#endif$_METAWORKBENCH #if$_INTACT Intact (Molecular interactions),#endif$_INTACT #endif$_METABOLOMIC #if$_PROTEOMIC #if$_PRIDE EBI-PRIDE,#endif$_PRIDE #if$_PDB PDB (Protein Data Bank archive),#endif$_PDB #if$_CHEBI Chebi (Chemical Entities of Biological Interest),#endif$_CHEBI #endif$_PROTEOMIC #if$_PHENOTYPIC #if$_edal e!DAL-PGP (Plant Genomics & Phenomics Research Data Repository) #endif$_edal #endif$_PHENOTYPIC #if$_OTHEREP and $_OTHEREP will also be used to store data and the data will be processed there as well.#endif$_OTHEREP

Data will be made available for many years#if$_DATAPLANT and potentially indefinitely after the end of the project#endif$_DATAPLANT.

In any case, data submitted to international, discipline-related repositories that use specialized technologies (as detailed above), e.g. ENA or PRIDE, is subject to the local data storage regulations of those repositories.

3.2    What is in place to secure sensitive data throughout the project duration (access and usage rights)?

#if$_DATAPLANT In DataPLANT, data management relies on the Annotated Research Context (ARC). It is password protected, so before any data can be obtained or samples generated, an authentication needs to take place. #endif$_DATAPLANT

In case data is shared only within the consortium, e.g. because it is not yet finished or still under IP checks, the data is hosted internally and a username and password are required (see also our GDPR rules). Once data is made public in the final EU or US repositories, completely anonymous access is normally allowed; this is the case for ENA as well, and both approaches are in line with GDPR requirements.

There will be no restrictions once the data is made public.

4.    Legal obligations and conditions

4.1    What are the legal specifics associated with the handling of research data in your project?

At the moment, we do not anticipate ethical or legal issues with data sharing. In terms of ethics, since this is plant data, there is no need for an ethics committee; however, due diligence for plant resource benefit sharing is considered. #issuewarning You have to check and enter any due diligence here; at the moment it is unclear whether the Nagoya Protocol (see Nagoya Protocol) will also cover sequence information. In any case, if you use material that does not originate from your (partner) country and characterize it physically or biochemically (e.g., metabolites, proteome, RNASeq), this might represent a Nagoya-relevant action unless the material comes from, e.g., the US (not a party) or Ireland (not signed; still contact them), but other laws might apply. #endissuewarning

The only personal data that will potentially be stored are the submitter name and affiliation in the metadata. In addition, personal data will be collected for dissemination and communication activities using specific methods and procedures developed by the $_PROJECT partners to adhere to data protection. #issuewarning You need to inform the persons concerned and preferably obtain WRITTEN consent before storing emails, names, or pseudonyms such as Twitter handles; we are very sorry about these issues, we did not invent them. #endissuewarning

4.2    Do you anticipate any implications or restrictions regarding subsequent publication or accessibility?

Once data is transferred to the $_PROJECT platform#if$_DATAPLANT and ARCs have been generated in DataPLANT#endif$_DATAPLANT, data security will be enforced. This comprises secure storage; passwords and usernames are generally transferred via separate secure media.

4.3    What is in place to consider aspects of use and copyright law as well as ownership issues?

Open licenses, such as Creative Commons (CC), will be used whenever possible.

4.4    Are there any significant research codes or professional standards to be taken into account?

Whenever possible, data will be stored in common and openly defined formats including all the necessary metadata to interpret and analyze data in a biological context. By default, no proprietary formats will be used; however, Microsoft Excel files (according to ISO/IEC 29500-1:2016) might be used as intermediates by the consortium#if$_DATAPLANT and by some ARC components#endif$_DATAPLANT. In addition, text documents might be edited in word processors but will be shared as PDF.

5.    Data exchange and long-term data accessibility

5.1    Which data sets are especially suitable for use in other contexts?

The data will be useful for the $_PROJECT partners, the scientific community working on $_STUDYOBJECT or the general public interested in $_STUDYOBJECT. Hence, the $_PROJECT also strives to collect the data that has been disseminated and potentially advertise it#if$_DATAPLANT, e.g. through the DataPLANT platform or other means,#endif$_DATAPLANT if it is not already included in a publication, which is the most likely form of dissemination.

5.2    Which criteria are used to select research data to make it available for subsequent use by others?

By default, all data sets from the $_PROJECT will be shared with the community and made openly available. This happens, however, only after partners have had the opportunity to check for IP protection (according to agreements and background rights). #if$_INDUSTRY This applies in particular to data pertaining to industry. #endif$_INDUSTRY All partners also strive for IP protection of data sets where applicable; this will be checked and due diligence applied.

Note that in multi-beneficiary projects it is also possible for specific beneficiaries to keep their data closed if relevant provisions are made in the consortium agreement and are in line with the reasons for opting out.

5.3    Are you planning to archive your data in a suitable infrastructure?

#if$_DATAPLANT As the $_PROJECT is closely aligned with DataPLANT, the ARC converter and DataHUB will be used to find the end-point repositories and upload to the repositories automatically. #endif$_DATAPLANT

Data will be made available via the $_PROJECT platform using a user-friendly front end that allows data visualization. Besides this, it will be ensured that data which can be stored in international, discipline-related repositories using specialized technologies are deposited there:

#if$_GENETIC For genetic data: #if$_GENBANK NCBI-GenBank,#endif$_GENBANK #if$_SRA NCBI-SRA,#endif$_SRA #if$_ENA EBI-ENA,#endif$_ENA #if$_ARRAYEXPRESS EBI-ArrayExpress,#endif$_ARRAYEXPRESS #if$_GEO NCBI-GEO,#endif$_GEO #endif$_GENETIC

#if$_TRANSCRIPTOMIC For Transcriptomic data: #if$_SRA NCBI-SRA,#endif$_SRA #if$_GEO NCBI-GEO,#endif$_GEO #if$_ARRAYEXPRESS EBI-ArrayExpress,#endif$_ARRAYEXPRESS #endif$_TRANSCRIPTOMIC

#if$_IMAGE For image data: #if$_BIOIMAGE EBI-BioImage Archive,#endif$_BIOIMAGE #if$_IDR IDR (Image Data Resource),#endif$_IDR #endif$_IMAGE

#if$_METABOLOMIC For metabolomic data: #if$_METABOLIGHTS EBI-MetaboLights,#endif$_METABOLIGHTS #if$_METAWORKBENCH Metabolomics Workbench,#endif$_METAWORKBENCH #if$_INTACT Intact (Molecular interactions),#endif$_INTACT #endif$_METABOLOMIC

#if$_PROTEOMIC For proteomics data: #if$_PRIDE EBI-PRIDE,#endif$_PRIDE #if$_PDB PDB (Protein Data Bank archive),#endif$_PDB #if$_CHEBI Chebi (Chemical Entities of Biological Interest),#endif$_CHEBI #endif$_PROTEOMIC

#if$_PHENOTYPIC For phenotypic data: #if$_edal e!DAL-PGP (Plant Genomics & Phenomics Research Data Repository) #endif$_edal #endif$_PHENOTYPIC

#if$_OTHEREP and $_OTHEREP will also be used to store data and the data will be processed there as well.#endif$_OTHEREP

Submission is free of charge, and it is the goal (at least of ENA) to obtain as much data as possible. Therefore, special arrangements are neither necessary nor useful, and catch-all repositories are not required. #if$_DATAPLANT For DataPLANT, this has been agreed upon. #endif$_DATAPLANT #issuewarning If no data management platform such as DataPLANT is used, you need to find an appropriate repository to store or archive your data after publication. #endissuewarning

5.4    If so, how and where? Are there any retention periods?

There are no restrictions, beyond the aforementioned IP checks, which are in line with e.g. European open data policies.

The $_PARTNERS decide on the preservation of data not submitted to end-point subject-area repositories #if$_DATAPLANT or ARCs in DataPLANT#endif$_DATAPLANT after the project ends. This will be in line with EU and institute policies and data sharing based on EU and international standards.

5.5    When is the research data available for use by third parties?

#if$_early Some raw data is made public as soon as it is collected and processed.#endif$_early #if$_beforepublication Relevant processed datasets are made public when the research findings are published.#endif$_beforepublication #if$_endofproject At the end of the project, all data without embargo period will be published.#endif$_endofproject #if$_embargo Data, which is subject to an embargo period, is not publicly accessible until the end of embargo period.#endif$_embargo #if$_request Data is made available upon request, allowing controlled sharing while ensuring responsible use.#endif$_request #if$_ipissue IP issues will be checked before publication. #endif$_ipissue All consortium partners will be encouraged to make data available before publication, openly and/or under pre-publication agreements #if$_GENOMIC such as those started in Fort Lauderdale and set forth by the Toronto International Data Release Workshop. #endif$_GENOMIC This will be implemented as soon as IP-related checks are complete.

6.    Responsibilities and resources

6.1    Who is responsible for adequate handling of the research data (description of roles and responsibilities within the project)?

The responsible person will be $_DATAOFFICER as data officer. The data responsible(s) (the data officer#if$_PARTNERS or $_PARTNERS #endif$_PARTNERS) decide on the preservation of data not submitted to end-point subject-area repositories #if$_DATAPLANT or ARCs in DataPLANT #endif$_DATAPLANT after the project end. This will be in line with EU and institute policies and data sharing based on EU and international standards.

6.2    Which resources (costs; time or other) are required to implement adequate handling of research data within the project?

The costs comprise data curation, #if$_DATAPLANT ARC consistency checks, #endif$_DATAPLANT and maintenance on the $_PROJECT's side.

Additionally, last-level costs for storage are incurred by the end-point repositories (e.g. ENA); they are not charged against the $_PROJECT or its members but covered by the operating budgets of these repositories.

A large part of the cost is covered by the $_PROJECT #if$_DATAPLANT and the structures, tools and knowledge laid down in the DataPLANT consortium. #endif$_DATAPLANT

6.3    Who is responsible for curating the data once the project has ended?

As applicable, $_DATAOFFICER, who is responsible for ongoing data maintenance, will also take care of it after the end of the $_PROJECT. #if$_DATAPLANT DataPLANT, as an external data archive, may provide such services in some cases. #endif$_DATAPLANT

7     Annexes

7.1     Abbreviations

#if$_DATAPLANT

ARC Annotated Research Context

#endif$_DATAPLANT

CC Creative Commons

CC CEL Creative Commons Rights Expression Language

DDBJ DNA Data Bank of Japan

DMP Data Management Plan

DoA Description of Action

DOI Digital Object Identifier

EBI European Bioinformatics Institute

ENA European Nucleotide Archive

EU European Union

FAIR Findable Accessible Interoperable Reusable

GDPR General data protection regulation (of the EU)

IP Intellectual Property

ISO International Organization for Standardization

MIAMET Minimal Information about Metabolite experiment

MIAPPE Minimal Information about Plant Phenotyping Experiment

MinSEQe Minimum Information about a high-throughput Sequencing Experiment

NCBI National Center for Biotechnology Information

NFDI National Research Data Infrastructure (of Germany)

NGS Next Generation Sequencing

RDM Research Data Management

RNASeq RNA Sequencing

SOP Standard Operating Procedures

SRA Short Read Archive

#if$_DATAPLANT

SWATE Swate Workflow Annotation Tool for Excel

#endif$_DATAPLANT

ONP Oxford Nanopore

qRTPCR quantitative real time polymerase chain reaction

WP Work Package

Practical Data Management Guide of the $_PROJECT

This practical guide to data management in the $_PROJECT should be considered a minimum description, leaving flexibility to include additional actions specific to a domain or required by national or local legislation.#if$_EU The $_PROJECT will follow the EU FAIR principles. #endif$_EU


The practical guide to data management in the $_PROJECT aims at providing a complete walkthrough for the researcher. The contents are customized based on the user input in the Data Management Plan Generator (DMPG). The practices in this guide are customized to fit the related legal, ethical, standardization and funding body requirements. The suitable practices cover all steps of the data management life-cycle:


  1. Data acquisition:

    1. Data generation

Data should be generated by devices whose output is compatible with open formats. The $_STUDYOBJECT should be compliant with biodiversity protocols. The protocols used to collect $_PHENOTYPIC, $_GENETIC, $_GENOMIC, $_METABOLOMIC, $_RNASEQ data about $_STUDYOBJECT will be stored#if$_DATAPLANT in the assays folder of the ARC repositories#endif$_DATAPLANT#if!$_DATAPLANT in a FAIR data storage#endif!$_DATAPLANT.
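
A minimal sketch of how such an assay, its raw data and the corresponding protocol files could be laid out on disk is shown below; the folder names follow the general ARC layout (assays/ with dataset/ and protocols/ subfolders), while the ARC root, assay name and protocol file are hypothetical examples.

```python
# Minimal sketch: create an ARC-like folder structure for one assay, with raw data
# under dataset/ and the collection protocols under protocols/. Names are
# hypothetical examples; consult the ARC specification for the authoritative layout.
from pathlib import Path

arc_root = Path("my-project-arc")            # hypothetical ARC root folder
assay = arc_root / "assays" / "rnaseq-drought-2024"

for folder in [arc_root / "studies", assay / "dataset", assay / "protocols",
               arc_root / "workflows", arc_root / "runs"]:
    folder.mkdir(parents=True, exist_ok=True)

# protocols used for data generation are stored next to the raw data they describe
(assay / "protocols" / "rna_extraction.md").write_text("# RNA extraction protocol\n")
```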

    2. Data collection

The data collection process is conducted by experimental scientists and stewarded by $_DATAOFFICER.#if$_DATAPLANT An electronic lab notebook will be used to ensure that enough metadata is recorded and to guarantee that the data can be further reused.#endif$_DATAPLANT

    3. Data Organization

The data organization process is conducted by $_DATAOFFICER. The detailed organization method and procedure are reported to the PIs. #if$_DATAPLANT The data organization will profit from the knowledge base and database of DataPLANT; Elasticsearch will be used to find better ways to organize the data. #endif$_DATAPLANT



  2. Annotation

    1. Workflow documentation

The data collection process is conducted by experimental scientists and stewarded by $_DATAOFFICER.#if$_DATAPLANT An electronic lab notebook is used to ensure that enough metadata is recorded and to guarantee that the data can be further reused. The workflow can be retrieved from the electronic notebook using the toolkits provided by DataPLANT, such as SWATE and the ARC Commander. #endif$_DATAPLANT

    2. Metadata completion

In case some metadata is still missing, it will be completed based on the documentation from the experimental scientists and the data officer. #if$_DATAPLANT Raw data identifiers and parsers provided by DataPLANT will be used to extract metadata directly from the raw data files. The metadata collected from the raw data files can also be used to validate the previously collected metadata in case there are any mistakes. #endif$_DATAPLANT We foresee using #if$_RNASEQ|$_GENOMIC e.g.#if$_MINSEQE MinSEQe for sequencing data and#endif$_MINSEQE #endif$_RNASEQ|$_GENOMIC Metabolights-compatible forms for metabolites as well as MIAPPE for phenotyping-like data. The latter will allow the integration of data across projects and safeguards the reuse of established and tested protocols. Additionally, we will use ontology terms to enrich the data sets, relying on free and open ontologies. Further ontology terms might be created and canonized during the $_PROJECT.
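
As an example of such a parser, instrument, run and lane information encoded in Illumina-style FASTQ read headers could be recovered as sketched below; the file name is hypothetical, and real parsers, such as those provided by DataPLANT, will be considerably more complete.

```python
# Minimal sketch: recover instrument, run and lane metadata from the first read
# header of an Illumina-style FASTQ file. The file name is hypothetical and the
# parsing assumes the common '@instrument:run:flowcell:lane:...' header layout.
import gzip

def fastq_header_metadata(path: str) -> dict:
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        header = handle.readline().strip()
    fields = header.lstrip("@").split(" ")[0].split(":")
    return {
        "instrument": fields[0],
        "run_number": fields[1] if len(fields) > 1 else None,
        "flowcell_id": fields[2] if len(fields) > 2 else None,
        "lane": fields[3] if len(fields) > 3 else None,
    }

print(fastq_header_metadata("reads/sample_01.fastq.gz"))  # hypothetical file
```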


  3. Maintenance:

    1. Data storage

Raw data collected in the previous steps are stored immediately#if$_DATAPLANT using the infrastructure of DataPLANT#endif$_DATAPLANT#if!$_DATAPLANT in a secure infrastructure. The ARC (Annotated Research Context) is used as a container to store the raw data as well as the metadata and workflows.#endif!$_DATAPLANT

    2. Data curation

#if$_DATAPLANT Data stored in the ARC is curated regularly, whenever updates or revisions are needed.#endif$_DATAPLANT #if!$_DATAPLANT Data is curated regularly, whenever updates or revisions are needed.#endif!$_DATAPLANT



  4. Publication and sharing

    1. Data publishing

Data will be made available via the $_PROJECT platform using a user-friendly front end that allows data visualization. Besides this, it will be ensured that data which can be stored in international, discipline-related repositories using specialized technologies are deposited there: #if$_GENETIC #if$_GENBANK NCBI-GenBank,#endif$_GENBANK #if$_ENA EBI-ENA,#endif$_ENA #if$_ARRAYEXPRESS EBI-ArrayExpress,#endif$_ARRAYEXPRESS #endif$_GENETIC #if$_TRANSCRIPTOMIC|$_GENETIC #if$_SRA NCBI-SRA,#endif$_SRA #if$_GEO NCBI-GEO,#endif$_GEO #endif$_TRANSCRIPTOMIC|$_GENETIC #if$_TRANSCRIPTOMIC #if$_ARRAYEXPRESS EBI-ArrayExpress,#endif$_ARRAYEXPRESS #endif$_TRANSCRIPTOMIC #if$_IMAGE #if$_BIOIMAGE EBI-BioImage Archive,#endif$_BIOIMAGE #if$_IDR IDR (Image Data Resource),#endif$_IDR #endif$_IMAGE #if$_METABOLOMIC #if$_METABOLIGHTS EBI-MetaboLights,#endif$_METABOLIGHTS #if$_METAWORKBENCH Metabolomics Workbench,#endif$_METAWORKBENCH #if$_INTACT Intact (Molecular interactions),#endif$_INTACT #endif$_METABOLOMIC #if$_PROTEOMIC #if$_PRIDE EBI-PRIDE,#endif$_PRIDE #if$_PDB PDB (Protein Data Bank archive),#endif$_PDB #if$_CHEBI Chebi (Chemical Entities of Biological Interest),#endif$_CHEBI #endif$_PROTEOMIC #if$_PHENOTYPIC #if$_edal e!DAL-PGP (Plant Genomics & Phenomics Research Data Repository) #endif$_edal #endif$_PHENOTYPIC #if$_OTHEREP and $_OTHEREP will also be used to store data and the data will be processed there as well.#endif$_OTHEREP

    2. Data sharing

In case data is shared only within the consortium, e.g. because it is not yet finished or still under IP checks, the data is hosted internally and a username and password are required (see also our GDPR rules). Once data is made public in the final EU or US repositories, completely anonymous access is normally allowed; this is the case for ENA as well, and both approaches are in line with GDPR requirements.

Metadata focus timeline



Study initialization: The metadata of the study is created at the beginning of the project and updated continuously afterwards#if$_DATAPLANT; the input of the DMP generator created during the proposal stage can be reused#endif$_DATAPLANT.

Sample collection: The information used to identify exact samples is initiated before the experiments and updated at the assay creation stage. #if$_DATAPLANT The sample SWATE template will be used to document the sample metadata. The part of the sample metadata that can be retrieved from the raw data will be updated afterwards using the ARC parsers. #endif$_DATAPLANT

Assay creation: Assay metadata must be collected as a daily routine during the experimental phase. #if$_DATAPLANT Electronic lab notebooks will be used to guarantee the applicability and correctness of the notebook content. #endif$_DATAPLANT

Computational analysis: Workflow annotation will be conducted during the computational analysis phase. #if$_DATAPLANT The workflow metadata will be stored in the assay folder of the ARC. #endif$_DATAPLANT

Results sharing: The metadata of results is collected after all modifications and should not be changed after publication. #if$_DATAPLANT The collection of result metadata before publication and the conversion from the ARC to the repositories will be handled by the ARC2REPO converter with minimal effort. #endif$_DATAPLANT


Preferred formats for raw data

#if$_GENOMIC  

extension_ident  Format Name
.h5  Hierarchical Data Format
.bam  compressed binary version of a SAM file
.cram  compressed columnar file format for storing biological sequences aligned to a reference sequence
.fa  fasta
.faa  fasta
.fas  fasta
.fasta  fasta
.fastq  fastq
.ffn  fasta
.fna  fasta
.fq  fastq
.frn  fasta
.sff  sff-trim

#endif$_GENOMIC


#if$_RNASEQ  

.bam  compressed binary version of a SAM file
.cram  compressed columnar file format for storing biological sequences aligned to a reference sequence
.fa  fasta
.faa  fasta
.fas  fasta
.fast5  HDF5
.fasta  fasta
.fastq  fastq
.ffn  fasta
.fna  fasta
.fq  fastq
.frn  fasta
.sff  sff-trim
bas.h5  HDF5
.h5  Hierarchical Data Format

#endif$_RNASEQ


 #if$_METABOLOMIC  

.cdf  netCDF (AIA/ANDI) interchange data format
.cmp  netCDF compare file
.abf  Axon Binary File
.d  Agilent
.dat  Chromtech, Finnigan, VG
.idb  MASSLAB binary file
.jpf  Mass Center Main Mass Spectrometry Data (JEOL USA, Inc.)
.lcd  Shimadzu LC Solution / Labsolutions Data File
.mgf  Mascot Generic File
.raw  Thermo Xcalibur, Micromass (Waters), PerkinElmer, Waters
.scan  a spectrum or a Total Ion Chromatogram (TIC)
.wiff  ABI/Sciex
.xps  Thermo Fisher Scientific K-Alpha+ spectrometer file
cdf.cmp  netCDF compare file

#endif$_METABOLOMIC


 #if$_PROTEOMIC  

.baf  Bruker
.d  Agilent
.dat  Chromtech, Finnigan, VG
.fid  Bruker
.ita  ION-TOF
.itm  ION-TOF
.mgf  Mascot Generic File
.ms  Finnigan (Thermo)
.ms2  Sequest MS/MS peak list
.pkl  Micromass peak list
.qgd  Shimadzu
.raw  Thermo Xcalibur, Micromass (Waters), PerkinElmer, Waters
.raw  Physical Electronics/ULVAC-PHI
.sms  Bruker/Varian
.spc  Shimadzu
.splib  spectral library file
.t2d  ABI/Sciex
.tdc  Physical Electronics/ULVAC-PHI
.wiff  ABI/Sciex
.xms  Bruker/Varian
.yep  Bruker
.dta  Sequest MS/MS peak list
.msp
.nist


#endif$_PROTEOMIC



Datenmanagementplan (Beta test)

Projektname: $_PROJECT

Forschungsförderer: Bundesministerium für Bildung und Forschung

Förderprogramm: $_FUNDINGPROGRAMME

FKZ: $_DMPVERSION

Projektkoordinator: $_USERNAME

Kontaktperson Datenmanagement: $_DATAOFFICER

Kontakt: $_EMAIL

Projektbeschreibung:

Das $_PROJECT hat folgendes Ziel: $_PROJECTAIM. Daher sind Datenerhebung#if!$_VVISUALIZATION und Integration #endif!$_VVISUALIZATION#if$_VVISUALIZATION, Integration und Visualisierung #endif$_VVISUALIZATION#if$_DATAPLANT unter Verwendung der DataPLANT ARC-Struktur absolut notwendig,#endif$_DATAPLANT#if!$_DATAPLANT durch einen standardisierten Datenmanagementprozess absolut notwendig,#endif!$_DATAPLANT da die Daten nicht nur zum Verständnis von Prinzipien verwendet werden, sondern auch über die Herkunft der analysierten Daten informiert werden muss. Stakeholder müssen ebenfalls über die Herkunft der Daten informiert werden. Es ist daher notwendig sicherzustellen, dass die Daten gut generiert und auch gut mit Metadaten unter Verwendung offener Standards annotiert werden, wie im nächsten Abschnitt dargelegt.

Das $_PROJECT wird die folgenden Arten von Rohdaten sammeln und/oder generieren: $_PHENOTYPIC, $_GENETIC, $_IMAGE, $_RNASEQ, $_GENOMIC, $_METABOLOMIC, $_PROTEOMIC, $_TARGETED, $_MODELS, $_CODE, $_EXCEL, $_CLONED-DNA Daten, die sich auf $_STUDYOBJECT beziehen. Zusätzlich werden die Rohdaten auch durch analytische Pipelines verarbeitet und modifiziert, was zu unterschiedlichen Ergebnissen führen kann oder ad-hoc-Datenanalyse-Teile umfassen kann. #if$_DATAPLANT Diese Pipelines werden im DataPLANT ARC verfolgt.#endif$_DATAPLANT Daher wird darauf geachtet, diese Ressourcen (einschließlich der analytischen Pipelines) zu dokumentieren und zu archivieren#if$_DATAPLANT unter Rückgriff auf die Expertise im DataPLANT-Konsortium#endif$_DATAPLANT.

Erstellungsdatum: $_CREATIONDATE

Änderungsdatum: $_MODIFICATIONDATE

Zu beachtende Vorgaben:

#if$_EU Das $_PROJECT ist Teil der Open Data Initiative (ODI) der EU. #endif$_EU Um optimal von offenen Daten zu profitieren, ist es notwendig, die Daten nicht nur zu speichern, sondern sie auch auffindbar, zugänglich, interoperabel und wiederverwendbar (FAIR) zu machen. #if$_PROTECT Wir unterstützen offene und FAIR-Daten, berücksichtigen jedoch auch die Notwendigkeit, einzelne Datensätze zu schützen. #endif$_PROTECT

#if$_DATAPLANT Durch die Implementierung von DataPLANT können Forscher sicherstellen, dass alle relevanten Richtlinien und Anforderungen im Zusammenhang mit dem Datenmanagement eingehalten werden, was zu einer höheren Qualität und Zuverlässigkeit der Forschungsdaten führt. #endif$_DATAPLANT

Datenerhebung

Öffentliche Daten werden wie im vorherigen Absatz beschrieben extrahiert. Für das $_PROJECT werden spezifische Datensätze von den Konsortialpartnern generiert.

Daten unterschiedlicher Typen oder aus verschiedenen Bereichen werden mit einzigartigen Ansätzen generiert. Zum Beispiel:

#if$_PREVIOUSPROJECTS

Daten aus früheren Projekten wie $_PREVIOUSPROJECTS werden berücksichtigt.

#endif$_PREVIOUSPROJECTS

Wir erwarten die Erzeugung von $_RAWDATA GB Rohdaten und bis zu $_DERIVEDDATA GB verarbeiteten Daten.

Datenspeicherung:

#if$_DATAPLANT In DataPLANT basiert die Datenspeicherung auf dem Annotated Research Context (ARC). Dieser ist passwortgeschützt, daher muss vor dem Erhalt von Daten oder der Generierung von Proben eine Authentifizierung erfolgen. #endif$_DATAPLANT

Online-Plattformen werden durch Schwachstellen-Scans, Zwei-Faktor-Authentifizierung und tägliche automatische Backups geschützt, die eine sofortige Wiederherstellung ermöglichen. Alle Partner, die vertrauliche Projektdaten halten, nutzen sichere Plattformen mit automatischen Backups und sicheren externen Kopien. #if$_DATAPLANT Sobald DataHUB-Repositorien und ARCs in DataPLANT generiert wurden, wird die Datensicherheit durchgesetzt. Dies umfasst sichere Speicherung; Passwörter und Benutzernamen werden generell über separate sichere Medien übertragen. #endif$_DATAPLANT

Das $_PROJECT trägt die Kosten für die Datenkuratierung, #if$_DATAPLANT ARC-Konsistenzprüfungen, #endif$_DATAPLANT und die Datenwartung/-sicherheit vor der Übertragung an öffentliche Repositorien. Nachfolgende Kosten werden dann von den Betreibern dieser Repositorien getragen.

Zusätzlich werden Kosten für die Speicherung nach der Veröffentlichung von den Endpunkt-Repositorien (z.B. ENA) getragen, jedoch nicht vom $_PROJECT oder seinen Mitgliedern, sondern durch das Betriebsbudget dieser Repositorien.

Es wird sichergestellt, dass Daten, die in internationalen, disziplinspezifischen Repositories gespeichert werden können, die spezialisierte Technologien nutzen:

#if$_GENETIC Für genetische Daten: #if$_GENBANK NCBI-GenBank,#endif$_GENBANK #if$_SRA NCBI-SRA,#endif$_SRA #if$_ENA EBI-ENA,#endif$_ENA #if$_ARRAYEXPRESS EBI-ArrayExpress,#endif$_ARRAYEXPRESS #if$_GEO NCBI-GEO,#endif$_GEO #endif$_GENETIC

#if$_TRANSCRIPTOMIC Für Transkriptomdaten: #if$_SRA NCBI-SRA,#endif$_SRA #if$_GEO NCBI-GEO,#endif$_GEO #if$_ARRAYEXPRESS EBI-ArrayExpress,#endif$_ARRAYEXPRESS #endif$_TRANSCRIPTOMIC

#if$_IMAGE Für Bilddaten: #if$_BIOIMAGE EBI-BioImage Archive,#endif$_BIOIMAGE #if$_IDR IDR (Image Data Resource),#endif$_IDR #endif$_IMAGE

#if$_METABOLOMIC Für Metabolomdaten: #if$_METABOLIGHTS EBI-MetaboLights,#endif$_METABOLIGHTS #if$_METAWORKBENCH Metabolomics Workbench,#endif$_METAWORKBENCH #if$_INTACT Intact (Molecular interactions),#endif$_INTACT #endif$_METABOLOMIC

#if$_PROTEOMIC Für Proteomikdaten: #if$_PRIDE EBI-PRIDE,#endif$_PRIDE #if$_PDB PDB (Protein Data Bank archive),#endif$_PDB #if$_CHEBI Chebi (Chemical Entities of Biological Interest),#endif$_CHEBI #endif$_PROTEOMIC

#if$_PHENOTYPIC Für phänotypische Daten: #if$_edal e!DAL-PGP (Plant Genomics & Phenomics Research Data Repository) #endif$_edal #endif$_PHENOTYPIC

#if$_OTHEREP und $_OTHEREP werden auch verwendet, um Daten zu speichern und die Daten werden dort ebenfalls verarbeitet.#endif$_OTHEREP

Die Dateibenennung erfolgt nach folgendem Standard:

Datenvariablen werden mit Standardnamen versehen. Zum Beispiel werden Gene, Proteine und Metaboliten gemäß anerkannter Nomenklatur und Konventionen benannt. Diese werden nach Möglichkeit auch mit funktionalen Ontologien verknüpft. Datensätze werden ebenfalls sinnvoll benannt, um die Lesbarkeit durch Menschen zu gewährleisten. Pflanzennamen umfassen traditionelle Namen, Binomialnamen und alle Stamm-/Kultivar-/Unterart-/Sortenbezeichner.

Datendokumentation

Wir verwenden die Investigation, Study, Assay (ISA) Spezifikation zur Metadaten-Erstellung. #if$_RNASEQ|$_GENOMIC Für spezifische Daten (z.B. RNASeq oder genomische Daten) verwenden wir Metadatentemplates der Endpunkt-Repositorien. #if$_MINSEQE The Minimum Information About a Next-generation Sequencing Experiment (MinSEQe) wird ebenfalls verwendet. #endif$_MINSEQE #endif$_RNASEQ|$_GENOMIC Die folgenden Metadaten-/Mindestinformationsstandards werden zur Sammlung von Metadaten verwendet: #if$_GENOMIC|$_GENETIC #if$_MIXS MIxS (Minimum Information about any (X) Sequence),#endif$_MIXS #if$_MIGSEU MigsEu (Minimum Information about a Genome Sequence: Eucaryote),#endif$_MIGSEU #if$_MIGSORG MigsOrg (Minimum Information about a Genome Sequence: Organelle),#endif$_MIGSORG #if$_MIMS MIMS (Minimum Information about Metagenome or Environmental),#endif$_MIMS #if$_MIMARKSSPECIMEN MIMARKSSpecimen (Minimal Information about a Marker Specimen: Specimen),#endif$_MIMARKSSPECIMEN #if$_MIMARKSSURVEY MIMARKSSurvey (Minimal Information about a Marker Specimen: Survey),#endif$_MIMARKSSURVEY #if$_MISAG MISAG (Minimum Information about a Single Amplified Genome),#endif$_MISAG #if$_MIMAG MIMAG (Minimum Information about Metagenome-Assembled Genome),#endif$_MIMAG #endif$_GENOMIC|$_GENETIC #if$_TRANSCRIPTOMIC #if$_MINSEQE MINSEQE (Minimum Information about a high-throughput SEQuencing Experiment),#endif$_MINSEQE #endif$_TRANSCRIPTOMIC #if$_TRANSCRIPTOMIC #if$_MIAME MIAME (Minimum Information About a Microarray Experiment),#endif$_MIAME #endif$_TRANSCRIPTOMIC #if$_IMAGE #if$_REMBI REMBI (Recommended Metadata for Biological Images),#endif$_REMBI #endif$_IMAGE #if$_PROTEOMIC #if$_MIAPE MIAPE (Minimum Information About a Proteomics Experiment),#endif$_MIAPE #if$_MIMIX MIMix (Minimum Information about any (X) Sequence),#endif$_MIMIX #endif$_PROTEOMIC #if$_METABOLOMIC #if$_METABOLIGHTS Metabolights-Einreichungskonforme Standards werden für metabolomische Daten verwendet, wo dies von den Konsortialpartnern akzeptiert wird.#issuewarning Einige Metabolomik-Partner betrachten Metabolights nicht als akzeptierten Standard.#endissuewarning #endif$_METABOLIGHTS #endif$_METABOLOMIC Als Teil der Pflanzenforschungsgemeinschaft verwenden wir #if$_MIAPPE MIAPPE für Phänotypisierungsdaten im weitesten Sinne, werden aber auch auf #endif$_MIAPPE spezifische SOPs für zusätzliche Annotationen #if$_DATAPLANT zurückgreifen, die fortgeschrittene DataPLANT-Annotationen und Ontologien berücksichtigen. #endif$_DATAPLANT

In dem Fall, dass einige Metadaten noch fehlen, werden diese von den experimentellen Wissenschaftlern und dem Datenbeauftragten dokumentiert. #if$_DATAPLANT Rohdaten-Identifier und Parser, die von DataPLANT bereitgestellt werden, um Metadaten direkt aus der Rohdatei zu extrahieren. Die aus der Rohdatei gesammelten Metadaten können auch verwendet werden, um die zuvor gesammelten Metadaten zu validieren, falls Fehler auftreten. #endif$_DATAPLANT Wir sehen vor, #if$_RNASEQ|$_GENOMIC z.B.#if$_MINSEQE MinSEQe für Sequenzierungsdaten zu verwenden und#endif$_MINSEQE #endif$_RNASEQ|$_GENOMIC Metabolights-kompatible Formulare für Metaboliten sowie MIAPPE für phänotypische Daten. Letzteres ermöglicht die Integration von Daten über Projekte hinweg und stellt sicher, dass etablierte und getestete Protokolle wiederverwendet werden. Darüber hinaus werden wir Ontologiebegriffe verwenden, um die Datensätze mit freien und offenen Ontologien anzureichern. Zusätzlich könnten zusätzliche Ontologiebegriffe erstellt und während des $_PROJECT kanonisiert werden.

Legitimität

Im Moment erwarten wir keine ethischen oder rechtlichen Probleme beim Datenaustausch. In Bezug auf Ethik, da es sich um Pflanzendaten handelt, ist kein Ethikkomitee erforderlich, jedoch wird Sorgfalt bei der Aufteilung der Vorteile von Pflanzenressourcen berücksichtigt. #issuewarning Sie müssen hier überprüfen und jegliche Sorgfaltspflicht hier eintragen. Im Moment warten wir, ob Nagoya (🡺siehe Nagoya-Protokoll) auch Teil der Sequenzinformationen wird. In jedem Fall, wenn Sie Material verwenden, das nicht aus Ihrem (Partner-)Land stammt und dieses physikalisch charakterisieren, z.B. Metaboliten, Proteom, biochemisch RNASeq usw., könnte dies eine Nagoya-relevante Aktion darstellen, es sei denn, es stammt z.B. aus den USA (kein Partner), Irland (nicht unterzeichnet, trotzdem kontaktieren) usw., aber andere Gesetze könnten gelten…. #endissuewarning

Die einzigen personenbezogenen Daten, die möglicherweise gespeichert werden, sind der Name und die Zugehörigkeit des Einreichers in den Metadaten der Daten. Darüber hinaus werden personenbezogene Daten für Verbreitungs- und Kommunikationsaktivitäten gesammelt, wobei spezifische Methoden und Verfahren verwendet werden, die von den $_PROJECT-Partnern entwickelt wurden, um den Datenschutz einzuhalten. #issuewarning Sie müssen informieren und besser eine SCHRIFTLICHE Zustimmung einholen, dass Sie E-Mails und Namen oder sogar Pseudonyme wie Twitter-Handles speichern, wir entschuldigen uns sehr für diese Probleme, die wir nicht erfunden haben. #endissuewarning

Data Sharing

Falls Daten nur innerhalb des Konsortiums geteilt werden, wenn die Daten noch nicht fertig sind oder sich in der IP-Prüfung befinden, werden die Daten intern gehostet und der Benutzername und das Passwort werden benötigt (siehe auch unsere GDPR-Regeln). Wenn Daten unter finalen EU- oder US-Repositorys öffentlich gemacht werden, ist normalerweise ein vollständig anonymer Zugang erlaubt. Dies ist auch bei ENA der Fall und beide entsprechen den GDPR-Anforderungen.

Es wird keine Einschränkungen geben, sobald die Daten öffentlich gemacht werden. #if$_early Einige Rohdaten werden sofort nach ihrer Erfassung und Verarbeitung öffentlich gemacht.#endif$_early #if$_beforepublication Relevante verarbeitete Datensätze werden öffentlich gemacht, wenn die Forschungsergebnisse veröffentlicht werden.#endif$_beforepublication #if$_endofproject Am Ende des Projekts werden alle Daten ohne Sperrfrist veröffentlicht.#endif$_endofproject #if$_embargo Daten, die einer Sperrfrist unterliegen, sind bis zum Ende der Sperrfrist nicht öffentlich zugänglich.#endif$_embargo #if$_request Daten werden auf Anfrage verfügbar gemacht, was eine kontrollierte Weitergabe ermöglicht und gleichzeitig eine verantwortungsvolle Nutzung sicherstellt.#endif$_request #if$_ipissue IP-Probleme werden vor der Veröffentlichung überprüft. #endif$_ipissue Alle Konsortialpartner werden ermutigt, Daten vor der Veröffentlichung zugänglich zu machen, offen und/oder unter Vorveröffentlichungsvereinbarungen #if$_GENOMIC wie die in Fort Lauderdale gestarteten und durch den Toronto International Data Release Workshop festgelegten Vereinbarungen. #endif$_GENOMIC Dies wird umgesetzt, sobald die IP-bezogenen Überprüfungen abgeschlossen sind.

Die Daten werden zunächst den $_PROJECT Partnern zugutekommen, aber auch ausgewählten Stakeholdern, die eng in das Projekt eingebunden sind, und dann der wissenschaftlichen Gemeinschaft, die an $_STUDYOBJECT arbeitet. $_DATAUTILITY Darüber hinaus können auch die allgemeine Öffentlichkeit, die an $_STUDYOBJECT interessiert ist, die Daten nach der Veröffentlichung nutzen. Die Daten werden gemäß dem Verbreitungs- und Kommunikationsplan des $_PROJECT verbreitet, #if$_DATAPLANT der sich mit der DataPLANT-Plattform oder anderen Mitteln abstimmt #endif$_DATAPLANT.

Datenerhalt

Wir erwarten, dass wir Rohdaten im Bereich von $_RAWDATA GB an Daten generieren. Die Größe der abgeleiteten Daten wird etwa $_DERIVEDDATA GB betragen.

#if$_DATAPLANT Da das $_PROJECT eng mit DataPLANT abgestimmt ist, werden der ARC-Konverter und DataHUB verwendet, um die Endpunkt-Repositories zu finden und die Daten automatisch in die Repositories hochzuladen. #endif$_DATAPLANT

Die Daten werden über die $_PROJECT-Plattform mit einer benutzerfreundlichen Oberfläche verfügbar gemacht, die eine Datenvisualisierung ermöglicht. Die Endpunkt-Repositories sind: #if$_GENETIC #if$_GENBANK NCBI-GenBank,#endif$_GENBANK #if$_ENA EBI-ENA,#endif$_ENA #if$_ARRAYEXPRESS EBI-ArrayExpress,#endif$_ARRAYEXPRESS #endif$_GENETIC #if$_TRANSCRIPTOMIC|$_GENETIC #if$_SRA NCBI-SRA,#endif$_SRA #if$_GEO NCBI-GEO,#endif$_GEO #endif$_TRANSCRIPTOMIC|$_GENETIC #if$_TRANSCRIPTOMIC #if$_ARRAYEXPRESS EBI-ArrayExpress,#endif$_ARRAYEXPRESS #endif$_TRANSCRIPTOMIC #if$_IMAGE #if$_BIOIMAGE EBI-BioImage Archive,#endif$_BIOIMAGE #if$_IDR IDR (Image Data Resource),#endif$_IDR #endif$_IMAGE #if$_METABOLOMIC #if$_METABOLIGHTS EBI-MetaboLights,#endif$_METABOLIGHTS #if$_METAWORKBENCH Metabolomics Workbench,#endif$_METAWORKBENCH #if$_INTACT Intact (Molecular interactions),#endif$_INTACT #endif$_METABOLOMIC #if$_PROTEOMIC #if$_PRIDE EBI-PRIDE,#endif$_PRIDE #if$_PDB PDB (Protein Data Bank archive),#endif$_PDB #if$_CHEBI Chebi (Chemical Entities of Biological Interest),#endif$_CHEBI #endif$_PROTEOMIC #if$_PHENOTYPIC #if$_edal e!DAL-PGP (Plant Genomics & Phenomics Research Data Repository) #endif$_edal #endif$_PHENOTYPIC #if$_OTHEREP und $_OTHEREP werden auch verwendet, um Daten zu speichern und die Daten werden dort ebenfalls verarbeitet.#endif$_OTHEREP

Die Einreichung ist kostenlos, und es ist das Ziel (zumindest von ENA), so viele Daten wie möglich zu erhalten. Daher sind Absprachen weder notwendig noch sinnvoll. Catch-all-Repositories sind nicht erforderlich. #if$_DATAPLANT Für DataPLANT wurde dies vereinbart. #endif$_DATAPLANT #issuewarning Wenn keine Datenmanagementplattform wie DataPLANT verwendet wird, müssen Sie ein geeignetes Repository finden, um Ihre Daten nach der Veröffentlichung zu speichern oder zu archivieren. #endissuewarning

Data management plan of $_PROJECT for BBSRC

a document template