Annotated Research Context

ARCs are FAIR Digital Objects (FDOs)

As such, they come with metadata, code for operations, and a persistent identifier. ARCs comply with the FAIR Principles.

Real-world ARC

An ARC is intended to capture complete (meta)data, including raw and processed data, workflow descriptions, and essential external files. ARCs cover scenarios ranging from single experimental setups to complex experimental designs. The full ARC specifications are provided here.

ARCs are RO-Crate certified

ARCs are a profile implementation of RO-Crates for basic plant research.

The ARC is packaged with its digital object identifier and its RO Metadata File (ro-crate-metadata.json).

This RO metadata file is generated automatically by a converter that reads the information stored in the ARC and exports it.
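To make the role of the RO metadata file more concrete, the following minimal sketch reads an RO-Crate metadata document with Python's standard json module and looks up the root dataset through the metadata descriptor. The crate content is a made-up, heavily shortened example (the name value and the function name find_root_dataset are hypothetical). It is not the output of the ARC converter.

```python
import json

# Hypothetical, heavily shortened ro-crate-metadata.json content.
# A real ARC crate is produced by the converter and holds many more entities.
EXAMPLE_CRATE = """
{
  "@context": "https://w3id.org/ro/crate/1.1/context",
  "@graph": [
    {
      "@id": "ro-crate-metadata.json",
      "@type": "CreativeWork",
      "about": {"@id": "./"}
    },
    {
      "@id": "./",
      "@type": "Dataset",
      "name": "Example ARC (hypothetical)"
    }
  ]
}
"""


def find_root_dataset(crate_text):
    """Return the root dataset entity of an RO-Crate metadata document."""
    graph = json.loads(crate_text)["@graph"]
    by_id = {entity["@id"]: entity for entity in graph}
    # The metadata descriptor points to the root dataset via "about".
    descriptor = by_id["ro-crate-metadata.json"]
    return by_id[descriptor["about"]["@id"]]


root = find_root_dataset(EXAMPLE_CRATE)
print(root["name"])
```

Looking up the root through the descriptor, rather than assuming a fixed position in the graph, keeps the sketch independent of the order in which entities are written.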

Tools for creating ISA-model-compliant ARCs

Although ARCs can be generated completely manually, we encourage you to use our convenient tools that assist you in the process.
The ARC Commander offers machine-aided creation and completion of the ARC folder and file structure.

SWATE and our templates assist you in the process of creating your assay files.

ARCs offer a single-point-of-entry logic for data management and computation

ISA investigation file

A mandatory central registry for studies, assays, persons, … is saved in XLSX format and follows the ISA model specification (v1.0).

Using the ARC Commander allows you to automatically fill and update the investigation file.

Every worksheet needs to contain one table object storing the metadata. Comments or additional information can be stored alongside the table object in a worksheet.

ISA assay file

Assay metadata must be annotated in the file isa.assay.xlsx at the root of the assay's subdirectory. This workbook must describe a single assay, organized in one or more worksheets. A worksheet named “assay” must store the STUDY ASSAYS section of the ISA model; this section is then not required in isa.investigation.xlsx. Additional worksheets must each contain a table object organized on a per-row basis, with the first row holding the column headers. Table objects must contain at least one source. Source-sample relations must follow a unique path in a directed acyclic graph. Sources must be indicated by the column header Source Name, and samples by the column header Sample Name.
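Some of the table rules above can be checked mechanically. The sketch below is one possible illustration, not an official validator: it assumes the table has already been read into a header list and a list of rows (reading the XLSX file is left out), and the function names check_assay_table and has_cycle are hypothetical. It tests for the Source Name column, for at least one source, and for cycles in the source-sample relations. It does not verify the "unique path" requirement in full.

```python
from collections import defaultdict


def has_cycle(edges):
    """Depth-first search for a cycle in a directed graph {node: {successors}}."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    state = defaultdict(int)

    def visit(node):
        state[node] = GREY
        for nxt in edges.get(node, ()):
            if state[nxt] == GREY:
                return True  # back edge: nxt is already on the current path
            if state[nxt] == WHITE and visit(nxt):
                return True
        state[node] = BLACK
        return False

    return any(state[node] == WHITE and visit(node) for node in list(edges))


def check_assay_table(header, rows):
    """Return a list of problems found in one table (an empty list means OK)."""
    if "Source Name" not in header:
        return ["missing 'Source Name' column"]
    problems = []
    src = header.index("Source Name")
    if not any(row[src] for row in rows):
        problems.append("table contains no source")
    if "Sample Name" in header:
        smp = header.index("Sample Name")
        edges = defaultdict(set)
        for row in rows:
            if row[src] and row[smp]:
                edges[row[src]].add(row[smp])
        if has_cycle(edges):
            problems.append("source-sample relations contain a cycle")
    return problems


# Hypothetical example table: a plant yields a leaf, the leaf yields an extract.
header = ["Source Name", "Sample Name"]
rows = [["plant1", "leaf1"], ["leaf1", "extract1"]]
print(check_assay_table(header, rows))
```

A sample that appears as the source of a later row is treated as the same node, which is how a chain of processing steps becomes a graph in this sketch.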

Assays

Assays correspond to outcomes of experimental assays or analytical measurements and are treated as immutable data. Each assay is a collection of files stored in a single directory, including a mandatory metadata file in ISA-XLSX format.

Assay data files and protocols must each be placed in their own subdirectory.
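The layout of a single assay directory can be checked with a few lines of Python. This is a minimal sketch under stated assumptions: the subdirectory names dataset and protocols are assumed here, since the text above only requires a separate subdirectory for each, and the function name missing_assay_items is hypothetical.

```python
from pathlib import Path

# The metadata file name is taken from the text above.
REQUIRED_FILES = ["isa.assay.xlsx"]
# Assumed subdirectory names; the text only asks for one subdirectory each.
REQUIRED_DIRS = ["dataset", "protocols"]


def missing_assay_items(assay_dir):
    """Return the expected files and subdirectories that are absent."""
    assay_dir = Path(assay_dir)
    missing = [name for name in REQUIRED_FILES if not (assay_dir / name).is_file()]
    missing += [name + "/" for name in REQUIRED_DIRS if not (assay_dir / name).is_dir()]
    return missing
```

Calling missing_assay_items on an assay directory returns an empty list when everything expected is present, which makes the function easy to use in a loop over all assays.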

We advise you to use SWATE for completing the study and assay metadata files, as it assists you in following the annotation principles and comes along with a broad range of prepared templates.

Workflows

Workflows in ARCs represent processing steps used in computational analyses and other data transformations of assays to generate run results. Typical examples include data cleaning and preprocessing, computational analysis, or visualization.

All files belonging to a specific workflow need to be stored in a single sub-directory. This also includes a mandatory per-workflow executable CWL description (v1.2 or higher), which can contain a tool or workflow description.
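The version requirement can be illustrated with a small check on the text of a CWL document. The sketch below uses only the standard library and a regular expression instead of a YAML parser, so it assumes that cwlVersion appears as a top-level key at the start of a line. The function names cwl_version and meets_minimum are hypothetical.

```python
import re


def cwl_version(cwl_text):
    """Return (major, minor) from the cwlVersion field, or None if absent."""
    match = re.search(
        r"^cwlVersion:\s*['\"]?v(\d+)\.(\d+)", cwl_text, flags=re.MULTILINE
    )
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def meets_minimum(cwl_text, minimum=(1, 2)):
    """True if the document declares CWL v1.2 or higher."""
    version = cwl_version(cwl_text)
    return version is not None and version >= minimum


# Hypothetical, shortened tool description used only for illustration.
example = "cwlVersion: v1.2\nclass: CommandLineTool\nbaseCommand: echo\n"
print(meets_minimum(example))
```

Comparing versions as integer tuples avoids the pitfall of comparing version strings character by character.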

We highly recommend including a reproducible execution environment description, in the form of a Docker container description, for tool descriptions.

Studies

Studies are collections of material and resources used within the investigation. You need to place each study in a unique subdirectory. Material or experimental samples, as well as external data files, can be stored as virtual sample files (containing unique identifiers) in the resources directory. To describe the sample or material creation process, you can store protocols in the designated sub-folder.

For each study, an isa.study.xlsx file following the ISA study model needs to be present to specify the characteristics of all material and resources, such as a certain strain. Resources might include external data (e.g., knowledge files or result files) that need to be included and cannot be referenced due to external limitations. Resources described in a study file can be the input for one or multiple assays.

Runs

Runs in an ARC represent all artefacts that derive from computations on assay and external data. Plots, tables, or similar results specific to a certain run need to be saved in a subdirectory of the top-level runs directory.

A run.cwl (v1.2 or higher) is mandatory for each of these subdirectories to cover workflow description and reproducibility. These files need to be executable without additional payload files or files outside the ARC.

You can specify input parameters for each run with a run.yml parameter file.
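The two file rules for runs lend themselves to a quick check. The following minimal sketch walks the subdirectories of a runs directory and reports whether each contains the mandatory run.cwl and the optional run.yml. The function name check_runs and the report keys are hypothetical, and the check looks only at file presence, not at whether the CWL is executable.

```python
from pathlib import Path


def check_runs(runs_dir):
    """Map each run subdirectory name to the presence of run.cwl and run.yml."""
    report = {}
    for sub in sorted(Path(runs_dir).iterdir()):
        if sub.is_dir():
            report[sub.name] = {
                "has_run_cwl": (sub / "run.cwl").is_file(),  # mandatory
                "has_run_yml": (sub / "run.yml").is_file(),  # optional parameters
            }
    return report
```

A run is incomplete in the sense of the text above whenever has_run_cwl is False, while a missing run.yml only means that no parameter file was provided.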