
CWL Metadata

last updated at 2024-02-05

Metadata plays a crucial role in enhancing the comprehensibility of CWL files. By embedding additional information about the performer and the process within the metadata, researchers can create a more comprehensive and informative description of their workflows.

Annotating a CWL or job file

CWL and job files can be annotated with ontology terms in YAML format. They support the use of namespaces according to the Schema Salad specification. An example of annotation with authorship metadata can be found here. The metadata describing an executed run should be split between the CWL file and the job file, depending on what it describes: metadata about a tool input that is specified in the job file belongs in the job file, while metadata about the tool itself belongs in the CWL file.
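
The namespace mechanism itself is a standard Schema Salad feature: the prefix used for the annotation fields is declared under $namespaces, and the ontology it resolves to is listed under $schemas. The minimal declaration used throughout the examples below looks like this:

# Declares the "arc:" prefix and the ARC ontology it resolves to
$namespaces:
  arc: https://github.com/nfdi4plants/ARC_ontology
$schemas:
  - https://raw.githubusercontent.com/nfdi4plants/ARC_ontology/main/ARC_v2.0.owl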

In the case of a self-contained tool, the corresponding metadata section could look like this and would be located in the CWL file:

arc:has technology type:
  - class: arc:technology type
    arc:annotation value: "Docker Container"
arc:technology platform: ".NET"
arc:performer:
  - class: arc:Person
    arc:first name: "Example"
    arc:last name: "Person"
    arc:email: "example.person@email.de"
    arc:affiliation: "Institution"
    arc:has role:
      - class: arc:role
        arc:term accession: "https://credit.niso.org/contributor-roles/formal-analysis/"
        arc:annotation value: "Formal analysis"
arc:has process sequence:
  - class: arc:process sequence
    arc:name: "script.fsx"
    arc:has input:
      - class: arc:data
        arc:name: "folderIn/input.table"
    arc:has output:
      - class: arc:data
        arc:name: "folderout/output.table"
    arc:has parameter value:
      - class: arc:process parameter value
        arc:has parameter:
          - class: arc:protocol parameter
            arc:has parameter name:
              - class: arc:parameter name
                arc:term accession: "http://purl.obolibrary.org/obo/NCIT_C43582"
                arc:term source REF: "NCIT"
                arc:annotation value: "Data Transformation"
        arc:value:
          - class: arc:ontology annotation
            arc:term accession: "http://purl.obolibrary.org/obo/NCIT_C64911"
            arc:term source REF: "NCIT"
            arc:annotation value: "Addition"
$namespaces:
  arc: https://github.com/nfdi4plants/ARC_ontology
$schemas:
  - https://raw.githubusercontent.com/nfdi4plants/ARC_ontology/main/ARC_v2.0.owl

This metadata section provides information about the technology platform and the person executing the workflow. It also describes the tool's input and output files as well as the operations applied to the data. In this case, everything is encoded in the executed script and there are no variable inputs, so all metadata is written in the CWL file. An example for this can be found here.
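
For orientation, such an annotation block sits at the top level of the CWL document, next to the regular tool description. The following skeleton is only a sketch: the Docker image, base command, and output binding are hypothetical placeholders, not part of the linked example, and the ARC fields are abbreviated.

# Hypothetical self-contained tool with the ARC metadata embedded at the top level
cwlVersion: v1.2
class: CommandLineTool
hints:
  DockerRequirement:
    dockerPull: mcr.microsoft.com/dotnet/sdk   # placeholder image
baseCommand: [dotnet, fsi, script.fsx]          # placeholder command
inputs: []
outputs:
  result:
    type: File
    outputBinding:
      glob: folderout/output.table

# ARC metadata fields (abbreviated; see the full section above)
arc:technology platform: ".NET"
arc:performer:
  - class: arc:Person
    arc:first name: "Example"
    arc:last name: "Person"

$namespaces:
  arc: https://github.com/nfdi4plants/ARC_ontology
$schemas:
  - https://raw.githubusercontent.com/nfdi4plants/ARC_ontology/main/ARC_v2.0.owl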

Frequently, though, tools have input parameters that alter the tool's execution or its input and output files. In this case, the metadata has to be placed in the right location. For a tool with varying inputs and a specifiable output location, the CWL file could look like this:

arc:has technology type:
  - class: arc:technology type
    arc:annotation value: "Docker Container"
arc:technology platform: ".NET"
arc:performer:
  - class: arc:Person
    arc:first name: "Example"
    arc:last name: "Person"
    arc:email: "example.person@email.de"
    arc:affiliation: "Institution"
    arc:has role:
      - class: arc:role
        arc:term accession: "https://credit.niso.org/contributor-roles/formal-analysis/"
        arc:annotation value: "Formal analysis"
arc:has process sequence:
  - class: arc:process sequence
    arc:name: "script.fsx"
    arc:has parameter value:
      - class: arc:process parameter value
        arc:has parameter:
          - class: arc:protocol parameter
            arc:has parameter name:
              - class: arc:parameter name
                arc:term accession: "http://purl.obolibrary.org/obo/NCIT_C43582"
                arc:term source REF: "NCIT"
                arc:annotation value: "Data Transformation"
        arc:value:
          - class: arc:ontology annotation
            arc:term accession: "http://purl.obolibrary.org/obo/NCIT_C64911"
            arc:term source REF: "NCIT"
            arc:annotation value: "Addition"

And this for the job file:

arc:has process sequence:
  - class: arc:process sequence
    arc:has input:
      - class: arc:data
        arc:name: "folderIn/input.table"
    arc:has output:
      - class: arc:data
        arc:name: "folderout/output.table"
$namespaces:
  arc: https://github.com/nfdi4plants/ARC_ontology
$schemas:
  - https://raw.githubusercontent.com/nfdi4plants/ARC_ontology/main/ARC_v2.0.owl

Examples of this can be found here for the CWL file and here for the job file.

An application example including metadata can be found here. It contains a CWL file with the ARC mounted and a fixed script. The CWL file has two mandatory parameters and one optional parameter. There is one job file for the execution without the optional parameter and one job file for the execution with it. The metadata of the two job files differs only in the part concerning the optional parameter.
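
As a rough sketch of that difference (the input names and the optional parameter below are hypothetical and not taken from the linked example), the job file for the run with the optional parameter carries one additional parameter-value annotation, while the rest of the metadata is identical:

# Hypothetical job file for the run WITH the optional parameter
input:
  class: File
  path: folderIn/input.table
outputDir: folderOut          # the optional parameter

arc:has process sequence:
  - class: arc:process sequence
    arc:has input:
      - class: arc:data
        arc:name: "folderIn/input.table"
    arc:has output:
      - class: arc:data
        arc:name: "folderOut/output.table"
    # only present in this job file, describing the optional parameter:
    arc:has parameter value:
      - class: arc:process parameter value
        arc:value: "folderOut"

$namespaces:
  arc: https://github.com/nfdi4plants/ARC_ontology
$schemas:
  - https://raw.githubusercontent.com/nfdi4plants/ARC_ontology/main/ARC_v2.0.owl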
