DNAnexus Documentation

Copyright 2025 DNAnexus

Workflow Metadata

Use workflow metadata to allow the dx build command to build a workflow according to your specifications.


The file dxworkflow.json is a DNAnexus workflow metadata file. If a dxworkflow.json file is detected in the directory provided to dx build, the toolkit attempts to build a workflow on the Platform according to the workflow specification in the JSON file.

The format of the file closely resembles the input of the corresponding /workflow/new API call.

The next section shows a detailed example of the fields used in the file.

Annotated Example

The following lists the contents of an example dxworkflow.json that should be provided in a directory for use with the dx build command.

Note that the comments shown below are not valid JSON syntax; they are included only for reference.

{
 "name": "exome_variant_calling",     # (optional for regular, project-based workflows;
                                      # required for global workflows) Workflow name
 "title": "Exome Variant Calling",    # (optional) Title of a workflow, it is a label used when displaying,
                                      # searching, or listing the workflow in the UI or CLI
 "version": "1.0.0",                  # (optional for regular, project-based workflows; required for global workflows)
                                      # Version of the global workflow
 "summary": "A simple exome pipeline",# (optional) A short description of the workflow
 "outputFolder": "/output",           # (optional) Folder for the workflow's output
 "inputs": [                          # (optional) Workflow level input specification (see API documentation)
   {
    "name": "reads",                  # Name of the workflow-level input
    "class": "array:file",            # Class of the workflow-level input
    "help": "An array of FASTQ gzipped files"
                                      # (optional) help for this workflow-level input
   }
 ],
 "outputs": [                         # (optional) Workflow level output specification (see API documentation)
   {
    "name": "variants",               # Name of the workflow-level output
    "class": "file",                  # Class of the workflow-level output
    "outputSource": {                 # Link to the output of the stage which
       "$dnanexus_link": {            #   provides the output of the workflow
         "stage": "call_variants",
         "outputField": "variants_vcfgz"
       }
     }
   }
 ],
 "stages": [                          # (optional) A list of stages
  {
   "id": "align_reads",               # Unique ID of the first stage
   "name": "BWA MEM",                 # (optional) Display name of the first stage
   "executable": "app-bwa_mem_fastq_read_mapper/2.0.4",
                                      # Name or ID of the app or ID of the applet run in this stage
   "folder": "map_reads_output",      # The output subfolder into which the outputs of this stage should be cloned
   "input": {                         # (optional) Input of the first stage
    "genomeindex_targz": {            # Input field name
     "$dnanexus_link": {              # Link to a reference genome file
      "project": "project-BQpp3Y804Y0xbyG4GJPQ01xv",
      "id": "file-B6ZY4942J35xX095VZyQBk0v"
     }
    },
    "reads_fastqgzs": {               # Input field name
     "$dnanexus_link": {              # Link to the workflow level input; the input passed to "reads" on the
       "workflowInputField": "reads"   #   workflow level will be consumed by the "reads_fastqgzs" input
     }
    }
   },
   "systemRequirements": {            # (optional) Request different instance types for different entry
                                      #   points of this stage
     "main": {                        # "main" is the name of the entry point called when a stage is run
       "instanceType": "mem1_ssd1_v2_x16"
     }
   },
   "executionPolicy": {               # (optional) Options governing job restart policy
     "restartOn": {
       "*": 3                         # Restart automatically up to 3 times for all errors
     }
   }
  },
  {
   "id": "call_variants",
                                     # Unique ID of the second stage
   "name": "Freebayes",
                                     # (optional) Display name of the second stage
   "executable": "app-freebayes/2.0.1",
                                     # Name or ID of the app/globalworkflow or ID of the applet/workflow run in this stage
   "folder": "call_variants_output", # The output subfolder into which outputs should be cloned for the stage
   "input": {                        # (optional) Input of the second stage which is linked
      "sorted_bams": [{              #   to the output of "sorted_bam" of the first stage.
        "$dnanexus_link": {
          "stage": "align_reads",
          "outputField": "sorted_bam"
        }
      }],
     "genome_fastagz": {
       "$dnanexus_link":{
         "project": "project-BQpp3Y804Y0xbyG4GJPQ01xv",
         "id": "file-B6ZY7VG2J35Vfvpkj8y0KZ01"
       }
     }
   }
  }
 ]
}
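The structural rules illustrated above (name and version required only for global workflows, unique stage IDs, an executable per stage) can be pre-checked locally before invoking dx build. The following is an unofficial sketch; check_dxworkflow is a hypothetical helper, not part of the dx toolkit:

```python
import json

# Fields required when building a global workflow (dx build --globalworkflow);
# for regular, project-based workflows they are optional.
GLOBAL_REQUIRED = ("name", "version")

def check_dxworkflow(metadata, global_workflow=False):
    """Return a list of problems found in a parsed dxworkflow.json dict.

    A hypothetical pre-flight check, not an official dx-toolkit API.
    """
    problems = []
    if global_workflow:
        for field in GLOBAL_REQUIRED:
            if field not in metadata:
                problems.append("missing required field: " + field)
    stages = metadata.get("stages", [])
    ids = [stage.get("id") for stage in stages]
    if len(ids) != len(set(ids)):
        problems.append("stage IDs must be unique")
    for stage in stages:
        if "executable" not in stage:
            problems.append("stage %r has no executable" % stage.get("id"))
    return problems

metadata = json.loads(
    '{"title": "Exome Variant Calling", "stages": [{"id": "align_reads"}]}'
)
for problem in check_dxworkflow(metadata, global_workflow=True):
    print(problem)
```

Such a check only mirrors the rules documented here; the Platform performs its own authoritative validation when the workflow is built.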

Specification

name

string. The name of the workflow. If it is not provided, the auto-generated workflow ID is used. When a global workflow is built (with dx build --globalworkflow), the name is required and stricter formatting rules apply: the name can contain lowercase letters, numbers, "-", ".", and "_", but cannot contain spaces.

Example

{
...
  "name": "exome_variant_calling",
...
}
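The naming rule for global workflows can be expressed as a regular expression. The pattern below is an illustrative reading of the rule stated above (lowercase letters, numbers, "-", ".", "_", no spaces), not the Platform's authoritative validator:

```python
import re

# Illustrative pattern: lowercase letters, digits, "-", ".", and "_" only;
# spaces and uppercase letters are rejected.
NAME_RE = re.compile(r"^[a-z0-9._-]+$")

def is_valid_global_workflow_name(name):
    return bool(NAME_RE.match(name))

print(is_valid_global_workflow_name("exome_variant_calling"))  # True
print(is_valid_global_workflow_name("Exome Variant Calling"))  # False
```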

title

string. The title of the workflow. It is a label displayed to users in the web interface. If it is not provided, the name of the workflow is used.

Example

{
...
  "title": "Exome Variant Calling",
...
}

version

string (optional for regular, project-based workflows; required for global workflows). The version of the global workflow. The version must be unique among all other versions of the global workflow, published or not.

We recommend following the Semantic Versioning conventions for numbering the versions of your global workflow. Semantic Versioning also specifies how to change the version number for various kinds of updates (bug-fix only, backwards compatible, or backwards incompatible). Following the Semantic Versioning guidelines helps users and other developers understand when it is safe to move between different versions of your global workflow.

Example

{
...
  "version": "1.0.0",
...
}
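With Semantic Versioning, a MAJOR.MINOR.PATCH string compares numerically component by component, which plain string comparison gets wrong once a component reaches two digits. A minimal sketch that handles only bare x.y.z versions (no pre-release or build metadata):

```python
def semver_key(version):
    """Turn "1.10.2" into (1, 10, 2) so versions sort numerically.

    Simplified: ignores the pre-release and build-metadata parts of the
    full Semantic Versioning specification.
    """
    return tuple(int(part) for part in version.split("."))

# "1.10.0" is the newest version here, even though it sorts
# earliest as a plain string.
print(max(["1.9.1", "1.10.0", "1.0.0"], key=semver_key))  # → 1.10.0
```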

summary

string. A short description of the workflow.

Example

{
...
  "summary": "A simple exome pipeline",
...
}

outputFolder

string (optional). The default output folder for the workflow.

Example

{
...
  "outputFolder": "/output",
...
}

inputs

array of mappings (optional). JSON array containing the specifications for each input to the workflow.

Example

[
  {
    "name": "reads",
    "class": "file",
    "default": {"$dnanexus_link": "file-xxxx"}
  }
]

outputs

array of mappings (optional). JSON array containing the specifications for each output of the workflow. The specification is the same as the output specification of an app(let), with the addition of the "outputSource" field, which lets the workflow developer link a specific stage output to a workflow-level output.

Example

[
  {
    "name": "variants",
    "class": "file",
    "outputSource": {"$dnanexus_link": {
        "stage": "stage_id",
        "outputField": "executable_output_fieldname"
      }
    }
  }
]
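When dxworkflow.json is generated programmatically, the outputSource link can be assembled with a small helper. stage_output_link below is a hypothetical convenience function, not a dx-toolkit API:

```python
import json

def stage_output_link(stage_id, output_field):
    """Build an "outputSource" value pointing at a stage's output field."""
    return {"$dnanexus_link": {"stage": stage_id, "outputField": output_field}}

# Workflow-level output wired to the "variants_vcfgz" output of the
# "call_variants" stage, matching the annotated example above.
output_spec = [{
    "name": "variants",
    "class": "file",
    "outputSource": stage_output_link("call_variants", "variants_vcfgz"),
}]
print(json.dumps(output_spec, indent=2))
```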

stages

array of mappings (optional). A list of stages to add to the workflow. See the stages input field of the /workflow/new call for a detailed specification.

Example

{
...
 "stages": [
  {
   "id": "align_reads",
   "name": "BWA MEM",
   "executable": "app-bwa_mem_fastq_read_mapper/2.0.4",
   "folder": "map_reads_output",
   "input": {
    "genomeindex_targz": {
     "$dnanexus_link": {
      "project": "project-BQpp3Y804Y0xbyG4GJPQ01xv",
      "id": "file-B6ZY4942J35xX095VZyQBk0v"
     }
    },
    "reads_fastqgzs": {
     "$dnanexus_link": {
      "workflowInputField": "reads"
     }
    }
   },
   "systemRequirements": {

     "main": {
       "instanceType": "mem1_ssd1_v2_x16"
     }
   },
   "executionPolicy": {
     "restartOn": {
       "*": 3
     }
   }
  },
  {
   "id": "call_variants",
   "name": "Freebayes",
   "executable": "app-freebayes/2.0.1",
   "folder": "call_variants_output",
   "input": {
      "sorted_bams": [{
        "$dnanexus_link": {
          "stage": "align_reads",
          "outputField": "sorted_bam"
        }
      }],
     "genome_fastagz": {
       "$dnanexus_link":{
         "project": "project-BQpp3Y804Y0xbyG4GJPQ01xv",
         "id": "file-B6ZY7VG2J35Vfvpkj8y0KZ01"
       }
     }
   }
  }
 ]
...
}
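The stage inputs in the example above use three distinct $dnanexus_link forms: a concrete platform file, the output of an earlier stage, and a forwarded workflow-level input. A sketch that generates all three (the helper names are illustrative, not dx-toolkit functions):

```python
def file_link(project_id, file_id):
    # Concrete platform file, pinned to a project
    return {"$dnanexus_link": {"project": project_id, "id": file_id}}

def stage_output(stage_id, field):
    # Output of an earlier stage in the same workflow
    return {"$dnanexus_link": {"stage": stage_id, "outputField": field}}

def workflow_input(field):
    # Value forwarded from a workflow-level input
    return {"$dnanexus_link": {"workflowInputField": field}}

# Rebuilding the "call_variants" stage input from the example above
stage_input = {
    "sorted_bams": [stage_output("align_reads", "sorted_bam")],
    "genome_fastagz": file_link("project-BQpp3Y804Y0xbyG4GJPQ01xv",
                                "file-B6ZY7VG2J35Vfvpkj8y0KZ01"),
}
```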

Other options for the /workflow/new call, such as the project or folder in which to create the workflow, are populated via command-line flags of dx build.

