Workflow Metadata
Use workflow metadata to allow the dx build command to build a workflow according to your specifications.
The file dxworkflow.json is a DNAnexus workflow metadata file. If a dxworkflow.json file is detected in the directory provided to dx build, the toolkit attempts to build a workflow on the platform according to the specification in that file.
The format of the file closely resembles that of the corresponding calls to /workflow/new.
The next section shows a detailed example of the fields used in the file.
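As a rough illustration of the directory layout described above, the following Python sketch writes a small dxworkflow.json into a directory that could then be passed to dx build. The helper name write_workflow_metadata and the contents of minimal_spec are hypothetical and chosen only for this example; the field names themselves come from the annotated example on this page.

```python
import json
import tempfile
from pathlib import Path


def write_workflow_metadata(directory, spec):
    """Write `spec` as dxworkflow.json inside `directory` and return the file path."""
    target = Path(directory) / "dxworkflow.json"
    # json.dumps emits strict JSON, so no comments can slip into the file.
    target.write_text(json.dumps(spec, indent=2) + "\n", encoding="utf-8")
    return target


# Hypothetical minimal specification (illustration only). The annotated
# example marks each of these fields as optional for project-based workflows.
minimal_spec = {
    "name": "exome_variant_calling",
    "title": "Exome Variant Calling",
    "summary": "A simple exome pipeline",
    "stages": [],
}

with tempfile.TemporaryDirectory() as workdir:
    path = write_workflow_metadata(workdir, minimal_spec)
    # Read the file back to confirm it is valid JSON with the same content.
    round_tripped = json.loads(path.read_text(encoding="utf-8"))
```

Generating the file from a dictionary, rather than editing it by hand, is one way to guarantee that the result parses as JSON before the build is attempted.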

Annotated Example

The following lists the contents of an example dxworkflow.json that should be provided in a directory for use with the dx build command.
Note that the # comments shown below are not valid JSON. They are provided for reference only and must be removed from an actual dxworkflow.json file.
{
  "name": "exome_variant_calling",  # (optional for regular, project-based workflows; required for global workflows) Workflow name
  "title": "Exome Variant Calling",  # (optional) Title of the workflow; a label used when displaying, searching, or listing the workflow in the UI or CLI
  "version": "1.0.0",  # (optional for regular, project-based workflows; required for global workflows) Version of the global workflow
  "summary": "A simple exome pipeline",  # (optional) A short description of the workflow
  "outputFolder": "/output",  # (optional) Folder for the workflow's output
  "inputs": [  # (optional) Workflow-level input specification (see API documentation)
    {
      "name": "reads",  # Name of the workflow-level input
      "class": "array:file",  # Class of the workflow-level input
      "help": "An array of FASTQ gzipped files"  # (optional) Help text for this workflow-level input
    }
  ],
  "outputs": [  # (optional) Workflow-level output specification (see API documentation)
    {
      "name": "variants",  # Name of the workflow-level output
      "class": "file",  # Class of the workflow-level output
      "outputSource": {  # Link to the output of the stage that provides the output of the workflow
        "$dnanexus_link": {
          "stage": "call_variants",
          "outputField": "variants_vcfgz"
        }
      }
    }
  ],
  "stages": [  # (optional) A list of stages
    {
      "id": "align_reads",  # Unique ID of the first stage
      "name": "BWA MEM",  # (optional) Display name of the first stage
      "executable": "app-bwa_mem_fastq_read_mapper/2.0.4",  # Name or ID of the app or ID of the applet run in this stage
      "folder": "map_reads_output",  # The output subfolder into which the outputs of this stage should be cloned
      "input": {  # (optional) Input of the first stage
        "genomeindex_targz": {  # Input field name
          "$dnanexus_link": {  # Link to a reference genome file
            "project": "project-BQpp3Y804Y0xbyG4GJPQ01xv",
            "id": "file-B6ZY4942J35xX095VZyQBk0v"
          }
        },
        "reads_fastqgzs": {  # Input field name
          "$dnanexus_link": {  # Link to the workflow-level input; the input passed to "reads" on the workflow level will be consumed by the "reads_fastqgzs" input
            "workflowInputField": "reads"
          }
        }
      },
      "systemRequirements": {  # (optional) Request different instance types for different entry points of this stage
        "main": {  # "main" is the name of the entry point called when a stage is run
          "instanceType": "mem1_ssd1_v2_x16"
        }
      },
      "executionPolicy": {  # (optional) Options governing job restart policy
        "restartOn": {
          "*": 3  # Restart automatically up to 3 times for all errors
        }
      }
    },
    {
      "id": "call_variants",  # Unique ID of the second stage
      "name": "Freebayes",  # (optional) Display name of the second stage
      "executable": "app-freebayes/2.0.1",  # Name or ID of the app/globalworkflow or ID of the applet/workflow run in this stage
      "folder": "call_variants_output",  # The output subfolder into which outputs should be cloned for the stage
      "input": {  # (optional) Input of the second stage
        "sorted_bams": [{  # Linked to the "sorted_bam" output of the first stage
          "$dnanexus_link": {
            "stage": "align_reads",
            "outputField": "sorted_bam"
          }
        }],
        "genome_fastagz": {
          "$dnanexus_link": {
            "project": "project-BQpp3Y804Y0xbyG4GJPQ01xv",
            "id": "file-B6ZY7VG2J35Vfvpkj8y0KZ01"
          }
        }
      }
    }
  ]
}
Other options for the /workflow/new call, such as specifying in which project or folder to create a workflow, are populated via command-line flags of dx build.
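Because the annotated example carries # comments that JSON does not allow, it cannot be parsed as written. The sketch below shows one possible way to strip such comments before parsing. The function strip_hash_comments is a hypothetical helper, not part of the DNAnexus toolkit, and it assumes that every comment starts with # outside a string literal and runs to the end of the line.

```python
import json


def strip_hash_comments(text):
    """Remove '#' comments that appear outside JSON string literals."""
    cleaned_lines = []
    for line in text.splitlines():
        in_string = False   # True while scanning inside a "..." literal
        escaped = False     # True right after a backslash inside a string
        cut = len(line)
        for index, char in enumerate(line):
            if escaped:
                escaped = False
            elif char == "\\" and in_string:
                escaped = True
            elif char == '"':
                in_string = not in_string
            elif char == "#" and not in_string:
                cut = index  # Everything from here on is a comment
                break
        cleaned_lines.append(line[:cut].rstrip())
    return "\n".join(cleaned_lines)


# Shortened, hypothetical excerpt in the style of the annotated example.
annotated = '''
{
  "name": "exome_variant_calling",  # Workflow name
  "summary": "Pipeline #1",         # A '#' inside a string is kept
  "stages": []                      # (optional) A list of stages
}
'''

spec = json.loads(strip_hash_comments(annotated))
```

The scan tracks string boundaries so that a # inside a value, such as "Pipeline #1", is left untouched.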

Specification

name

string. The name of the workflow. If it is not provided, the auto-generated workflow ID is used. When a global workflow is built (with dx build --globalworkflow), the name is required and stricter formatting rules apply: the name may contain lowercase letters, numbers, "-", ".", and "_", but no spaces.
Example
{
  ...
  "name": "exome_variant_calling",
  ...
}
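The global workflow naming rule can be checked locally before a build. The following is a minimal sketch that encodes only the characters listed in the description of name (lowercase letters, numbers, "-", ".", and "_"). The function name is hypothetical, and the platform may apply additional constraints that are not covered here.

```python
import re

# Assumption: this pattern encodes only the rule quoted in the text
# (lowercase letters, digits, "-", ".", "_", no spaces).
GLOBAL_WORKFLOW_NAME = re.compile(r"[a-z0-9._-]+")


def is_valid_global_workflow_name(name):
    """Return True if `name` uses only the characters allowed for global workflows."""
    return isinstance(name, str) and GLOBAL_WORKFLOW_NAME.fullmatch(name) is not None
```

For example, is_valid_global_workflow_name("exome_variant_calling") returns True, while a title-style string such as "Exome Variant Calling" is rejected because of its uppercase letters and spaces.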

title

string. The title of the workflow. It is a label displayed to the users in the Web Interface. If it is not provided, the name of the workflow will be used.
Example
{
  ...
  "title": "Exome Variant Calling",
  ...
}

version

string (global workflows only). The version of the workflow. It must differ from every other version of the same global workflow, whether published or not.
We recommend following the Semantic Versioning conventions for numbering the versions of your global workflow. Semantic Versioning also specifies how you should change the version number for various kinds of updates to your global workflow (that is, bug-fix only, backwards compatible, or backwards incompatible). Using the Semantic Versioning guidelines will help users and other developers to understand when it is safe to move between different versions of your global workflow.
Example
{
  ...
  "version": "1.0.0",
  ...
}
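The three kinds of update mentioned above map onto the three components of a MAJOR.MINOR.PATCH version. The sketch below illustrates that mapping. The function bump_version and its change labels ("bugfix", "compatible", "incompatible") are hypothetical, and the sketch assumes a plain three-part numeric version without pre-release or build metadata.

```python
def bump_version(version, change):
    """Return the next MAJOR.MINOR.PATCH version for the given kind of change."""
    major, minor, patch = (int(part) for part in version.split("."))
    if change == "incompatible":   # backwards-incompatible update
        return f"{major + 1}.0.0"
    if change == "compatible":     # backwards-compatible update
        return f"{major}.{minor + 1}.0"
    if change == "bugfix":         # bug-fix only
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown change kind: {change!r}")
```

With this scheme, a bug fix to version 1.0.0 yields 1.0.1, and a backwards-incompatible change to 1.2.3 yields 2.0.0.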

summary

string. A short description of the workflow.
Example
{
  ...
  "summary": "A simple exome pipeline",
  ...
}

outputFolder

string (optional). The default output folder for the workflow.
Example
{
  ...
  "outputFolder": "/output",
  ...
}

inputs

array of mappings (optional). JSON array containing the specifications for each input to the workflow.
Example
[
  {
    "name": "reads",
    "class": "file",
    "default": {"$dnanexus_link": "file-xxxx"}
  }
]

outputs

array of mappings (optional). JSON array containing the specifications for each output of the workflow. The specification is the same as the output specification of an app(let), with the addition of the "outputSource" field, which lets the workflow developer link specific stage outputs to workflow outputs.
Example
[
  {
    "name": "variants",
    "class": "file",
    "outputSource": {
      "$dnanexus_link": {
        "stage": "stage_id",
        "outputField": "executable_output_fieldname"
      }
    }
  }
]
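Since each "outputSource" refers to a stage by its ID, a mistyped ID is an easy error to make. The following sketch shows one way to catch it before building. The function find_bad_output_sources is a hypothetical local check, not a toolkit feature, and it assumes every stage in the file declares an explicit "id".

```python
def find_bad_output_sources(spec):
    """Return messages for workflow outputs whose outputSource names an unknown stage."""
    stage_ids = {stage["id"] for stage in spec.get("stages", []) if "id" in stage}
    problems = []
    for output in spec.get("outputs", []):
        link = output.get("outputSource", {}).get("$dnanexus_link", {})
        if not isinstance(link, dict):
            link = {}  # Treat a non-mapping link as missing stage information
        stage = link.get("stage")
        if stage not in stage_ids:
            problems.append(f"output {output.get('name')!r}: unknown stage {stage!r}")
        elif "outputField" not in link:
            problems.append(f"output {output.get('name')!r}: missing outputField")
    return problems


# Shortened version of the annotated example (stage details omitted).
example = {
    "outputs": [
        {
            "name": "variants",
            "class": "file",
            "outputSource": {
                "$dnanexus_link": {"stage": "call_variants", "outputField": "variants_vcfgz"}
            },
        }
    ],
    "stages": [{"id": "align_reads"}, {"id": "call_variants"}],
}

problems = find_bad_output_sources(example)
```

An empty result means every workflow output points at a declared stage. The check does not verify that the named output field actually exists on the stage's executable.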

stages

array of mappings (optional). A list of stages to add to the workflow. See the stages input field of the /workflow/new call for a detailed specification.
Example
{
  ...
  "stages": [
    {
      "id": "align_reads",
      "name": "BWA MEM",
      "executable": "app-bwa_mem_fastq_read_mapper/2.0.4",
      "folder": "map_reads_output",
      "input": {
        "genomeindex_targz": {
          "$dnanexus_link": {
            "project": "project-BQpp3Y804Y0xbyG4GJPQ01xv",
            "id": "file-B6ZY4942J35xX095VZyQBk0v"
          }
        },
        "reads_fastqgzs": {
          "$dnanexus_link": {
            "workflowInputField": "reads"
          }
        }
      },
      "systemRequirements": {
        "main": {
          "instanceType": "mem1_ssd1_v2_x16"
        }
      },
      "executionPolicy": {
        "restartOn": {
          "*": 3
        }
      }
    },
    {
      "id": "call_variants",
      "name": "Freebayes",
      "executable": "app-freebayes/2.0.1",
      "folder": "call_variants_output",
      "input": {
        "sorted_bams": [{
          "$dnanexus_link": {
            "stage": "align_reads",
            "outputField": "sorted_bam"
          }
        }],
        "genome_fastagz": {
          "$dnanexus_link": {
            "project": "project-BQpp3Y804Y0xbyG4GJPQ01xv",
            "id": "file-B6ZY7VG2J35Vfvpkj8y0KZ01"
          }
        }
      }
    }
  ],
  ...
}
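Stage inputs can link either to another stage's output ("stage" and "outputField") or to a workflow-level input ("workflowInputField"). The sketch below walks the "input" mappings and reports links that do not resolve within the same file, along with duplicate stage IDs. Both functions are hypothetical local checks and assume that stages declare explicit "id" values; they do not contact the platform or inspect the executables.

```python
def iter_links(value):
    """Yield every {"$dnanexus_link": ...} payload nested anywhere in `value`."""
    if isinstance(value, dict):
        if "$dnanexus_link" in value:
            yield value["$dnanexus_link"]
        else:
            for child in value.values():
                yield from iter_links(child)
    elif isinstance(value, list):
        for child in value:
            yield from iter_links(child)


def check_stage_links(spec):
    """Return messages for duplicate stage IDs and unresolved stage or workflow-input links."""
    stages = spec.get("stages", [])
    input_names = {item["name"] for item in spec.get("inputs", []) if "name" in item}
    stage_ids = [stage["id"] for stage in stages if "id" in stage]
    duplicates = sorted({sid for sid in stage_ids if stage_ids.count(sid) > 1})
    problems = [f"duplicate stage id {sid!r}" for sid in duplicates]
    for stage in stages:
        label = stage.get("id")
        for link in iter_links(stage.get("input", {})):
            if not isinstance(link, dict):
                continue  # A plain object ID string such as "file-xxxx"
            if "stage" in link and link["stage"] not in stage_ids:
                problems.append(f"stage {label!r}: unknown stage {link['stage']!r}")
            field = link.get("workflowInputField")
            if field is not None and field not in input_names:
                problems.append(f"stage {label!r}: unknown workflow input {field!r}")
    return problems


# Shortened version of the example on this page (file links omitted).
example = {
    "inputs": [{"name": "reads", "class": "array:file"}],
    "stages": [
        {
            "id": "align_reads",
            "input": {"reads_fastqgzs": {"$dnanexus_link": {"workflowInputField": "reads"}}},
        },
        {
            "id": "call_variants",
            "input": {
                "sorted_bams": [
                    {"$dnanexus_link": {"stage": "align_reads", "outputField": "sorted_bam"}}
                ]
            },
        },
    ],
}

problems = check_stage_links(example)
```

Links that carry "project" and "id" keys, such as the reference genome files in the example, are ignored by this check because they point at platform objects rather than at other parts of the file.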