# Workflow Metadata

The file `dxworkflow.json` is a DNAnexus workflow metadata file. If a `dxworkflow.json` file is detected in the directory provided to [`dx build`](https://documentation.dnanexus.com/user/helpstrings-of-sdk-command-line-utilities#build), the toolkit attempts to build a workflow on the Platform according to the workflow specifications in the JSON file.

The format of the file closely follows the input to the [/workflow/new](https://documentation.dnanexus.com/api/running-analyses/workflows-and-analyses#api-method-workflow-new) API method.

The next section shows a detailed example of the fields used in the file.

## Annotated Example

The following example lists the contents of a sample `dxworkflow.json` that can be provided in a directory for use with the `dx build` command.

Comments shown below are for reference only and are not valid in the JSON format.

{% code title="dxworkflow-annotated.json" overflow="wrap" %}

```jsonc
{
  // (optional for regular, project-based workflows; required for global workflows)
  // Workflow name
  "name": "exome_variant_calling",
  // (optional) Title of a workflow, used in display, search, or listing in the UI or CLI
  "title": "Exome Variant Calling",
  // (optional for regular, project-based workflows; required for global workflows)
  // Version of the global workflow
  "version": "1.0.0",
  // (optional) A short description of the workflow
  "summary": "A simple exome pipeline",
  // (optional) Folder for the workflow's output
  "outputFolder": "/output",
  // (global workflow only) Specify a resource container that is accessible
  // to all apps/applets run by the workflow. Requires all apps/applets to be
  // compiled with the "allProjects": VIEW permission.
  "regionalOptions": {
    "aws:us-east-1": {
      "resources": "project-xxxx"
    }
  },
  // (optional) Workflow level input specification (see API documentation)
  "inputs": [
    {
      // Name of the workflow-level input
      "name": "reads",
      // Class of the workflow-level input
      "class": "array:file",
      // (optional) help for this workflow-level input
      "help": "An array of FASTQ gzipped files"
    }
  ],
  // (optional) Workflow level output specification (see API documentation)
  "outputs": [
    {
      // Name of the workflow-level output
      "name": "variants",
      // Class of the workflow-level output
      "class": "file",
      // Link to the output of the stage which provides the output of the workflow
      "outputSource": {
        "$dnanexus_link": {
          "stage": "call_variants",
          "outputField": "variants_vcfgz"
        }
      }
    }
  ],
  // (optional) A list of stages
  "stages": [
    {
      // Unique ID of the first stage
      "id": "align_reads",
      // (optional) Display name of the first stage
      "name": "BWA MEM",
      // Name or ID of the app or ID of the applet run in this stage
      "executable": "app-bwa_mem_fastq_read_mapper/2.0.4",
      // The output folder to which outputs of this stage should be cloned.
      // Folder paths can be absolute ("/foo/bar") or relative ("foo/bar") to
      // the workflow's outputFolder.
      "folder": "map_reads_output",
      // (optional) Input of the first stage
      "input": {
        // Input field name
        "genomeindex_targz": {
          // Link to a reference genome file
          "$dnanexus_link": {
            "project": "project-BQpp3Y804Y0xbyG4GJPQ01xv",
            "id": "file-B6ZY4942J35xX095VZyQBk0v"
          }
        },
        // Input field name
        "reads_fastqgzs": {
          // Link to the workflow-level input; the value passed to "reads" at
          // the workflow level is consumed by the "reads_fastqgzs" input
          "$dnanexus_link": {
            "workflowInputField": "reads"
          }
        }
      },
      // (optional) Request different instance types for different entry points
      // of this stage
      "systemRequirements": {
        // "main" is the name of the entry point called when a stage is run
        "main": {
          "instanceType": "mem1_ssd1_v2_x16"
        }
      },
      // (optional) Options governing job restart policy
      "executionPolicy": {
        // Restart automatically up to 3 times for all errors
        "restartOn": {
          "*": 3
        }
      }
    },
    {
      // Unique ID of the second stage
      "id": "call_variants",
      // (optional) Display name of the second stage
      "name": "Freebayes",
      // Name or ID of the app/globalworkflow or ID of the applet/workflow
      // run in this stage
      "executable": "app-freebayes/2.0.1",
      // The output folder to which outputs of this stage should be cloned.
      // Folder paths can be absolute ("/foo/bar") or relative ("foo/bar") to
      // the workflow's outputFolder.
      "folder": "call_variants_output",
      // (optional) Input of the second stage; "sorted_bams" is linked to the
      // "sorted_bam" output of the first stage
      "input": {
        "sorted_bams": [
          {
            "$dnanexus_link": {
              "stage": "align_reads",
              "outputField": "sorted_bam"
            }
          }
        ],
        "genome_fastagz": {
          "$dnanexus_link": {
            "project": "project-BQpp3Y804Y0xbyG4GJPQ01xv",
            "id": "file-B6ZY7VG2J35Vfvpkj8y0KZ01"
          }
        }
      }
    }
  ]
}
```

{% endcode %}

Other options for the `/workflow/new` call, such as specifying in which project or folder to create a workflow, are populated via command-line flags of [`dx build`](https://documentation.dnanexus.com/user/helpstrings-of-sdk-command-line-utilities#build).

## Specification

### `name`

**string.** The name of the workflow. If it is not provided, the auto-generated workflow ID is used. When building a global workflow (with `dx build --globalworkflow`), the name is required and stricter formatting rules apply: the name can contain lowercase letters, numbers, `-`, `.`, and `_`, but no spaces.

Example:

```json
{
  "name": "exome_variant_calling"
}
```
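
The naming rule for global workflows can be checked locally before running `dx build`. The following is a minimal sketch, not the Platform's own validation: the regular expression only mirrors the rule as worded above, and the helper name `is_valid_global_workflow_name` is hypothetical.

```python
import re

# Assumption: this pattern only mirrors the rule as described above
# (lowercase letters, digits, "-", ".", "_", no spaces). The check the
# Platform performs may be stricter or differ in detail.
GLOBAL_WORKFLOW_NAME = re.compile(r"[a-z0-9._-]+")


def is_valid_global_workflow_name(name: str) -> bool:
    """Return True if `name` uses only the characters listed above."""
    return GLOBAL_WORKFLOW_NAME.fullmatch(name) is not None


print(is_valid_global_workflow_name("exome_variant_calling"))  # True
print(is_valid_global_workflow_name("Exome Variant Calling"))  # False
```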

### `title`

**string.** The title of the workflow, shown to users as a label in the web interface. If it is not provided, the name of the workflow is used.

Example:

```json
{
  "title": "Exome Variant Calling"
}
```

### `version`

**string** (global workflows only). The version of the workflow. It must differ from every other version of the same global workflow, whether published or not.

We recommend following the [Semantic Versioning](https://semver.org/) conventions for numbering the versions of your global workflow. Semantic Versioning specifies how you should change the version number for specific types of updates to your global workflow. This includes bug-fix only updates, backwards compatible changes, or backwards incompatible modifications. Using the Semantic Versioning guidelines helps users and other developers understand when it is safe to move between different versions of your global workflow.

Example:

```json
{
  "version": "1.0.0"
}
```
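
As an illustration of the Semantic Versioning convention, the sketch below computes the next version number for each kind of update. It is a simplified example: the helper `bump_version` is hypothetical, and it handles only plain `MAJOR.MINOR.PATCH` strings, ignoring the pre-release and build-metadata suffixes that the Semantic Versioning specification also allows.

```python
def bump_version(version: str, change: str) -> str:
    """Return the next MAJOR.MINOR.PATCH version for a given kind of change.

    `change` is "major" (backwards-incompatible modification), "minor"
    (backwards-compatible change), or "patch" (bug-fix-only update).
    """
    major, minor, patch = (int(part) for part in version.split("."))
    if change == "major":
        return f"{major + 1}.0.0"
    if change == "minor":
        return f"{major}.{minor + 1}.0"
    if change == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown change type: {change!r}")


print(bump_version("1.0.0", "patch"))  # 1.0.1
print(bump_version("1.0.1", "minor"))  # 1.1.0
print(bump_version("1.1.0", "major"))  # 2.0.0
```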

### `summary`

**string.** A short description of the workflow.

Example:

```json
{
  "summary": "A simple exome pipeline"
}
```

### `outputFolder`

**string** (optional). The default output folder for the workflow.

Example:

```json
{
  "outputFolder": "/output"
}
```

### `inputs`

**array of mappings** (optional). JSON array containing the specifications for each input to the workflow.

Example:

```json
[
  {
    "name": "reads",
    "class": "file",
    "default": {"$dnanexus_link": "file-xxxx"}
  }
]
```

### `outputs`

**array of mappings** (optional). JSON array containing the specifications for each output of the workflow. The specification is the same as the [output specification](https://documentation.dnanexus.com/api/running-analyses/io-and-run-specifications#output-specification) of an app(let), with the addition of the `outputSource` field, which lets the workflow developer link specific stage outputs to workflow outputs.

Example:

```json
[
  {
    "name": "variants",
    "class": "file",
    "outputSource": {
      "$dnanexus_link": {
        "stage": "stage_id",
        "outputField": "executable_output_fieldname"
      }
    }
  }
]
```

### `stages`

**array of mappings** (optional). A list of stages to add to the workflow. See the `stages` input field of the [`/workflow/new`](https://documentation.dnanexus.com/api/running-analyses/workflows-and-analyses#api-method-workflow-new) call for a detailed specification.

Example:

```json
{
  "stages": [
    {
      "id": "align_reads",
      "name": "BWA MEM",
      "executable": "app-bwa_mem_fastq_read_mapper/2.0.4",
      "folder": "map_reads_output",
      "input": {
        "genomeindex_targz": {
          "$dnanexus_link": {
            "project": "project-BQpp3Y804Y0xbyG4GJPQ01xv",
            "id": "file-B6ZY4942J35xX095VZyQBk0v"
          }
        },
        "reads_fastqgzs": {
          "$dnanexus_link": {
            "workflowInputField": "reads"
          }
        }
      },
      "systemRequirements": {
        "main": {
          "instanceType": "mem1_ssd1_v2_x16"
        }
      },
      "executionPolicy": {
        "restartOn": {
          "*": 3
        }
      }
    },
    {
      "id": "call_variants",
      "name": "Freebayes",
      "executable": "app-freebayes/2.0.1",
      "folder": "call_variants_output",
      "input": {
        "sorted_bams": [
          {
            "$dnanexus_link": {
              "stage": "align_reads",
              "outputField": "sorted_bam"
            }
          }
        ],
        "genome_fastagz": {
          "$dnanexus_link": {
            "project": "project-BQpp3Y804Y0xbyG4GJPQ01xv",
            "id": "file-B6ZY7VG2J35Vfvpkj8y0KZ01"
          }
        }
      }
    }
  ]
}
```
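
Because stages, workflow-level inputs, and workflow-level outputs refer to one another by name, a misspelled stage ID or input name is easy to introduce. The sketch below shows one way to catch such mistakes locally before building. It is an illustrative check rather than part of the DNAnexus toolkit: the helpers `iter_links` and `find_broken_links` are hypothetical, and the check assumes every stage declares an explicit `id`.

```python
def iter_links(value):
    """Yield the body of every {"$dnanexus_link": {...}} mapping nested in value."""
    if isinstance(value, dict):
        link = value.get("$dnanexus_link")
        if isinstance(link, dict):
            yield link
        for child in value.values():
            yield from iter_links(child)
    elif isinstance(value, list):
        for child in value:
            yield from iter_links(child)


def find_broken_links(workflow):
    """Return messages for links that name an undeclared stage or workflow input."""
    stages = workflow.get("stages", [])
    stage_ids = {stage.get("id") for stage in stages}
    input_names = {spec["name"] for spec in workflow.get("inputs", [])}

    # Links can appear in stage inputs and in workflow-level output sources.
    sources = [(f"stage {stage.get('id')!r}", stage.get("input", {})) for stage in stages]
    sources += [
        (f"output {spec['name']!r}", spec.get("outputSource", {}))
        for spec in workflow.get("outputs", [])
    ]

    problems = []
    for where, value in sources:
        for link in iter_links(value):
            if "stage" in link and link["stage"] not in stage_ids:
                problems.append(f"{where}: unknown stage {link['stage']!r}")
            field = link.get("workflowInputField")
            if field is not None and field not in input_names:
                problems.append(f"{where}: unknown workflow input {field!r}")
    return problems


# Trimmed-down workflow (no "executable" fields) with a deliberate typo:
# the output links to "call_variant" instead of "call_variants".
workflow = {
    "inputs": [{"name": "reads", "class": "array:file"}],
    "outputs": [{
        "name": "variants",
        "class": "file",
        "outputSource": {
            "$dnanexus_link": {"stage": "call_variant", "outputField": "variants_vcfgz"}
        },
    }],
    "stages": [
        {"id": "align_reads",
         "input": {"reads_fastqgzs": {"$dnanexus_link": {"workflowInputField": "reads"}}}},
        {"id": "call_variants",
         "input": {"sorted_bams": [
             {"$dnanexus_link": {"stage": "align_reads", "outputField": "sorted_bam"}}]}},
    ],
}

for problem in find_broken_links(workflow):
    print(problem)  # output 'variants': unknown stage 'call_variant'
```

Links to data objects (those with `project` and `id` keys) are ignored by this check, since they do not refer to anything declared in the file itself.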

### `regionalOptions`

**mapping** (optional). Specifies the regions in which the workflow can be run and configures region-specific settings that control the workflow's behavior in each of them. The `regionalOptions` field should be a mapping with one key for each region in which the workflow should be runnable. A [region](https://documentation.dnanexus.com/developer/api/regions) is given by a string such as `aws:us-east-1`. If you don't specify `regionalOptions`, the workflow is enabled in only one region: the region of your project context when the workflow is built.

{% hint style="info" %}
**Regional Configuration Requirements**

* All underlying workflows specified across regions must have identical [input](#inputs) and [output](#outputs) specifications.
* If you specify `resources` for one region, you must specify it for all regions in `regionalOptions`.
* When used with `initializeFrom`, the `regionalOptions` must contain exactly the same regions in which the source workflow is enabled.
{% endhint %}

The values associated with each key are themselves mappings that configure the workflow's behavior in the corresponding region. Each value of `regionalOptions` may contain the following keys:

* `workflow` **string** (required) ID of the underlying workflow of this global workflow in the corresponding region. This must be a regular workflow stored as a data object in any project. The I/O specifications of all specified underlying workflows must be identical across regions.
* `resources` **string or array of strings** (optional) Either a string containing the ID of a project that is made available as a resources container, or an array of data object IDs that are all cloned into the root folder of the resources container. All specified objects must exist in the specified region and be accessible to the workflow when it runs *in that region*. If you specify `resources` for any region in `regionalOptions`, you must specify `resources` for every region listed in `regionalOptions`.
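
The all-or-nothing rule for `resources` can be verified before building. The following sketch assumes `regional_options` is the parsed value of the `regionalOptions` field. The helper `regions_missing_resources`, the second region name, and the `workflow-xxxx`, `workflow-yyyy`, and `project-xxxx` IDs are placeholders used only for illustration.

```python
def regions_missing_resources(regional_options: dict) -> list:
    """Return regions that lack `resources` when at least one region sets it."""
    has_resources = any("resources" in opts for opts in regional_options.values())
    if not has_resources:
        return []  # no region sets `resources`, so the rule is satisfied
    return sorted(
        region for region, opts in regional_options.items() if "resources" not in opts
    )


# Placeholder regions and IDs, for illustration only.
regional_options = {
    "aws:us-east-1": {"workflow": "workflow-xxxx", "resources": "project-xxxx"},
    "azure:westus": {"workflow": "workflow-yyyy"},
}

print(regions_missing_resources(regional_options))  # ['azure:westus']
```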


