Workflows and Analyses
Workflows are objects which list a series of executables (apps or applets) and configuration parameters specifying how to run them. For example, a DNA sequencing workflow may consist of a series of 3 apps: mapping, variant calling, and variant annotation. The outputs of one executable can be configured to be inputs to the next. Each executable listed in a workflow, together with its configuration and I/O parameters, is called a stage. At the moment, workflows are not allowed to be used as executables for a stage.
An analysis is the execution of a workflow, similarly to how a job is the execution of an app. Both jobs and analyses can also be referred to as the runs of their respective executables (workflows, apps, or applets).
To create a new workflow, use the /workflow/new API method. The workflow can be edited using various API methods that support a variety of edit actions. The workflow can then be run using the /workflow-xxxx/run API method. This API method runs all the stages in the workflow and creates an analysis object, which contains metadata about all the jobs (and perhaps other analyses) that were created.
At any point after an analysis has been created, the workflow from which it was run can be recovered by calling /workflow/new with the field initializeFrom set to the ID of the analysis.
A workflow object has an edit version number which can be retrieved using the API call /workflow-xxxx/describe. It must be provided every time an API call is made to edit the workflow and must match the current value in order for the call to succeed. The new edit version number is returned upon a successful edit.
You can specify what stages should be run when creating the workflow using /workflow/new; you can also add additional stages after creating the workflow with /workflow-xxxx/addStage. When adding a stage, you must specify the executable that will be run. You must also specify a unique stage ID. However, when adding stages to an already existing workflow with /workflow-xxxx/addStage, the ID is optional, and if you do not supply it, a unique ID is generated on your behalf (see the section Stage ID and Name for more information).
This ID will be unique for the stage in the workflow; you will need to provide it when making further changes to the stage or cross-linking outputs and inputs of the stages.
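For instance, a minimal sketch of a /workflow-xxxx/addStage call that supplies an explicit stage ID (the executable ID and edit version shown are hypothetical):
{
  "editVersion": 0,
  "id": "variant_calling",
  "executable": "applet-xxxx",
  "name": "Call variants",
  "folder": "variants"
}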
Besides the executable that it runs, each stage can also have the following metadata:
- Name
- Output folder
- Bound inputs
- Execution policy
- System requirements
Most of the above options can also be set when the stage is created and can always be modified afterwards via the /workflow-xxxx/update method.
Stages can be reordered or removed using the /workflow-xxxx/moveStage and /workflow-xxxx/removeStage API methods. As mentioned previously, both the stage ID and the workflow's edit version will need to be provided in order to modify them.
Replacing the executable of a stage in place (keeping all other metadata associated with the stage, such as its name, output folder, and bound inputs) can only be done using the /workflow-xxxx/updateStageExecutable API method. This method tests whether the replacement candidate has input and output specifications that are fully compatible with the previous executable, if that executable is still accessible. If it is not completely compatible, the stage can still be updated by setting the force flag to true, in which case the workflow will also be updated to remove any outdated links between stages and other such outdated metadata.
Stage ID and Name
A stage ID uniquely identifies the stage within a workflow and allows inputs and outputs of different stages to be linked to each other. When adding a stage (either in /workflow/new or /workflow-xxxx/addStage) you must supply a unique ID to identify each stage. As an exception, in /workflow-xxxx/addStage it is not mandatory to supply an ID; if you do not do so, an arbitrary unique ID will be generated on your behalf.
Stage IDs must match the regular expression ^[a-zA-Z_][0-9a-zA-Z_-]{0,255}$ (only letters, numbers, underscores, and dashes; at least one character; does not start with a number or dash; maximum length is 256 characters).
The stage name is a non-unique label used for display purposes. It allows you to provide a descriptive identifier for the stage that will be shown in the UI in the workflow view. If not provided, the executable's name is displayed instead.
Customizing Output Folders
The workflow can have a default output folder, which is set by its outputFolder field (either at workflow creation time or through the /workflow-xxxx/update method). This value can be overridden at runtime using the folder field of /workflow-xxxx/run. If no value for the output folder can be found in the API call nor in the workflow, then the system default of "/" is used.
Stage Output Folders
Each stage can also specify its default output folder, either relative to the workflow's output folder or as an absolute path. This field can be set in the /workflow-xxxx/addStage method and further updated using the /workflow-xxxx/update method.
If the value set for the stage's folder field starts with the character "/", then it is interpreted as an absolute path that will be used for the stage's outputs, regardless of what is provided as folder in the /workflow-xxxx/run method.
If, however, the value set for the field does not start with the character "/", then it is interpreted as a path relative to the folder field provided to the /workflow-xxxx/run method.
The following table shows some examples of where a stage's output will go for different values of the stage's folder field, under the condition that the workflow's output folder is "/foo":

| Stage's folder value | Stage output folder |
| --- | --- |
| null (no value) | "/foo" |
| "bar/baz" | "/foo/bar/baz" |
| "/quux" | "/quux" |
Workflow Input
It is possible to define an explicit input to the workflow by specifying inputs for the /workflow/new method, for example:
{
"inputs": [
{
"name": "reference_genome",
"class": "file"
}
]
}
One consequence of defining a workflow with an explicit input is that, once the workflow is created, all input values must be provided to the workflow inputs and not to individual stages. By linking stage inputs with workflow inputs at workflow build time, all values provided to a workflow-level input (here reference_genome) are passed during execution to the stage-level input(s) that link to it.
Defining inputs for the workflow creates a special type of workflow called a locked workflow. Locked workflows are workflows in which certain input fields cannot be overridden when the workflow is run. This is achieved by the inputs property, which acts as an allowlist of the inputs that are "unlocked". If the workflow creator defines this property, the inputs listed in this array can be set by the user when they run the workflow (they are considered "unlocked"), and all other inputs are automatically "locked". When the inputs property is undefined or null, the workflow is fully unlocked and acts like any other regular workflow, where all inputs can be provided or overridden by the user who runs the workflow. When the inputs property is set to an empty array, there are no unlocked fields, so the workflow is fully locked.
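For example, a locked workflow created with the inputs specification above could be run by providing the workflow-level input directly; a minimal sketch (the project and file IDs are hypothetical):
{
  "project": "project-xxxx",
  "input": {
    "reference_genome": { "$dnanexus_link": "file-xxxx" }
  }
}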
Workflow Output
The outputs of some or all of the stages can be defined to be the output of the workflow. To do this, the field outputs needs to be passed to /workflow/new; it defines references to stages' outputs in outputSource. For example, if we would like the output of the workflow to be the "output_field_of_stage_xxxx" output of the stage stage-xxxx, while the outputs of the other stages are of no interest to us, we can define it in the following way:
{
"outputs": [
{
"name": "pipeline_output",
"class": "array:file",
"outputSource": {
"$dnanexus_link": {
"stage": "stage-xxxx",
"outputField": "output_field_of_stage_xxxx"
}
}
}
]
}
Binding Input
When adding an executable as a stage or modifying it using the /workflow-xxxx/update API method, you can choose to specify values for some or all of the stage inputs. These bound inputs can be overridden when the workflow is actually run. The syntax for providing bound input is the same as when providing an input hash to run the executable directly. For example, you can set the input for a stage with the hash:
{ "input_field": "input_value" }
You can also use stage references as values to link an input to the input or output of another stage. These references are hashes with a single key $dnanexus_link whose value is a hash with the following key/values:
- stage string Another stage's ID whose output or input will be used
- exactly one of the following key/values:
  - outputField string The output field name of the stage's executable to be used
  - inputField string The input field name of the stage's executable to be used
- and, optionally:
  - index integer The index into the array that is the output or input of the linked stage; this is 0-indexed, so a value of 0 indicates the first element should be used
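As a sketch, a stage input could be bound to the second element of an array output of another stage like this (the input and output field names are hypothetical):
{
  "some_input_field": {
    "$dnanexus_link": {
      "stage": "stage-xxxx",
      "outputField": "array_output_field_of_stage_xxxx",
      "index": 1
    }
  }
}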
If the workflow has defined inputs, you can use workflow input references to link stage inputs to the workflow-level inputs. These references are hashes with a single key $dnanexus_link whose value is a hash with exactly one key/value:
- workflowInputField string The input field name of the current workflow
Linking input to other stage output
Using the outputField option is useful for chaining the output of a stage to the input of another stage to make an analysis pipeline. For example, a first stage (stage-xxxx) could map reads to a reference genome and then pass those mappings on to a second stage (stage-yyyy) that will call variants on those mappings. We can do this by setting the following input for the second stage:
{
"mappings_input_field_of_stage_yyyy": {
"$dnanexus_link": {
"stage": "stage-xxxx",
"outputField": "mappings_output_field_of_stage_xxxx"
}
}
}
When the workflow is run, the second stage will receive the mappings input once the first stage has finished.
Linking input to other stage input
Linking input fields together can also be useful. For example, if there are two stages which require the same reference genome, we can link the input of one (stage-xxxx) to the other (stage-yyyy) by setting the input of the first as follows:
{
"reference_genome_field_of_stage_xxxx": {
"$dnanexus_link": {
"stage": "stage-yyyy",
"inputField": "reference_genome_field_of_stage_yyyy"
}
}
}
When running the workflow, the reference genome input only needs to be provided once to the input of stage-yyyy, and the other stage stage-xxxx will inherit the same value.
Linking workflow input to stage input
It is possible to link a stage input to the input of the current workflow. For example, if stage-xxxx requires a reference genome, we can link the input of stage-xxxx to the input of the workflow as follows:
{
"reference_genome_field_of_stage_xxxx": {
"$dnanexus_link": {
"workflowInputField": "reference_genome"
}
}
}
The workflow inputs field should then be defined for the workflow, for example:
{
"inputs": [
{
"name": "reference_genome",
"class": "file"
}
]
}
During runtime, stage inputs will then consume the input values provided at the workflow level, i.e. the value passed to the field reference_genome will be used by reference_genome_field_of_stage_xxxx.
The /workflow-xxxx/update API method can also be used to modify how an input or output of a stage is represented as an input or output of the workflow. For example, a particular input parameter can be hidden so that it does not appear in the inputSpec field when describing the workflow. Or, it can be given a name (unique in the workflow) so that its stage does not have to be specified when providing input to the workflow. Its label or help can also be changed to document how it may interact with other stages in the workflow.
Note that hiding an output for a stage has further consequences: the output is treated as intermediate output and is deleted after the analysis has finished running.
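As an illustrative sketch (the stage ID and field names are hypothetical), such modifications could be applied with /workflow-xxxx/update:
{
  "editVersion": 3,
  "stages": {
    "stage-xxxx": {
      "inputSpecMods": {
        "reference_genome_field_of_stage_xxxx": { "name": "reference_genome" }
      },
      "outputSpecMods": {
        "intermediate_output_field": { "hidden": true }
      }
    }
  }
}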
Each stage can have an executionPolicy field to request the value to be passed on when the stage is run (see the executionPolicy field in the run specification of apps and applets for the accepted options).
These stored execution policies can also change the failure propagation behavior. By default, if a stage fails, the entire analysis will enter the "partially_failed" state, and other stages will be allowed to finish successfully if they are not dependent on the failed stage(s). This behavior can be modified to propagate failure to all other stages by setting the onNonRestartableFailure flag in the executionPolicy field for an individual stage to the value "failAllStages". These stage-specific options can also be overridden at runtime by providing a single value to be used by all stages in the /workflow-xxxx/run call.
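For instance, a minimal sketch of a stage execution policy that retries a couple of restartable failures and propagates any non-restartable failure to the whole analysis (the restart counts shown are illustrative):
{
  "executionPolicy": {
    "restartOn": { "ExecutionError": 2, "UnresponsiveWorker": 2 },
    "onNonRestartableFailure": "failAllStages"
  }
}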
System Requirements
Each stage of the workflow can have a systemRequirements field to request certain instance types by default when the workflow is run. This field uses the same syntax as used in the run specification for applets and apps. This value can be set when the stage is added or modified afterwards with the /workflow-xxxx/update API method.
These stored defaults can be further overridden (in part or in full) at runtime by providing the systemRequirements and/or stageSystemRequirements fields in /workflow-xxxx/run. In particular, the value for a particular entry point in a stage's stored systemRequirements will still hold unless it is overridden either explicitly (via a new value for the same entry point name) or implicitly (via a value for the "*" entry point).
When running a workflow, the system will attempt to reuse previously computed results by looking up analyses that have been created for the workflow. To find out which stages have cached results on hand without running the workflow, you can call the /workflow-xxxx/dryRun method, or call /workflow-xxxx/describe with getRerunInfo set to true. To turn off this automatic behavior, you can request that certain stages be forcibly rerun using rerunStages in the /workflow-xxxx/run method.
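Putting a few of these runtime overrides together, a hedged sketch of part of a /workflow-xxxx/run payload might look like the following (the stage ID and instance type names are hypothetical, and the empty input assumes all inputs are already bound):
{
  "project": "project-xxxx",
  "input": {},
  "systemRequirements": { "*": { "instanceType": "mem1_ssd1_x4" } },
  "stageSystemRequirements": {
    "stage-yyyy": { "main": { "instanceType": "mem2_ssd1_x8" } }
  },
  "rerunStages": ["stage-yyyy"]
}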
When specifying input for /workflow-xxxx/run, the input field names for an analysis are automatically generated to have the form "<stage ID>.<input field name>" if the input is provided to a stage directly, or "<input field name>" if it is an input defined for the workflow.
Thus if the first stage has ID "stage-xxxx" and runs an executable which takes in an input named "reads", then to provide the input for this parameter, you would use the key "stage-xxxx.reads" in the input hash. These names can be changed via the API call /workflow-xxxx/update using the stages.stage-xxxx.inputSpecMods field.
Connecting the input to the input or output of another stage in the workflow is also possible. In such a situation, a workflow stage reference should be used. To reference the input of another stage, say of stage "stage-xxxx" with input "reference_genome", you would provide the value:
{ "$dnanexus_link": {
"stage": "stage-xxxx",
"inputField": "reference_genome"
}
}
When the workflow is run, this will be translated into whatever value is given as input for "reference_genome" for the stage "stage-xxxx" in the workflow.
If the key outputField is used in place of inputField, then the value represents the output of that stage instead. When the workflow is run and an analysis created, the workflow stage reference will be translated into an analysis stage reference:
{ "$dnanexus_link": {
"analysis": "analysis-xxxx",
"stage": "stage-xxxx",
"field": "reference_genome"
}
}
which will be resolved when the stage "stage-xxxx" finishes running in analysis "analysis-xxxx".
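To tie these pieces together, a minimal sketch of a /workflow-xxxx/run call for a two-stage workflow, using the stage-prefixed input names described above (all IDs and field names are hypothetical):
{
  "name": "My sequencing analysis",
  "project": "project-xxxx",
  "folder": "/analyses/run-01",
  "input": {
    "stage-xxxx.reads": { "$dnanexus_link": "file-xxxx" },
    "stage-xxxx.reference_genome": { "$dnanexus_link": "file-yyyy" }
  }
}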
API method: /workflow/new
Specification
Creates a new workflow data object which can be used to execute a series of apps, applets, and/or workflows.
Inputs
- project string ID of the project or container to which the workflow should belong (i.e. the string "project-xxxx")
- name string (optional, default is the new ID) The name of the object
- title string or null (optional, default null) Title of the workflow, e.g. "Micro Map Pipeline"; if null, then the name of the workflow will be used as the title
- summary string (optional, default "") A short description of the workflow
- description string (optional, default "") A longer description about the workflow
- outputFolder string (optional) The default output folder for the workflow; see the Customizing Output Folders section above for more details on how it interacts with stages' output folders
- tags array of strings (optional) Tags to associate with the object
- types array of strings (optional) Types to associate with the object
- hidden boolean (optional, default false) Whether the object should be hidden
- properties mapping (optional) Properties to associate with the object
  - key Property name
  - value string Property value
- details mapping or array (optional, default { }) JSON object or array that is to be associated with the object; see the Object Details section for details on valid input
- folder string (optional, default "/") Full path of the folder that is to contain the new object
- parents boolean (optional, default false) Whether all folders in the path provided in folder should be created if they do not exist
- inputs array of mappings (optional) An input specification of the workflow as described in the Input Specification section
- outputs array of mappings (optional) An output specification of the workflow as described in the Output Specification section with an additional field specifying outputSource; see the Workflow Output section for details
- initializeFrom mapping (optional) Indicate an existing workflow or analysis from which to use the metadata as default values for all fields that are not given:
  - id string ID of the workflow or analysis from which to retrieve workflow metadata
  - project string (required for workflow IDs and ignored otherwise) ID of the project in which the workflow specified in id should be found
- stages array of mappings (optional) Stages to add to the workflow. If not supplied, the workflow that is created will be empty. Each value is a mapping with the key/values:
  - id string ID that uniquely identifies the stage. See the section on Stage ID and Name for more information
  - executable string ID of the app or applet to be run in this stage
  - name string (optional) Name (display label) for the stage
  - folder string (optional, default is null) The output folder into which outputs should be cloned for the stage; see the Customizing Output Folders section above for more details
  - input mapping (optional) The inputs to this stage to be bound. See the section on Binding Input for more information.
    - key Input field name
    - value Input field value
  - executionPolicy mapping (optional) A collection of options that govern automatic job restart upon certain types of failures; this can only be set at the user-level API call (jobs cannot override this for their subjobs). Contents of this field will override any of the corresponding keys in the executionPolicy mapping found in the executable's run specification (if present). Includes the following optional key/values:
    - restartOn mapping (optional) Indicate a job restart policy
      - key A restartable failure reason ("ExecutionError", "UnresponsiveWorker", "JMInternalError", or "AppInternalError") or "*" to indicate all restartable failure reasons that are otherwise not present as keys
      - value int Maximum number of restarts for the failure reason
    - maxRestarts int (optional, default 9) Non-negative integer less than 10, indicating the maximum number of times that the job will be restarted
    - onNonRestartableFailure string (optional, default "failStage") Either the value "failStage" or "failAllStages"; indicates whether the failure of this stage (when run as part of an analysis) should force all other non-terminal stages in the analysis to fail as well if a non-restartable failure occurs, even if those stages do not have any dependencies on this stage. (Stages that have dependencies on this stage will still fail irrespective of this setting.)
  - systemRequirements mapping (optional) Request specific resources for the stage's executable; see the Requesting Instance Types section for more details
- ignoreReuse array of strings (optional) Specifies IDs of workflow stages (or "*" for all stages) that will ignore job reuse. If a specified stage points to a nested sub-workflow, reuse will be ignored recursively by the whole nested sub-workflow. Overrides the ignoreReuse setting in stage executables.
- nonce string (optional) Unique identifier for this request. Ensures that even if multiple requests fail and are retried, only a single workflow is created. For more information, see Nonces.
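As an illustrative sketch (all IDs are hypothetical), a two-stage workflow whose second stage consumes the output of the first could be created with a request like:
{
  "project": "project-xxxx",
  "name": "exome_pipeline",
  "outputFolder": "/results",
  "stages": [
    { "id": "mapping", "executable": "app-xxxx" },
    {
      "id": "variant_calling",
      "executable": "app-yyyy",
      "input": {
        "mappings": {
          "$dnanexus_link": { "stage": "mapping", "outputField": "mappings" }
        }
      }
    }
  ]
}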
Outputs
- id string ID of the created workflow object (i.e. a string in the form "workflow-xxxx")
- editVersion int The initial edit version number of the workflow object
Errors
- InvalidInput
  - a reserved linking string ("$dnanexus_link") appears as a key in a hash in details but is not the only key in the hash
  - a reserved linking string ("$dnanexus_link") appears as the only key in a hash in details but has a value other than a string
  - the id given under initializeFrom is not a valid workflow or analysis ID
  - project is missing if the id given under initializeFrom is a workflow ID
  - for a property key-value pair, the size, encoded in UTF-8, of the property key exceeds 100 bytes or the property value exceeds 700 bytes
  - a nonce was reused in a request but some of the other inputs had changed, signifying a new and different request
  - a nonce exceeds 128 bytes
- InvalidType (project is not a project ID)
- PermissionDenied (CONTRIBUTE access required; VIEW access required for the project specified under initializeFrom if a workflow or analysis was specified)
- ResourceNotFound (the specified project is not found, the path in folder does not exist while parents is false, a project, workflow, or analysis ID specified in initializeFrom is not found, or a stage in ignoreReuse is not found)
API method: /workflow-xxxx/overwrite
Specification
Overwrites the workflow with the workflow-specific metadata (other than the editVersion) from another workflow or an analysis. The workflow's name, tags, properties, types, visibility, and details are left unchanged.
Inputs
- editVersion int The edit version number that was last observed, either via /workflow-xxxx/describe or as output from an API call that changed the workflow; this value must match the current version stored in the workflow object for the API call to succeed
- from mapping Indicate the existing workflow or analysis from which to use the metadata
  - id string ID of the workflow or analysis from which to retrieve workflow metadata
  - project string (required for workflow IDs and ignored otherwise) ID of the project in which the workflow specified in id should be found
Outputs
- id string ID of the manipulated workflow
- editVersion int The new edit version number
Errors
- InvalidInput (input is not a hash, editVersion is not an integer, from is not a hash, from.id is not a string, from.project is not a string if from.id is a workflow ID)
- ResourceNotFound (the specified workflow does not exist, or the workflow or analysis specified in from cannot be found)
- InvalidState (workflow is not in the "open" state, or editVersion provided does not match the current stored value)
- PermissionDenied (CONTRIBUTE access to the workflow's project required, VIEW access to the project containing the workflow or analysis represented in from is required)
API method: /workflow-xxxx/addStage
Specification
Adds a stage to the workflow.
Inputs
- editVersion int The edit version number that was last observed, either via /workflow-xxxx/describe or as output from an API call that changed the workflow; this value must match the current version stored in the workflow object for the API call to succeed
- id string (optional) ID that uniquely identifies the stage. If not provided, a system-generated stage ID will be set. See the section on Stage ID and Name for more information
- executable string App or applet ID
- name string or null (optional, default null) Name (display label) for the stage, or null to indicate no name
- folder string (optional, default is null) The output folder into which outputs should be cloned for the stage; see the Customizing Output Folders section above for more details
- input mapping (optional) A subset of the inputs to this stage to be bound. See the section on Binding Input for more information.
  - key Input field name
  - value Input field value
- executionPolicy mapping (optional) A collection of options that govern automatic job restart upon certain types of failures; this can only be set at the user-level API call (jobs cannot override this for their subjobs). Contents of this field will override any of the corresponding keys in the executionPolicy mapping found in the executable's run specification (if present). Includes the following optional key/values:
  - restartOn mapping (optional) Indicate a job restart policy
    - key A restartable failure reason ("ExecutionError", "UnresponsiveWorker", "JMInternalError", or "AppInternalError") or "*" to indicate all restartable failure reasons that are otherwise not present as keys
    - value int Maximum number of restarts for the failure reason
  - maxRestarts int (optional, default 9) Non-negative integer less than 10, indicating the maximum number of times that the job will be restarted
  - onNonRestartableFailure string (optional, default "failStage") Either the value "failStage" or "failAllStages"; indicates whether the failure of this stage (when run as part of an analysis) should force all other non-terminal stages in the analysis to fail as well if a non-restartable failure occurs, even if those stages do not have any dependencies on this stage. (Stages that have dependencies on this stage will still fail irrespective of this setting.)
- systemRequirements mapping (optional) Request specific resources for the stage's executable; see the Requesting Instance Types section for more details
Outputs
- id string ID of the manipulated workflow
- editVersion int The new edit version number
- stage string ID of the new stage
Errors
- InvalidInput (input is not a hash; editVersion is not an integer; executable is not a string; name, if provided, is not a string; folder, if provided, is not a valid folder path; input, if provided, is not a hash or is not valid input for the specified executable; executionPolicy, if provided, is not a hash; executionPolicy.restartOn, if provided, is not a hash, contains a failure reason key that cannot be restarted, or contains a value which is not an integer between 0 and 9; executionPolicy.onNonRestartableFailure is not one of the allowed values)
- ResourceNotFound (the specified workflow does not exist, the specified executable does not exist, or a provided input value in input could not be found)
- InvalidState (workflow is not in the "open" state, or editVersion provided does not match the current stored value)
- PermissionDenied (CONTRIBUTE access to the workflow's project required, or an accessible copy of the executable could not be found)
API method: /workflow-xxxx/removeStage
Specification
Removes a stage from the workflow.
Inputs
- editVersion int The edit version number that was last observed, either via /workflow-xxxx/describe or as output from an API call that changed the workflow; this value must match the current version stored in the workflow object for the API call to succeed
- stage string ID of the stage to remove
Outputs
- id string ID of the manipulated workflow
- editVersion int The new edit version number
Errors
- InvalidInput (input is not a hash, editVersion is not an integer, stage is not a string)
- ResourceNotFound (the specified workflow does not exist, or the specified stage does not exist in the workflow)
- InvalidState (workflow is not in the "open" state, or editVersion provided does not match the current stored value)
- PermissionDenied (CONTRIBUTE access to the workflow's project required)
API method: /workflow-xxxx/moveStage
Specification
Reorders the stages by moving a specified stage to a new index or position in the workflow. This does not affect how the stages are run but is merely for personal preference and organization.
Inputs
- editVersion int The edit version number that was last observed, either via /workflow-xxxx/describe or as output from an API call that changed the workflow; this value must match the current version stored in the workflow object for the API call to succeed
- stage string ID of the stage to move
- newIndex int The index that the stage will have after the move; all other stages will be moved to accommodate this change; must be in [0, n), where n is the total number of stages
Outputs
- id string ID of the manipulated workflow
- editVersion int The new edit version number
Errors
- InvalidInput (input is not a hash, editVersion is not an integer, stage is not a string, newIndex is not in the range [0, n) where n is the number of stages in the workflow)
- ResourceNotFound (the specified workflow does not exist, or the specified stage does not exist in the workflow)
- InvalidState (workflow is not in the "open" state, or editVersion provided does not match the current stored value)
- PermissionDenied (CONTRIBUTE access to the workflow's project required)
API method: /workflow-xxxx/update
Specification
Updates the workflow with any fields that are provided.
Inputs
- editVersion int The edit version number that was last observed, either via /workflow-xxxx/describe or as output from an API call that changed the workflow; this value must match the current version stored in the workflow object for the API call to succeed
- title string or null (optional) The workflow's title, e.g. "Micro Map Pipeline"; if null, the name of the workflow will be used as the title
- summary string (optional) A short description of the workflow
- description string (optional) A longer description about the workflow
- outputFolder string or null (optional) The default output folder for the workflow, or null to unset; see the Customizing Output Folders section above for more details on how it interacts with stages' output folders
- inputs array of mappings or null (optional) An input specification of the workflow as described in the Input Specification section
- outputs array of mappings or null (optional) An output specification of the workflow as described in the Output Specification section with an additional field specifying outputSource; see the Workflow Output section for details
- stages mapping (optional) Updates for one or more of the workflow's stages
  - key ID of the stage to update
  - value mapping Updates to make to the stage
    - name string or null (optional) New name for the stage; use null to unset the name
    - folder string or null (optional) The output folder into which outputs should be cloned for the stage; see the Customizing Output Folders section above for more details; use null to unset the folder
    - input mapping (optional) A subset of the inputs to this stage to be bound or unbound (using null to unset a previously-bound input). See the section on Binding Input for more information.
      - key Input field name from this stage's executable
      - value Input field value, or null to unset
    - executionPolicy mapping (optional) Set the default execution policy for this stage; use the empty mapping { } to unset
    - systemRequirements mapping (optional) Request specific resources for the stage's executable; see the Requesting Instance Types section for more details; use the empty mapping { } to unset
    - inputSpecMods mapping (optional) Update(s) to how the stage input specification is exported for the workflow; any subset can be provided
      - key Input field name from this stage's executable
      - value mapping Updates for the specified stage input field name
        - name string or null (optional) The canonical name by which a stage's input can be addressed when running the workflow is of the form "<stage ID>.<original field name>". By providing a different string here, you will override the name as shown in the inputSpec of the workflow, and it can be used when giving input to run the workflow. Note that the canonical name can still be used to refer to this input, but both names cannot be used at the same time. If null is provided, then any previously-set name will be dropped, and only the canonical name can be used.
        - label string or null (optional) A replacement label for the input parameter. If null is provided, then any previously-set label will be dropped, and the original executable's label will be used.
        - help string or null (optional) A replacement help string for the input parameter. If null is provided, then any previously-set help string will be dropped, and the original executable's help string will be used.
        - group string or null (optional) A replacement group for the input parameter. The default group for a stage's input is the stage's ID (if it had no group in the executable), or the string "<stage ID>:<group name>" (if it was part of a group in the executable). By providing a different string here, you override the group in which the input parameter appears in the inputSpec of the workflow. If the null value is provided, then any previously-set group value will be dropped, and the canonical group name will be used. If the empty string is provided, the parameter will not be in any group.
        - hidden boolean (optional) Whether to hide the input parameter from the inputSpec of the workflow; note that the input can still be provided and overridden by its name "<stage ID>.<original field name>"
    - outputSpecMods mapping (optional) Update(s) to how the stage output specification is exported for the workflow; any subset can be provided. This field follows the same syntax as for inputSpecMods defined above and behaves roughly the same but modifies outputSpec instead. The exception in behavior occurs for the hidden field. If an output has hidden set to true, its data object value (if applicable) will not be cloned into the parent container when the stage or analysis is done. This may be a useful feature if a stage in your analysis produces many intermediate outputs that are not relevant to the analysis or are not ultimately useful once the analysis has finished.
Outputs
- id string ID of the manipulated workflow
- editVersion int The new edit version number
Errors
- InvalidInput (input is not a hash; editVersion is not an integer; title, if provided, is not a string nor null; summary, if provided, is not a string; description, if provided, is not a string; stages, if provided, is not a hash; a key in stages is not a stage ID string; name, if provided in a stage hash, is not a string; folder, if provided in a stage hash, is not a valid folder path; input, if provided in a stage hash, is not a hash or is not valid input for the specified executable; inputSpecMods or outputSpecMods, if provided in a stage hash, is not a hash or contains a key which does not abide by the syntax specification above)
- ResourceNotFound (the specified workflow does not exist, one of the specified stage IDs could not be found in the workflow, or a provided input value in an input hash in a stage's hash could not be found)
- InvalidState (workflow is not in the "open" state, or editVersion provided does not match the current stored value)
- PermissionDenied (CONTRIBUTE access to the workflow's project required)
API method: /workflow-xxxx/isStageCompatible
Specification
Checks whether the proposed replacement executable for a stage would be a fully compatible replacement.
Inputs
- editVersion int The edit version number that was last observed, either via /workflow-xxxx/describe or as output from an API call that changed the workflow; this value must match the current version stored in the workflow object for the API call to succeed
- stage string ID of the stage to check for compatibility
- executable string ID of the executable that would be used as a replacement
Outputs
- id string ID of the workflow that was checked for compatibility
- compatible boolean The value true if it is compatible and false otherwise

If compatible is false, the following key is also present:
- incompatibilities array of strings A list of reasons for which the two executables are not compatible
Errors
- InvalidInput (input is not a hash, editVersion is not an integer, stage is not a string, executable is not a string, the given executable is missing an input or output specification)
- ResourceNotFound (the specified workflow does not exist, the specified stage does not exist in the workflow, the specified executable does not exist)
- InvalidState (workflow is not in the "open" state, editVersion provided does not match the current stored value)
- PermissionDenied (VIEW access to the workflow's project required, or an accessible copy of the executable could not be found)
API method: /workflow-xxxx/updateStageExecutable
Specification
Updates the executable to be run in one of the workflow's stages.
Inputs
- editVersion int The edit version number that was last observed, either via /workflow-xxxx/describe or as output from an API call that changed the workflow; this value must match the current version stored in the workflow object for the API call to succeed
- stage string ID of the stage to update with the executable
- executable string ID of the executable to use for the stage
- force boolean (optional, default false) Whether to update the executable even if the one specified in executable is incompatible with the one that is currently used for the stage
Outputs
- id string ID of the workflow
- editVersion int The new edit version number
- compatible boolean Whether executable was compatible; if false, then further action (such as setting new inputs) may need to be taken in order to run the workflow as is

If compatible is false, the following is also present:
- incompatibilities list of strings A list of reasons for which the two executables are not compatible
Errors
- InvalidInput (input is not a hash, editVersion is not an integer, stage is not a string, executable is not a string, the given executable is missing an input or output specification, force is not a boolean)
- ResourceNotFound (the specified workflow does not exist, the specified stage does not exist in the workflow, the specified executable does not exist)
- InvalidState (workflow is not in the "open" state, editVersion provided does not match the current stored value, or the requested executable is not compatible with the previous executable and force was not set to true)
- PermissionDenied (CONTRIBUTE access to the workflow's project required, or an accessible copy of the executable could not be found)
API method: /workflow-xxxx/describe
Specification
Describes the specified workflow object.
Alternatively, you can use the /system/describeDataObjects method to describe a large number of data objects at once.
Inputs
- project string (optional) Project or container ID to be used as a hint for finding an accessible copy of the object
- defaultFields boolean (optional, default false if fields is supplied, true otherwise) Whether to include the default set of fields in the output (the default fields are described in the "Outputs" section below). These selections are overridden by any fields explicitly named in fields.
- fields mapping (optional) Include or exclude the specified fields from the output. These selections override the settings in defaultFields.
  - key Desired output field; see the "Outputs" section below for valid values here
  - value boolean Whether to include the field
- includeHidden boolean (optional, default false) Whether hidden input and output parameters should appear in the inputSpec and outputSpec fields
- getRerunInfo boolean (optional, default false) Whether rerun information should be returned for each stage
- rerunStages array of strings (optional) Applicable only if getRerunInfo is set to true; a set of stage IDs that would be forcibly rerun, for which rerun information will be returned accordingly
- rerunProject string (optional, default is the value of project returned) Project ID to use for retrieving rerun information

The following options are deprecated (and will not be respected if fields is present):
- properties boolean (optional, default false) Whether the properties should be returned
- details boolean (optional, default false) Whether the details should also be returned
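As a quick sketch, rerun information for selected fields could be requested like this (the stage ID is hypothetical):
{
  "fields": { "stages": true, "editVersion": true },
  "getRerunInfo": true,
  "rerunStages": ["stage-xxxx"]
}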
Outputs
- id string The object ID (i.e. the string "workflow-xxxx")

The following fields are included by default (but can be disabled using fields or defaultFields):
- project string ID of the project or container in which the object was found
- class string The value "workflow"
- types array of strings Types associated with the object
- created timestamp Time at which this object was created
- state string Either "open" or "closed"
- hidden boolean Whether the object is hidden or not
- links array of strings The object IDs that are pointed to from this object
- name string The name of the object
- folder string The full path to the folder containing the object
- sponsored boolean Whether the object is sponsored by DNAnexus
- tags array of strings Tags associated with the object
- modified timestamp Time at which the user-provided metadata of the object was last modified
- createdBy mapping How the object was created
  - user string ID of the user who created the object or launched an execution which created the object
  - job string (present if a job created the object) ID of the job that created the object
  - executable string (present if a job created the object) ID of the app or applet that the job was running
- title string The workflow's effective title (will always equal name if it has not been set to a string)
- summary string The workflow's summary
- description string The workflow's description
- outputFolder string or null The default output folder for the workflow, or null if unset; see the Customizing Output Folders section above for more details on how it interacts with stages' output folders
- inputSpec array of mappings, or null The value is null if a stage's executable is inaccessible; otherwise, the value is the effective input specification for the workflow. This is generated automatically, taking into account the stages' input specifications and any modifications that have been made to them in the context of the workflow (see the field inputSpecMods under the specification for the /workflow-xxxx/update API method). If not otherwise modified via the API, the group name of an input field will be transformed to include a prefix using its stage ID. Hidden parameters are not included unless requested via includeHidden; they will have a flag hidden set to true. Bound inputs will always show up as default values for the respective input fields.
- outputSpec array of mappings, or null The value is null if a stage's executable is inaccessible; otherwise, the value is the effective output specification for the workflow. This is generated automatically, taking into account the stages' output specifications and any modifications that have been made to them in the context of the workflow (see the field outputSpecMods under the specification for the /workflow-xxxx/update API method). Hidden parameters are not included unless requested via includeHidden, and they will have a flag hidden set to true.
- inputs array of mappings, or null Input specification of the workflow (not the input of particular stages, which is returned in inputSpec)
- outputs array of mappings, or null Output specification of the workflow (not the output of stages, which is returned in outputSpec)
- editVersion int The current edit version of the workflow; this value must be provided with any of the workflow-editing API methods to ensure that simultaneous edits are not occurring
- ignoreReuse array of strings, or null Workflow stage IDs that are configured to ignore job reuse
- stages array of mappings List of metadata for each stage; each value is a mapping with the key/values:
  - id string Stage ID
  - executable string App or applet ID
  - name string or null Name of the stage, or null if not set
  - folder string or null The output folder into which outputs should be cloned for the stage; see the Customizing Output Folders section above for more details; null if not set
  - input mapping Input (possibly partial) to the stage's executable that has been bound
  - accessible boolean Whether the executable is accessible
  - executionPolicy mapping The default execution policy for this stage
  - systemRequirements mapping The requested systemRequirements value for the stage
  - inputSpecMods mapping Modifications for the stage's input parameters when represented in the workflow's input specification
    - key Input parameter name from this stage's executable
    - value mapping Modifications for the input parameter
      - name string (present if set) Replacement name of the input parameter; this is guaranteed to be unique in the stage's input specification
      - label string (present if set) Replacement label for the input parameter
      - help string (present if set) Replacement help string for the input parameter
      - group string The group to which the input parameter belongs (the empty string indicates no group)
      - hidden boolean (present if true) Whether the input field is hidden from the workflow's input specification
  - outputSpecMods mapping Modifications for restricting the stage's output and representing it in the workflow's output specification
    - key Output parameter name from this stage's executable
    - value mapping Modifications for the output parameter, with any number of the same key/values that are also present in inputSpecMods. Note that if an output has hidden set to true, its data object value (if applicable) will not be cloned into the parent container when the stage or analysis is done and will be deleted immediately upon completion or failure of the analysis if delayWorkspaceDestruction is not set to true.
  - If getRerunInfo is true, the following keys are also present:
    - wouldBeRerun boolean Whether the stage would be rerun if the workflow were to be run (taking into account the value given for rerunStages, if applicable)
    - cachedExecution string (present if wouldBeRerun is false) The job ID from which the outputs would be used
    - cachedOutput mapping or null (present if wouldBeRerun is false) The output from the cached execution if available, or null if the execution has not finished yet
- initializedFrom mapping (present if the workflow was created using the initializeFrom option) Basic metadata recording how this workflow was created
  - id string The workflow or analysis ID from which it was created
  - editVersion int (present if id is a workflow ID) The editVersion of the original workflow at the time of creation

The following field (included by default) is available if the object is sponsored by a third party:
- sponsoredUntil timestamp Indicates the expiration time of data sponsorship (this field is only set if the object is currently sponsored, and if set, the specified time is always in the future)

The following fields are only returned if the corresponding field in the fields input is set to true:
- properties mapping Properties associated with the object
  - key Property name
  - value string Property value
- details mapping or array Contents of the object's details
Errors
- ResourceNotFound (the specified object does not exist or the specified project does not exist)
- InvalidInput (the input is not a hash, project (if present) is not a string, the value of properties (if present) is not a boolean, includeHidden (if present) is not a boolean, getRerunInfo (if present) is not a boolean, rerunStages (if present) is not an array of nonempty strings)
- InvalidType (rerunProject (if present) is not a project ID)
- PermissionDenied (VIEW access required for the "project" provided (if any), and VIEW access required for some project containing the specified object (not necessarily the same as the hint provided))
API method: /workflow-xxxx/run
Specification
Runs all the stages in the workflow and creates an analysis object.
All inputs must be provided, either as bound inputs in the workflow or separately in the input field. Intermediate results will be output for the stages and outputs specified.
If any stages have been previously run with the same executable and the same inputs, then the previous results may be used.
Inputs
- name string (optional, default is the workflow name) Name for the resulting analysis
- input mapping Input that the analysis is launched with
  - key Input field name; see the inputSpec and inputs fields in the output of /workflow-xxxx/describe for what the names of the inputs are
  - value Input field value
- project string (required if invoked by a user; optional if invoked from a job with the detach: true option; prohibited when invoked from a job with detach: false) The ID of the project in which this workflow will be run (i.e., the project context). If invoked with the detach: true option, then the detached analysis will run under the provided project (if provided); otherwise the project context is inherited from that of the invoking job. If invoked by a user or run as detached, all output objects are cloned into the project context; otherwise, all output objects will be cloned into the temporary workspace of the invoking job. See The Project Context and Temporary Workspace for more information.
- folder string (optional) The folder into which objects output by the analysis will be placed. If the folder does not exist when the job(s) complete, it (and any parent folders necessary) will be created. See the Customizing Output Folders section above for more details on how it interacts with stages' output folders. If no value is provided here and the workflow does not have outputFolder set, then the default value is "/".
- stageFolders mapping (optional) Override any stored options for the workflow stages' folder fields. See the Customizing Output Folders section for more details
  - key Stage ID, or "*" to indicate that the value should be applied to all stages not otherwise mentioned
  - value null or string Value to replace the stored default
- details array or mapping (optional, default { }) Any conformant JSON (i.e. a JSON object or array, per RFC4627), which will be stored with the created analysis
- delayWorkspaceDestruction boolean (optional) If not given, the value defaults to false for root executions (launched by a user or detached from another job), or to the parent's delayWorkspaceDestruction setting. If set to true, the temporary workspace created for the resulting execution will be preserved for 3 days after the job either succeeds or fails.
- rerunStages array of strings (optional) A list of stage IDs that should be forcibly rerun (which will be rerun in addition to other stages that the system will identify as requiring a rerun as well); if the list includes the string "*", then all stages will be rerun
- executionPolicy mapping (optional) A collection of options that govern automatic job restart upon certain types of failures; this can only be set at the user-level API call (jobs cannot override this for their subjobs). Contents of this field will override any of the corresponding keys in the executionPolicy mapping found in individual stages and their executables' run specification (if present). Includes the following optional key/values:
  - restartOn mapping (optional) Indicate a job restart policy
    - key A restartable failure reason ("ExecutionError", "UnresponsiveWorker", "JMInternalError", or "AppInternalError") or "*" to indicate all restartable failure reasons that are otherwise not present as keys
    - value int Maximum number of restarts for the failure reason
  - maxRestarts int (optional, default 9) Non-negative integer less than 10, indicating the maximum number of times that the job will be restarted
  - onNonRestartableFailure string (optional) If unset, allows the stages to govern their failure propagation behavior. If set, must be either the value "failStage" or "failAllStages", indicating whether the failure of any stage should propagate failure to all other non-terminal stages in the analysis, even if those stages do not have any dependencies on the failed stage. (Stages that have dependencies on the stage that failed will still fail irrespective of this setting.)
- timeoutPolicyByExecutable mapping (optional) User-specified timeout policies for all jobs in the resulting job execution tree, configurable by executable. If unspecified, all jobs in the resulting job execution tree will have the default timeout policies present in the run specifications of their executables. If present, includes at least one of the following key-value pairs:
  - key Executable ID. If an executable is not explicitly specified in timeoutPolicyByExecutable, then any job in the resulting job execution tree that runs that executable will have the default timeout policy present in the run specification of that executable.
  - value mapping or null Timeout policy for the corresponding executable. A value of null overrides the default timeout policy present in the run specification of the corresponding executable and indicates that no job in the resulting job execution tree that runs the corresponding executable will have a timeout policy. If a mapping, includes at least one of the following key-value pairs:
    - key Entry point name, or "*" to indicate all entry points not explicitly specified in this mapping. If an entry point name is not explicitly specified and "*" is not present, then any job in the resulting job execution tree that runs the corresponding executable at that entry point will have the default timeout policy present in the run specification of the corresponding executable.
    - value mapping or null Timeout for a job running the corresponding executable at the corresponding entry point. A value of null indicates that no job in the resulting job execution tree that runs the corresponding executable at the corresponding entry point will have a timeout. Includes at least one of the following key-value pairs:
      - key Unit of time; one of "days", "hours", or "minutes"
      - value number Amount of time for the corresponding time unit; must be nonnegative. The effective timeout is the sum of the units of time represented in this mapping. Note that setting the effective timeout to 0 is the same as specifying null for the corresponding executable at the corresponding entry point.

  Note that timeoutPolicyByExecutable (keyed by executable ID and then entry point name) will propagate down the entire job execution tree, and that explicitly specified upstream policies always take precedence.
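A minimal sketch of a timeout policy keyed by a hypothetical applet ID, limiting its "main" entry point to 2 hours and 30 minutes and removing timeouts for all of its other entry points:
{
  "timeoutPolicyByExecutable": {
    "applet-xxxx": {
      "main": { "hours": 2, "minutes": 30 },
      "*": null
    }
  }
}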
- systemRequirements mapping (optional) Request specific resources for all stages not explicitly specified in stageSystemRequirements; values will be merged with stages' stored values as described in the System Requirements section. See the Requesting Instance Types section for more details
- stageSystemRequirements mapping (optional) Request specific resources by stage; values will be merged with stages' stored values as described in the System Requirements section
  - key Stage ID
  - value mapping Value to override or merge with the stage's systemRequirements value
- allowSSH array of strings (optional, default [ ]) Array of IP addresses or CIDR blocks (up to /16) from which SSH access will be allowed to the user by the worker running this job. The array may also include "*", which is interpreted as the IP address of the client issuing this API call as seen by the API server.
- debug mapping (optional, default { }) Specify debugging options for running the executable; this field is only accepted when this call is made by a user (and not a job)
  - debugOn array of strings (optional, default [ ]) Array of job errors after which the job's worker should be kept running for debugging purposes, offering a chance to SSH into the worker before worker termination (assuming SSH has been enabled). This option applies to all jobs in the execution tree. Jobs in this state for longer than 2 days will be automatically terminated but can be terminated earlier. Allowed entries include "ExecutionError", "AppError", and "AppInternalError".
- editVersion int (optional) If provided, run the workflow only if the current edit version matches the provided value and throw an error if it does not match; if not provided, the current version is run
- properties mapping (optional) Properties to associate with the resulting analysis
  - key Property name
  - value string Property value
- tags array of strings (optional) Tags to associate with the resulting analysis
- singleContext boolean (optional) If true, then the resulting jobs and all of their descendants will only be allowed to use the authentication token given to them at the onset. Use of any other authentication token will result in an error. This option offers extra security to ensure data cannot leak out of your given context. In restricted projects, the user-specified value is ignored and the singleContext: true setting is used instead.
- ignoreReuse array of strings (optional) Specifies IDs of workflow stages (or "*" for all stages) that will ignore job reuse. If a specified stage points to a nested sub-workflow, reuse will be ignored recursively by the whole nested sub-workflow. Overrides the ignoreReuse setting in the workflow and in stage executables.