Workflows and Analyses
A workflow is a container that organizes multiple executable components (apps or applets) along with their configuration settings. Think of a workflow as a pipeline that connects multiple tools together. For example, a DNA sequencing workflow might include three apps in sequence: mapping, variant calling, and variant annotation. You can configure outputs from one component to feed directly into the next. Each component in a workflow, along with its settings and input/output parameters, is called a stage. You cannot use workflows as stages within other workflows.
An analysis is what happens when you run a workflow—similar to how a job runs when you execute an app. Both jobs and analyses can be referred to as runs of their respective executables.
To create a new workflow, use the /workflow/new API method. You can modify the workflow with specific API methods that support different types of edits. When ready, run the workflow with the /workflow-xxxx/run API method. This creates an analysis object that tracks all the jobs (and potentially other analyses) created during execution.
After creating an analysis, you can recreate the original workflow by calling /workflow/new with the initializeFrom parameter set to the analysis ID.
For information on building or managing Nextflow workflows, see Running Nextflow Pipelines.
Editing a Workflow
A workflow object has an edit version number which can be retrieved using the API call /workflow-xxxx/describe. It must be provided every time an API call is made to edit a workflow and must match the current value to succeed. The new edit version number is returned on a successful edit.
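For example (IDs and values hypothetical), if /workflow-xxxx/describe reports an editVersion of 5, a subsequent editing call such as /workflow-xxxx/update must carry that same value:

{
  "editVersion": 5,
  "title": "Micro Map Pipeline v2"
}

If the stored version has changed in the meantime, the call fails with InvalidState; on success, the response returns the new edit version to use for the next edit.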
Managing Stages
You can specify what stages should be run when creating the workflow using /workflow/new. You can also add stages after creating the workflow with /workflow-xxxx/addStage. In both cases you must specify the executable each stage runs. In /workflow/new you must also supply a unique stage ID; in /workflow-xxxx/addStage the ID is optional, and if you do not supply one, a unique ID is generated on your behalf (see the section Stage ID and Name for more information).
This ID is unique for the stage in the workflow. You need to provide it when making further changes to the stage or cross-linking outputs and inputs of the stages.
Besides the executable that it runs, each stage can also have the following metadata:
Name
Output folder
Bound inputs
Execution policy
System requirements
IO spec modifications that are exported for the workflow
Most of the above options can also be set when the stage is created and can always be modified afterwards via the /workflow-xxxx/update method.
Stages can be reordered or removed using the /workflow-xxxx/moveStage and /workflow-xxxx/removeStage API methods. As mentioned previously, both the stage ID and the workflow's edit version need to be provided to modify them.
Replacing the executable of a stage in place (keeping all other metadata associated with the stage, such as its name, output folder, bound inputs, and configuration settings) can only be done using the /workflow-xxxx/updateStageExecutable API method. If the previous executable is still accessible, this method tests whether the replacement candidate's input and output specifications are fully compatible with it. If the replacement is not fully compatible, the update can still be forced by setting the force flag to true, in which case the workflow is also updated to remove any outdated links between stages and other such outdated metadata.
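As an illustrative sketch (all IDs hypothetical), forcing a replacement that is known to be incompatible might look like:

{
  "editVersion": 3,
  "stage": "stage-xxxx",
  "executable": "applet-yyyy",
  "force": true
}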
Stage ID and Name
A stage ID uniquely identifies the stage within a workflow and allows inputs and outputs of different stages to be linked to each other. When adding a stage (either in /workflow/new or /workflow-xxxx/addStage) you must supply a unique ID to identify each stage. As an exception, in /workflow-xxxx/addStage it is not mandatory to supply an ID. If you do not do so, an arbitrary unique ID is generated on your behalf.
Stage IDs must match the regular expression ^[a-zA-Z_][0-9a-zA-Z_-]{0,255}$ (letters, numbers, underscores, and dashes only; at least one character; must not start with a number or dash; maximum length 256 characters).
The stage name is a non-unique label used for display purposes. It allows you to provide a descriptive identifier for the stage that is shown in the UI in the workflow view. If not provided, the executable's name is displayed instead.
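For example, a hypothetical /workflow-xxxx/addStage request that supplies both an explicit stage ID and a display name:

{
  "editVersion": 0,
  "id": "map_reads",
  "name": "Map Reads",
  "executable": "applet-xxxx"
}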
Customizing Output Folders
The workflow can have a default output folder, set by its outputFolder field (either at workflow creation time or through the /workflow-xxxx/update method). This value can be overridden at runtime using the folder field. If no value for the output folder is found in either the API call or the workflow, the system default of "/" is used.
Stage Output Folders
Each stage can also specify its default output folder. This can be defined relative to the workflow's output folder, or as an absolute path. This field can be set in the /workflow-xxxx/addStage method and further updated using the /workflow-xxxx/update method.
If the value set for the stage's folder field starts with the character "/", then this is interpreted as an absolute path that is used for the stage's outputs, regardless of what is provided as folder in the /workflow-xxxx/run method.
If, however, the value set for the field does not start with the character "/", then it is interpreted as a path relative to the field folder provided to /workflow-xxxx/run method.
The following table shows some examples for where a stage's output goes for different values of the stage's folder field, under the condition that the workflow's output folder is "/foo":
| Stage's folder Value | Stage Output Folder |
| --- | --- |
| null (no value) | "/foo" |
| "bar/baz" | "/foo/bar/baz" |
| "/quux" | "/quux" |
Workflow Input and Output (Locked Workflows)
Workflow Input
It is possible to define an explicit input to the workflow by specifying inputs for the /workflow/new method, for example:
{
"inputs": [
{
"name": "reference_genome",
"class": "file"
}
]
}

One consequence of defining a workflow with an explicit input is that once the workflow is created, all the input values need to be provided by the user to workflow inputs and not to stages. By linking stage inputs with workflow inputs during workflow build time, all the values provided to a workflow-level input (here reference_genome) are passed during execution to the stage-level inputs that link to it.
Defining inputs for the workflow creates a special type of workflow called a locked workflow. Locked workflows are workflows in which certain input fields cannot be overridden when the workflow is initialized to run. This is achieved by the inputs property, which acts as an allowlist of the inputs that are "unlocked". If the workflow creator defines this property, the inputs listed in this array can be set by the user when they run the workflow (they are considered "unlocked"), and all other inputs are automatically "locked". When the inputs property is undefined or null, the workflow is fully unlocked and acts like any other regular workflow in which all inputs can be provided or overridden by the user who runs the workflow. When the inputs property is set to an empty array, there are no unlocked fields, so the workflow is fully locked.
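For the locked workflow above, a hypothetical run request supplies the value at the workflow level rather than addressing any stage:

{
  "project": "project-xxxx",
  "input": {
    "reference_genome": {
      "$dnanexus_link": "file-xxxx"
    }
  }
}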
Workflow Output
The outputs of the stages can be defined to be the output of the workflow. To do that, the field outputs needs to be passed to /workflow/new; each entry references a stage's output via outputSource. For example, if we'd like the output of the workflow to be the "output_field_of_stage_xxxx" output of the stage stage-xxxx, while the outputs of other stages are of no interest to us, we can define it in the following way:
{
"outputs": [
{
"name": "pipeline_output",
"class": "array:file",
"outputSource": {
"$dnanexus_link": {
"stage": "stage-xxxx",
"outputField": "output_field_of_stage_xxxx"
}
}
}
]
}

Binding Input
When adding an executable as a stage or modifying it using the /workflow-xxxx/update API method, you can choose to specify values for the stage inputs. These bound inputs can be overridden when the workflow is actually run. The syntax for providing bound input is the same as when providing an input hash to run the executable directly. For example, you can set the input for a stage with the hash:
{ "input_field": "input_value" }You can also use stage references as values to link an input to the input or output of another stage. These references are hashes with a single key $dnanexus_link whose value is a hash with exactly two keys/values:
stage string Another stage's ID whose input or output is to be used
exactly one of the following key/values:
outputField string The output field name of the stage's executable to be used
inputField string The input field name of the stage's executable to be used
and, optionally:
index integer The index into the array that is the output or input of the linked stage. This is 0-indexed, so a value of 0 indicates the first element should be used.
If the workflow has defined inputs, you can use workflow input references to link stage inputs to the workflow-level inputs. These references are hashes with a single key $dnanexus_link whose value is a hash with exactly one key/value:
workflowInputField string The input field name of the current workflow
Linking input to other stage output
Using the outputField option is useful for chaining the output of one stage to the input of another to make an analysis pipeline. For example, a first stage (stage-xxxx) could map reads to a reference genome and then pass those mappings on to a second stage (stage-yyyy) that calls variants on those mappings. We can do this by setting the following input for the second stage:
{
"mappings_input_field_of_stage_yyyy": {
"$dnanexus_link": {
"stage": "stage-xxxx",
"outputField": "mappings_output_field_of_stage_xxxx"
}
}
}

When the workflow is run, the second stage receives the mappings input once the first stage has finished.
Linking input to other stage input
Linking input fields together can also be useful. For example, if there are two stages which require the same reference genome, we can link the input of one (stage-xxxx) to the other (stage-yyyy) by setting the input of the first as follows:
{
"reference_genome_field_of_stage_xxxx": {
"$dnanexus_link": {
"stage": "stage-yyyy",
"inputField": "reference_genome_field_of_stage_yyyy"
}
}
}

When running the workflow, the reference genome input only needs to be provided once to the input of stage-yyyy, and the other stage stage-xxxx inherits the same value.
Linking workflow input to stage input
It is possible to link a stage input to the input of the current workflow. For example, if stage-xxxx requires a reference genome, we can link the input of stage-xxxx to the input of the workflow as follows:
{
"reference_genome_field_of_stage_xxxx": {
"$dnanexus_link": {
"workflowInputField": "reference_genome"
}
}
}

The workflow inputs field should then be defined for the workflow, for example:
{
"inputs": [
{
"name": "reference_genome",
"class": "file"
}
]
}

During runtime, the stage inputs consume the input values provided on the workflow level. That is, the value passed to the field reference_genome is used by reference_genome_field_of_stage_xxxx.
See the section on Workflow input and output for more information.
Customizing IO Specifications
The /workflow-xxxx/update API method can also be used to modify how an input or output to a stage can be represented as an input or output of the workflow. For example, a particular input parameter can be hidden so that it does not appear in the inputSpec field when describing the workflow. Or, it can be given a name (unique in the workflow) so that its stage does not have to be specified when providing input to the workflow. Its label or help can also be changed to document how it may interact with other stages in the workflow.
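A sketch of such a call (stage ID and field names hypothetical), renaming one stage input and hiding another:

{
  "editVersion": 2,
  "stages": {
    "stage-xxxx": {
      "inputSpecMods": {
        "genome_field": {
          "name": "reference_genome",
          "label": "Reference genome (FASTA)"
        },
        "advanced_options": {
          "hidden": true
        }
      }
    }
  }
}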
Execution Policy
Each stage can have an executionPolicy field to request the value to be passed on when the stage is run (see the executionPolicy field in the run specification of apps and applets for the accepted options).
These stored execution policies can also change the failure propagation behavior. By default, if a stage fails, the entire analysis enters the partially_failed state, and other stages are allowed to finish successfully if they are not dependent on the failed stage. This behavior can be modified to propagate failure to all other stages by setting the onNonRestartableFailure flag in the executionPolicy field for an individual stage to have value "failAllStages". These stage-specific options can also be overridden at runtime by providing a single value to be used by all stages in the /workflow-xxxx/run call.
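For example, a hypothetical stage-level executionPolicy that retries ExecutionError failures up to twice and propagates any non-restartable failure to the whole analysis:

{
  "executionPolicy": {
    "restartOn": { "ExecutionError": 2 },
    "onNonRestartableFailure": "failAllStages"
  }
}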
System Requirements
Each stage of the workflow can have a systemRequirements field to request certain instance types by default when the workflow is run. This field uses the same syntax as used in the run specification for applets and apps. This value can be set when the stage is added or modified afterwards with the /workflow-xxxx/update API method.
These stored defaults can be further overridden (in part or in full) at runtime by providing some combination of systemRequirementsByExecutable, systemRequirements and stageSystemRequirements fields in /workflow-xxxx/run. A stage's stored value for systemRequirements remains active for a specific entry point unless explicitly overridden with a new value for that entry point, or implicitly overridden via a value for the "*" entry point. Refer to the information in Requesting Instance Types for more details.
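As an illustrative runtime override (stage ID and instance type hypothetical), requesting a specific instance type for one stage's "main" entry point while leaving all other stages at their stored defaults:

{
  "stageSystemRequirements": {
    "stage-xxxx": {
      "main": { "instanceType": "mem2_ssd1_v2_x8" }
    }
  }
}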
Reuse of Previous Results
When running a workflow, if the Smart Reuse feature has been enabled, the system attempts to reuse previously computed results by looking up analyses that have been created for the workflow. To find out which stages have cached results on hand without running the workflow, call the /workflow-xxxx/dryRun method or the /workflow-xxxx/describe method with getRerunInfo set to true. To turn off this automatic behavior, you can request that certain stages be forcibly rerun using rerunStages in the /workflow-xxxx/run method.
See this documentation for more on this feature.
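For instance, a hypothetical describe call asking for rerun information as if stage-xxxx were forcibly rerun:

{
  "getRerunInfo": true,
  "rerunStages": ["stage-xxxx"]
}

Each stage in the response then carries a wouldBeRerun flag and, where results are cached, cachedExecution and cachedOutput fields.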
Analysis Input
When specifying input for /workflow-xxxx/run, the input field names for an analysis are automatically generated to have the form "<stage ID>.<input field name>" if the input is provided to a stage directly, or "<input field name>" if it is the input defined for the workflow.
For example, if the first stage has ID "stage-xxxx" and would run an executable which takes in an input named "reads", then to provide the input for this parameter, you would use the key "stage-xxxx.reads" in the input hash. These names can be renamed via the API call /workflow-xxxx/update using the stages.stage-xxxx.inputSpecMods field.
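A hypothetical input hash combining a stage-addressed field with a workflow-level field might look like:

{
  "stage-xxxx.reads": { "$dnanexus_link": "file-xxxx" },
  "reference_genome": { "$dnanexus_link": "file-yyyy" }
}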
Connecting the input to the input or output of another stage in the workflow is also possible. In such a situation, a workflow stage reference should be used. To reference the input of another stage, say of stage "stage-xxxx" with input "reference_genome", you would provide the value:
{ "$dnanexus_link": {
"stage": "stage-xxxx",
"inputField": "reference_genome"
}
}

When the workflow is run, this is translated into whatever value is given as input for "reference_genome" for the stage "stage-xxxx" in the workflow.
If the key outputField is used in place of inputField, then the value represents the output of that stage instead. When the workflow is run and an analysis created, the workflow stage reference is translated into an analysis stage reference:
{ "$dnanexus_link": {
"analysis": "analysis-xxxx",
"stage": "stage-xxxx",
"field": "reference_genome"
}
}

This reference is resolved when the stage "stage-xxxx" finishes running in analysis "analysis-xxxx".
Workflow API Method Specifications
API method: /workflow/new
Creates a new workflow data object which can be used to execute a series of apps, applets, and/or workflows.
Inputs
- project string ID of the project or container to which the workflow should belong, such as "project-xxxx"
- name string (optional, default is the new ID) The name of the object
- title string or null (optional, default null) Title of the workflow, for example, "Micro Map Pipeline". If null, then the name of the workflow is used as the title
- summary string (optional, default "") A short description of the workflow
- description string (optional, default "") A longer description about the workflow
- outputFolder string (optional) The default output folder for the workflow. See the Customizing Output Folders section above for more details on how it interacts with stages' output folders
- tags array of strings (optional) Tags to associate with the object
- types array of strings (optional) Types to associate with the object
- hidden boolean (optional, default false) Whether the object should be hidden
- properties mapping (optional) Properties to associate with the object
  - key Property name
  - value string Property value
- details mapping or array (optional, default { }) JSON object or array that is to be associated with the object. See the Object Details section for details on valid input
- folder string (optional, default "/") Full path of the folder that is to contain the new object
- parents boolean (optional, default false) Whether all folders in the path provided in folder should be created if they do not exist
- inputs array of mappings (optional) An input specification of the workflow as described in the Input Specification section
- outputs array of mappings (optional) An output specification of the workflow as described in the Output Specification section, with an additional field specifying outputSource. See the Workflow Output section for details
- initializeFrom mapping (optional) Indicate an existing workflow or analysis from which to use the metadata as default values for all fields that are not given:
  - id string ID of the workflow or analysis from which to retrieve workflow metadata
  - project string (required for workflow IDs and ignored otherwise) ID of the project in which the workflow specified in id should be found
- stages array of mappings (optional) Stages to add to the workflow. If not supplied, the workflow that is created is empty. Each value is a mapping with the key/values:
  - id string ID that uniquely identifies the stage. See the section on Stage ID and Name for more information
  - executable string ID of the app or applet to be run in this stage
  - name string (optional) Name (display label) for the stage
  - folder string (optional, default null) The output folder into which outputs should be cloned for the stage. See the Customizing Output Folders section above for more details
  - input mapping (optional) The inputs to this stage to be bound. See the section on Binding Input for more information
    - key Input field name
    - value Input field value
  - executionPolicy mapping (optional) A collection of options that govern automatic job restart on certain types of failures. This can only be set at the user-level API call (jobs cannot override this for their subjobs). Contents of this field override any of the corresponding keys in the executionPolicy mapping found in the executable's run specification (if present). Includes the following optional key/values:
    - restartOn mapping (optional) Indicate a job restart policy
      - key A restartable failure reason (ExecutionError, UnresponsiveWorker, JMInternalError, AppInternalError, JobTimeoutExceeded, or SpotInstanceInterruption) or * to indicate all restartable failure reasons that are otherwise not present as keys
      - value int Maximum number of restarts for the failure reason
    - maxRestarts int (optional, default 9) Non-negative integer less than 10, indicating the maximum number of times that the job is restarted
    - onNonRestartableFailure string (optional, default failStage) Either the value failStage or failAllStages. Indicates whether the failure of this stage (when run as part of an analysis) should force all other non-terminal stages in the analysis to fail as well if a non-restartable failure occurs, even if those stages do not have any dependencies on this stage. (Stages that have dependencies on this stage still fail irrespective of this setting.)
  - systemRequirements mapping (optional) Request specific resources for the stage's executable. See the Requesting Instance Types section for more details
- ignoreReuse array of strings (optional) Specifies IDs of workflow stages (or "*" for all stages) that ignore job reuse. If a specified stage points to a nested sub-workflow, reuse is ignored recursively by the whole nested sub-workflow. Overrides the ignoreReuse setting in stage executables
- nonce string (optional) Unique identifier for this request. Ensures that even if multiple requests fail and are retried, only a single workflow is created. For more information, see Nonces
- treeTurnaroundTimeThreshold integer (optional; defaults to the treeTurnaroundTimeThreshold field of the initializeFrom workflow if the billTo of the project has jobNotifications enabled and initializeFrom is given, and is otherwise not set) The turnaround time threshold (in seconds) for trees (specifically, root executions) that run this executable. See Job Notifications for more information about turnaround time and managing job notifications
Outputs
- id string ID of the created workflow object, such as "workflow-xxxx"
- editVersion int The initial edit version number of the workflow object
Errors
InvalidInput
A reserved linking string ($dnanexus_link) appears as a key in a hash in details but is not the only key in the hash
A reserved linking string ($dnanexus_link) appears as the only key in a hash in details but has a value other than a string
The id given under initializeFrom is not a valid workflow or analysis ID
"project" is missing if the id given under initializeFrom is a workflow ID
For each property key-value pair, the size, encoded in UTF-8, of the property key may not exceed 100 bytes and the property value may not exceed 700 bytes
A nonce was reused in a request but other inputs had changed, signifying a new and different request
A nonce may not exceed 128 bytes
InvalidType
The project is not a valid project ID
PermissionDenied
CONTRIBUTE access required
VIEW access required for the project specified under initializeFrom if a workflow or analysis was specified
ResourceNotFound
The specified project is not found
The path in folder does not exist while parents is false
A project, workflow, or analysis ID specified in initializeFrom is not found
A stage in ignoreReuse is not found
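Putting these inputs together, a minimal sketch of a /workflow/new request (all IDs and field names hypothetical) that creates a two-stage workflow with a cross-stage link:

{
  "project": "project-xxxx",
  "title": "Exome Pipeline",
  "stages": [
    {
      "id": "map_reads",
      "executable": "applet-xxxx",
      "folder": "mappings"
    },
    {
      "id": "call_variants",
      "executable": "applet-yyyy",
      "input": {
        "mappings": {
          "$dnanexus_link": {
            "stage": "map_reads",
            "outputField": "mappings"
          }
        }
      }
    }
  ]
}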
API method: /workflow-xxxx/overwrite
Overwrites the workflow with the workflow-specific metadata (other than the editVersion) from another workflow or analysis. The workflow's name, tags, properties, types, visibility, and details are left unchanged.
Inputs
- editVersion int The edit version number that was last observed, either via /workflow-xxxx/describe or as output from an API call that changed the workflow. This value must match the current version stored in the workflow object for the API call to succeed
- from mapping Indicate the existing workflow or analysis from which to use the metadata
  - id string ID of the workflow or analysis from which to retrieve workflow metadata
  - project string (required for workflow IDs and ignored otherwise) ID of the project in which the workflow specified in id should be found
Outputs
- id string ID of the manipulated workflow
- editVersion int The new edit version number
Errors
InvalidInput
Input is not a hash
editVersion is not an integer
from is not a hash
from.id is not a string
from.project is not a string if from.id is a workflow ID
ResourceNotFound
The specified workflow does not exist
The workflow or analysis specified in from cannot be found
InvalidState
Workflow is not in the "open" state
editVersion provided does not match the current stored value
PermissionDenied
User does not have CONTRIBUTE access to the workflow's project
User does not have VIEW access to the project containing the workflow or analysis represented in from
API method: /workflow-xxxx/addStage
Adds a stage to the workflow.
Inputs
- editVersion int The edit version number that was last observed, either via /workflow-xxxx/describe or as output from an API call that changed the workflow. This value must match the current version stored in the workflow object for the API call to succeed
- id string (optional) ID that uniquely identifies the stage. If not provided, a system-generated stage ID is set. See the section on Stage ID and Name for more information
- executable string App or applet ID
- name string or null (optional, default null) Name (display label) for the stage, or null to indicate no name
- folder string (optional, default null) The output folder into which outputs should be cloned for the stage. See the Customizing Output Folders section above for more details
- input mapping (optional) A subset of the inputs to this stage to be bound. See the section on Binding Input for more information
  - key Input field name
  - value Input field value
- executionPolicy mapping (optional) A collection of options that govern automatic job restart on certain types of failures. This can only be set at the user-level API call (jobs cannot override this for their subjobs). Contents of this field override any of the corresponding keys in the executionPolicy mapping found in the executable's run specification (if present). Includes the following optional key/values:
  - restartOn mapping (optional) Indicate a job restart policy
    - key A restartable failure reason (ExecutionError, UnresponsiveWorker, JMInternalError, AppInternalError, JobTimeoutExceeded, or SpotInstanceInterruption) or * to indicate all restartable failure reasons that are otherwise not present as keys
    - value int Maximum number of restarts for the failure reason
  - maxRestarts int (optional, default 9) Non-negative integer less than 10, indicating the maximum number of times that the job is restarted
  - onNonRestartableFailure string (optional, default failStage) Either the value failStage or failAllStages. Indicates whether the failure of this stage (when run as part of an analysis) should force all other non-terminal stages in the analysis to fail as well if a non-restartable failure occurs, even if those stages do not have any dependencies on this stage. (Stages that have dependencies on this stage still fail irrespective of this setting.)
- systemRequirements mapping (optional) Request specific resources for the stage's executable. See the Requesting Instance Types section for more details
Outputs
- id string ID of the manipulated workflow
- editVersion int The new edit version number
- stage string ID of the new stage
Errors
InvalidInput
Input is not a hash
editVersion is not an integer
executable is not a string
name, if provided, is not a string
folder, if provided, is not a valid folder path
input, if provided, is not a hash or is not valid input for the specified executable
executionPolicy, if provided, is not a hash
executionPolicy.restartOn, if provided, is not a hash, contains a failure reason key that cannot be restarted, or contains a value which is not an integer between 0 and 9
executionPolicy.onNonRestartableFailure is not one of the allowed values
ResourceNotFound
The specified workflow does not exist
The specified executable does not exist
A provided input value in input could not be found
InvalidState
Workflow is not in the "open" state
editVersion provided does not match the current stored value
PermissionDenied
User does not have CONTRIBUTE access to the workflow's project
An accessible copy of the executable could not be found
API method: /workflow-xxxx/removeStage
Removes a stage from the workflow.
Inputs
- editVersion int The edit version number that was last observed, either via /workflow-xxxx/describe or as output from an API call that changed the workflow. This value must match the current version stored in the workflow object for the API call to succeed
- stage string ID of the stage to remove
Outputs
- id string ID of the manipulated workflow
- editVersion int The new edit version number
Errors
InvalidInput
Input is not a hash
editVersion is not an integer
stage is not a string
ResourceNotFound
The specified workflow does not exist
The specified stage does not exist in the workflow
InvalidState
Workflow is not in the "open" state
editVersion provided does not match the current stored value
PermissionDenied
User does not have CONTRIBUTE access to the workflow's project
API method: /workflow-xxxx/moveStage
Reorders the stages by moving a specified stage to a new index or position in the workflow. This does not affect how the stages are run but is merely for personal preference and organization.
Inputs
- editVersion int The edit version number that was last observed, either via /workflow-xxxx/describe or as output from an API call that changed the workflow. This value must match the current version stored in the workflow object for the API call to succeed
- stage string ID of the stage to move
- newIndex int The index that the stage occupies after the move. All other stages are moved to accommodate this change. Must be in [0, n), where n is the total number of stages
Outputs
- id string ID of the manipulated workflow
- editVersion int The new edit version number
Errors
InvalidInput
Input is not a hash
editVersion is not an integer
stage is not a string
newIndex is not in the range [0, n), where n is the number of stages in the workflow
ResourceNotFound
The specified workflow does not exist
The specified stage does not exist in the workflow
InvalidState
Workflow is not in the "open" state
editVersion provided does not match the current stored value
PermissionDenied
User does not have CONTRIBUTE access to the workflow's project
API method: /workflow-xxxx/update
Updates the workflow with any fields that are provided.
Inputs
- editVersion int The edit version number that was last observed, either via /workflow-xxxx/describe or as output from an API call that changed the workflow. This value must match the current version stored in the workflow object for the API call to succeed
- title string or null (optional) The workflow's title, for example, "Micro Map Pipeline". If null, the name of the workflow is used as the title
- summary string (optional) A short description of the workflow
- description string (optional) A longer description about the workflow
- outputFolder string or null (optional) The default output folder for the workflow, or null to unset. See the Customizing Output Folders section above for more details on how it interacts with stages' output folders
- inputs array of mappings or null (optional) An input specification of the workflow as described in the Input Specification section
- outputs array of mappings or null (optional) An output specification of the workflow as described in the Output Specification section, with an additional field specifying outputSource. See the Workflow Output section for details
- stages mapping (optional) Updates for one or more of the workflow's stages
  - key ID of the stage to update
  - value mapping Updates to make to the stage:
    - name string or null (optional) New name for the stage. Use null to unset the name
    - folder string or null (optional) The output folder into which outputs should be cloned for the stage. See the Customizing Output Folders section above for more details. Use null to unset the folder
    - input mapping (optional) A subset of the inputs to this stage to be bound or unbound (using null to unset a previously-bound input). See the section on Binding Input for more information
      - key Input field name from this stage's executable
      - value Input field value, or null to unset
    - executionPolicy mapping (optional) Set the default execution policy for this stage. Use the empty mapping { } to unset
    - systemRequirements mapping (optional) Request specific resources for the stage's executable. See the Requesting Instance Types section for more details. Use the empty mapping { } to unset
    - inputSpecMods mapping (optional) Updates to how the stage input specification is exported for the workflow. Any subset can be provided
      - key Input field name from this stage's executable
      - value mapping Updates for the specified stage input field name:
        - name string or null (optional) The canonical name by which a stage's input can be addressed when running the workflow is of the form "<stage ID>.<original field name>". By providing a different string here, you override the name as shown in the inputSpec of the workflow, and it can be used when giving input to run the workflow. The canonical name can still be used to refer to this input, but both names cannot be used simultaneously. If null is provided, any previously-set name is dropped and only the canonical name can be used
        - label string or null (optional) A replacement label for the input parameter. If null is provided, any previously-set label is dropped and the original executable's label is used
        - help string or null (optional) A replacement help string for the input parameter. If null is provided, any previously-set help string is dropped and the original executable's help string is used
        - group string or null (optional) A replacement group for the input parameter. The default group for a stage's input is the stage's ID (if it had no group in the executable), or the string "<stage ID>:<group name>" (if it was part of a group in the executable). By providing a different string here, you override the group in which the input parameter appears in the inputSpec of the workflow. If null is provided, any previously-set group value is dropped and the canonical group name is used. If the empty string is provided, the parameter is not in any group
        - hidden boolean (optional) Whether to hide the input parameter from the inputSpec of the workflow. The input can still be provided and overridden by its name "<stage ID>.<original field name>"
    - outputSpecMods mapping (optional) Updates to how the stage output specification is exported for the workflow. Any subset can be provided. This field follows the same syntax as inputSpecMods defined above and behaves roughly the same but modifies outputSpec instead. The exception in behavior occurs for the hidden field: if an output has hidden set to true, its data object value (if applicable) is not cloned into the parent container when the stage or analysis is done. This may be a useful feature if a stage in your analysis produces many intermediate outputs that are not relevant to the analysis or are not ultimately useful once the analysis has finished
Outputs
- id string ID of the manipulated workflow
- editVersion int The new edit version number
Errors
InvalidInput
Input is not a hash
editVersion is not an integer
title, if provided, is not a string nor null
summary, if provided, is not a string
description, if provided, is not a string
stages, if provided, is not a hash
A key in stages is not a stage ID string
name, if provided in a stage hash, is not a string
folder, if provided in a stage hash, is not a valid folder path
input, if provided in a stage hash, is not a hash or is not valid input for the specified executable
inputSpecMods or outputSpecMods, if provided in a stage hash, is not a hash or contains a key which does not abide by the syntax specification above
ResourceNotFound
The specified workflow does not exist
One of the specified stage IDs could not be found in the workflow
A provided input value in an input hash in a stage's hash could not be found
InvalidState
Workflow is not in the "open" state
editVersion provided does not match the current stored value
PermissionDenied
User does not have CONTRIBUTE access to the workflow's project
API method: /workflow-xxxx/isStageCompatible
Checks whether the proposed replacement executable for a stage would be a fully compatible replacement.
Inputs
- editVersion int The edit version number that was last observed, either via /workflow-xxxx/describe or as output from an API call that changed the workflow. This value must match the current version stored in the workflow object for the API call to succeed
- stage string ID of the stage to check for compatibility
- executable string ID of the executable that would be used as a replacement
Outputs
- id string ID of the workflow that was checked for compatibility
- compatible boolean The value true if it is compatible and false otherwise
If compatible is false, the following key is also present:
- incompatibilities array of strings A list of reasons for which the two executables are not compatible
Errors
InvalidInput
Input is not a hash
editVersion is not an integer
stage is not a string
executable is not a string
The given executable is missing an input or output specification
ResourceNotFound
The specified workflow does not exist
The specified stage does not exist in the workflow
The specified executable does not exist
InvalidState
Workflow is not in the "open" state
editVersion provided does not match the current stored value
PermissionDenied
User does not have VIEW access to the workflow's project
An accessible copy of the executable could not be found
API method: /workflow-xxxx/updateStageExecutable
Updates the executable to be run in one of the workflow's stages.
Inputs
- editVersion int The edit version number that was last observed, either via /workflow-xxxx/describe or as output from an API call that changed the workflow. This value must match the current version stored in the workflow object for the API call to succeed
- stage string ID of the stage to update with the executable
- executable string ID of the executable to use for the stage
- force boolean (optional, default false) Whether to update the executable even if the one specified in executable is incompatible with the one that is in use for the stage
Outputs
- id string ID of the workflow
- editVersion int The new edit version number
- compatible boolean Whether executable was compatible. If false, then further action (such as setting new inputs) may need to be taken to run the workflow
If compatible is false, the following is also present:
- incompatibilities array of strings A list of reasons for which the two executables are not compatible
Errors
InvalidInput
Input is not a hash
editVersion is not an integer
stage is not a string
executable is not a string
The given executable is missing an input or output specification
force is not a boolean
ResourceNotFound
The specified workflow does not exist
The specified stage does not exist in the workflow
The specified executable does not exist
InvalidState
Workflow is not in the "open" state
editVersion provided does not match the current stored value
The requested executable is not compatible with the previous executable and force was not set to true
PermissionDenied
User does not have CONTRIBUTE access to the workflow's project
An accessible copy of the executable could not be found
API method: /workflow-xxxx/describe
Describes the specified workflow object.
Alternatively, you can use the /system/describeDataObjects method to describe many data objects at once.
Inputs
- project string (optional) Project or container ID to be used as a hint for finding an accessible copy of the object
- defaultFields boolean (optional, default false if fields is supplied, true otherwise) Whether to include the default set of fields in the output (the default fields are described in the "Outputs" section below). The selections are overridden by any fields explicitly named in fields
- fields mapping (optional) Include or exclude the specified fields from the output. These selections override the settings in defaultFields
  - key Desired output field. See the "Outputs" section below for valid values here
  - value boolean Whether to include the field
- includeHidden boolean (optional, default false) Whether hidden input and output parameters should appear in the inputSpec and outputSpec fields
- getRerunInfo boolean (optional, default false) Whether rerun information should be returned for each stage
- rerunStages array of strings (optional) Applicable only if getRerunInfo is set to true. A set of stage IDs that would be forcibly rerun, for which rerun information is returned accordingly
- rerunProject string (optional, default is the value of project returned) Project ID to use for retrieving rerun information

The following options are deprecated (and are not respected if fields is present):

- properties boolean (optional, default false) Whether the properties should be returned
- details boolean (optional, default false) Whether the details should also be returned
Outputs
- id string The object ID, such as "workflow-xxxx"
The following fields are included by default (but can be disabled using fields or defaultFields):
- project string ID of the project or container in which the object was found
- class string The value "workflow"
- types array of strings Types associated with the object
- created timestamp Time at which this object was created
- state string Either "open" or "closed"
- hidden boolean Whether the object is hidden or not
- links array of strings The object IDs that are pointed to from this object
- name string The name of the object
- folder string The full path to the folder containing the object
- sponsored boolean Whether the object is sponsored by DNAnexus
- tags array of strings Tags associated with the object
- modified timestamp Time at which the user-provided metadata of the object was last modified
- createdBy mapping How the object was created
  - user string ID of the user who created the object or launched an execution which created the object
  - job string (present if a job created the object) ID of the job that created the object
  - executable string (present if a job created the object) ID of the app or applet that the job was running
- title string The workflow's effective title (always equals name if it has not been set to a string)
- summary string The workflow's summary
- description string The workflow's description
- outputFolder string or null The default output folder for the workflow, or null if unset. See the Customizing Output Folders section above for more details on how it interacts with stages' output folders
- inputSpec array of mappings, or null The value is null for inaccessible stage executables. Otherwise, the value is the effective input specification for the workflow. This is generated automatically, taking into account the stages' input specifications and any modifications that have been made to them in the context of the workflow (see the field inputSpecMods under the specification for the /workflow-xxxx/update API method). If not otherwise modified via the API, the group name of an input field is transformed to include a prefix using its stage ID. Hidden parameters are not included unless requested via includeHidden; they have a flag hidden set to true. Bound inputs always show up as default values for the respective input fields
- outputSpec array of mappings, or null The value is null if a stage's executable is inaccessible. Otherwise, the value is the effective output specification for the workflow. This is generated automatically, taking into account the stages' output specifications and any modifications that have been made to them in the context of the workflow (see the field outputSpecMods under the specification for the /workflow-xxxx/update API method). Hidden parameters are not included unless requested via includeHidden; they have a flag hidden set to true
- inputs array of mappings, or null Input specification of the workflow (not the input of particular stages, which is returned in inputSpec)
- outputs array of mappings, or null Output specification of the workflow (not the output of stages, which is returned in outputSpec)
- editVersion int The current edit version of the workflow. This value must be provided with any of the workflow-editing API methods to ensure that simultaneous edits are not occurring
- ignoreReuse array of strings, or null Workflow stage IDs that are configured to ignore job reuse
- stages array of mappings List of metadata for each stage. Each value is a mapping with the key/values:
  - id string Stage ID
  - executable string App or applet ID
  - name string or null Name of the stage, or null if not set
  - folder string or null The output folder into which outputs should be cloned for the stage, or null if not set. See the Customizing Output Folders section above for more details
  - input mapping Input (possibly partial) to the stage's executable that has been bound
  - accessible boolean Whether the executable is accessible
  - executionPolicy mapping The default execution policy for this stage
  - systemRequirements mapping The requested systemRequirements value for the stage
  - inputSpecMods mapping Modifications for the stage's input parameters when represented in the workflow's input specification
    - key Input parameter name from this stage's executable
    - value mapping Modifications for the input parameter:
      - name string (present if set) Replacement name of the input parameter. This is guaranteed to be unique in the stage's input specification
      - label string (present if set) Replacement label for the input parameter
      - help string (present if set) Replacement help string for the input parameter
      - group string The group to which the input parameter belongs (the empty string indicates no group)
      - hidden boolean (present if true) Whether the input field is hidden from the workflow's input specification
  - outputSpecMods mapping Modifications for restricting the stage's output and representing its output
    - key Output parameter name from this stage's executable
    - value mapping Modifications for the output parameter, with any number of the same key/values that are also present in inputSpecMods. If an output has hidden set to true, its data object value (if applicable) is not cloned into the parent container when the stage or analysis is done, and is deleted immediately on completion or failure of the analysis if delayWorkspaceDestruction is not set to true
  - If getRerunInfo is true, the following keys are also present:
    - wouldBeRerun boolean Whether the stage would be rerun if the workflow were to be run (taking into account the value given for rerunStages, if applicable)
    - cachedExecution string (present if wouldBeRerun is false) The job ID from which the outputs would be used
    - cachedOutput mapping or null (present if wouldBeRerun is false) The output from the cached execution if available, or null if the execution has not finished yet
- initializedFrom mapping (present if the workflow was created using the initializeFrom option) Basic metadata recording how this workflow was created
  - editVersion int (present if id is a workflow ID) The editVersion of the original workflow at the time of creation
- treeTurnaroundTimeThreshold integer or null The turnaround time threshold (in seconds) for trees (specifically, root executions) that run this executable. See Job Notifications for more information about turnaround time and managing job notifications
The following field (included by default) is available if the object is sponsored by a third party:
- sponsoredUntil timestamp Indicates the expiration time of data sponsorship (this field is only set if the object is sponsored, and if set, the specified time is always in the future)
The following fields are only returned if the corresponding field in the fields input is set to true:
- properties mapping Properties associated with the object
  - key Property name
  - value string Property value
- details mapping or array Contents of the object's details
Errors
ResourceNotFound
The specified object does not exist or the specified project does not exist
InvalidInput
The input is not a hash
project (if present) is not a string
The value of properties (if present) is not a boolean
includeHidden (if present) is not a boolean
getRerunInfo (if present) is not a boolean
rerunStages (if present) is not an array of nonempty strings
InvalidType
rerunProject (if present) is not a project ID
PermissionDenied
VIEW access is required for the project input if provided
VIEW access is required for some project containing the specified object, which may be different from the project input provided
API method: /workflow-xxxx/run
Runs the workflow, creating an analysis. All inputs must be provided, either as bound inputs in the workflow or separately in the input field.
Intermediate results are output for the stages and outputs specified.
If any stages have been previously run with the same executable and the same inputs, then the previous results may be used.
Inputs
namestring (optional, default is the workflow name) Name for the resulting analysisinputmapping Input for the analysis is launched withkey Input field name. See the
inputSpecandinputsfields in the output of/workflow-xxxx/describefor what the names of the inputs arevalue Input field value
projectstring (required if invoked by a user. Optional if invoked from a job withdetach: trueoption. Prohibited when invoked from a job withdetach: false) The ID of the project in which this workflow runs, also known as the project context. If invoked with thedetach: trueoption, then the detached analysis runs under the providedproject(if provided), otherwise project context is inherited from that of the invoking job. If invoked by a user or run as detached, all output objects are cloned into the project context. Otherwise, all output objects are cloned into the temporary workspace of the invoking job. For more information, see The Project Context and Temporary Workspace.
folderstring (optional) The folder into which objects output by the analysis are placed. If the folder does not exist when the job is complete, the folder is created, along with any parent folders necessary. See the Customizing Output Folders section above for more details on how it interacts with stages' output folders. If no value is provided here and the workflow does not haveoutputFolderset, then the default value is "/".stageFoldersmapping (optional) Override any stored options for the workflow stages'folderfields. See the Customizing Output Folders section for more detailskey Stage ID or "*" to indicate that the value should be applied to all stages not otherwise mentioned
value null or string Value to replace the stored default
detailsarray or mapping (optional, default { }) Any conformant JSON, which is defined as a JSON object or array per RFC4627. This is stored with the created job.delayWorkspaceDestructionboolean (optional) If not given, the value defaults to false for root executions (launched by a user or detached from another job), or to the parent'sdelayWorkspaceDestructionsetting. If set to true, the temporary workspace created for the resulting execution is preserved for 3 days after the job either succeeds or fails.rerunStagesarray of strings (optional) A list of stage IDs that should be forcibly rerun. The system automatically identifies stages requiring rerun, and this parameter adds specific stages to that list. If the list includes the string "*", then all stages are rerunexecutionPolicymapping (optional) A collection of options that govern automatic job restart on certain types of failures. This can only be set at the user-level API call (jobs cannot override this for their subjobs). Contents of this field override any of the corresponding keys in theexecutionPolicymapping found in individual stages and their executables' run specification (if present). Includes the following optional key/values:restartOnmapping (optional) Indicate a job restart policykey A restartable failure reason (
ExecutionError,UnresponsiveWorker,JMInternalError,AppInternalError,JobTimeoutExceeded, orSpotInstanceInterruption) or*to indicate all restartable failure reasons that are otherwise not present as keysvalue int Maximum number of restarts for the failure reason
maxRestartsint (optional, default 9) Non-negative integer less than 10, indicating the maximum number of times that the job is restartedonNonRestartableFailurestring (optional) If unset, allows the stages to govern their failure propagation behavior. If set, must be either the valuefailStageorfailAllStages, indicating whether the failure of any stage should propagate failure to all other non-terminal stages in the analysis, even if those stages do not have any dependencies on the failed stage. (Stages that have dependencies on the stage that failed still fail irrespective of this setting)
systemRequirementsmapping (optional) Request specific resources for all stages not explicitly specified instageSystemRequirements. Values are merged with stages' stored values as described in the System Requirements section. See the Requesting Instance Types section for more detailsstageSystemRequirementsmapping (optional) Request specific resources by stage. Values are merged with stages' stored values as described in the System Requirements sectionkey Stage ID
value mapping Value to override or merge with the stage's
systemRequirementsvalue
systemRequirementsByExecutablemapping (optional) Request system requirements for all jobs in the resulting execution tree, configurable by executable and by entry point, described in more detail in the Requesting Instance Types section.timeoutPolicyByExecutablemapping (optional) The timeout policies for jobs in the resulting job execution tree, configurable by executable and the entry point within that executable. See thetimeoutPolicyByExecutablefield in /applet-xxxx/run for more details.allowSSHarray of strings (optional, default [ ]) Array of IP addresses or CIDR blocks (up to /16) from which SSH access is allowed to the user by the worker running this job. Array may also include '*' which is interpreted as the IP address of the client issuing this API call as seen by the API server.debugmapping (optional, default { }) Specify debugging options for running the executable. This field is only accepted when this call is made by a user (and not a job)debugOnarray of strings (optional, default [ ]) Array of job errors after which the job's worker should be kept running for debugging purposes, offering a chance to SSH into the worker before worker termination (assuming SSH has been enabled). This option applies to all jobs in the execution tree. Jobs in this state for longer than 2 days are automatically terminated but can be terminated earlier. Allowed entries includeExecutionError,AppError, andAppInternalError.
editVersionint (optional) If provided, run the workflow only if the current version matches the provided value and throw an error if it does not match. If not provided, the current version is run.propertiesmapping (optional) Properties to associate with the resulting analysis.key Property name
value string Property value
tagsarray of strings (optional) Tags to associate with the resulting analysis.singleContextboolean (optional) If true then the resulting jobs and their descendants are only allowed to use the authentication token given to them at the onset. Use of any other authentication token results in an error. This option offers extra security to ensure data cannot leak out of your given context. In restricted projects user-specified value is ignored, andsingleContext: truesetting is used instead.ignoreReusearray of strings (optional) Specifies ids of workflow stages (or "*" for all stages) that ignore job reuse. If a specified stage points to a nested sub-workflow, reuse is ignored recursively by the whole nested sub-workflow. OverridesignoreReusesetting in the workflow and in stage executables.noncestring (optional) Unique identifier for this request. Ensures that even if multiple requests fail and are retried, only a single analysis is created. For more information, see Nonces.detachboolean (optional) This option has no impact when the API is invoked by a user. If invoked from a job with detach set to true, the new analysis is detached from the creator job and appears as a typical root execution. A failure in the detached analysis does not cause a termination of the job from which it was created and vice versa. Detached job inherits neither the access to the workspace of its creator job nor the creator job's priority. Detached analysis' access permissions are the intersection (most restricted) of access permissions of the creator job and the permissions requested by jobs' executables in the detached analysis. To launch the detached analysis, creator job must have CONTRIBUTE or higher access to the project in which the detached job is launched. The billTo of the project in which the creator job is running must have a license to launch detached executions.
rankinteger (optional) An integer between -1024 and 1023, inclusive. The rank indicates the priority in which the executions generated from this executable are processed. The higher the rank, the more prioritized it is. If no rank is provided, the executions default to a rank of zero. If the execution is not a root execution, it inherits its parent's rank. If a rank is provided, all executions relating to the workflow stages also inherit the rank.
costLimit float (optional) The limit of the cost that this execution tree may accrue before termination. This field is ignored if this is not a root execution.
preserveJobOutputs mapping (optional, default null) Preserves all cloneable outputs of every completed, non-jobReused job in the execution tree launched by this API call in the root execution project, even if the root execution ends up failing. Preserving the job outputs in the project trades off higher storage costs for the possibility of subsequent job reuse. When a non-jobReused job in a root execution tree launched with non-null preserveJobOutputs enters the "done" state, all cloneable objects referenced by $dnanexus_link in the job's output field are cloned to the project folder described by preserveJobOutputs.folder, unless the output objects already appear elsewhere in the project. Cloneable objects include files, records, applets, and closed workflows, but not databases. If the folder specified by preserveJobOutputs.folder does not exist in the project, the system creates the folder and its parents. As the root job or root analysis' stages complete, the regular outputs of the root execution are moved from preserveJobOutputs.folder to the regular output folders of the root execution. For example, if you run a root execution to completion without the preserveJobOutputs option, its outputs appear in the project in the root execution's output folders. If you had run the same execution with preserveJobOutputs.folder set to "/pjo_folder", the same set of outputs would appear in the same set of root execution folders at completion, and any additional job outputs that are not outputs of the root execution would appear in "/pjo_folder". The preserveJobOutputs argument can be specified only when starting a root execution or a detached job. The preserveJobOutputs value, if not null, should be a mapping that may contain the following:
key "folder" string (optional)
value path_to_folder string (required if "folder" key is specified) Specifies a folder in the root execution project where the outputs of jobs that are part of the launched execution are stored. A path_to_folder starting with / is interpreted as an absolute folder path in the project the job is running in. A path_to_folder not starting with / is interpreted as a path relative to the root execution's folder field. An empty string path_to_folder value ("") preserves job outputs in the folder described by the root execution's folder field. If the preserveJobOutputs mapping does not have a folder key, the system uses the default folder value of "intermediateJobOutputs". For example, "preserveJobOutputs": {} is equivalent to "preserveJobOutputs": {"folder": "intermediateJobOutputs"}. It is recommended to place preserveJobOutputs outputs for different root executions into different folders so as not to create a single folder with a large (>450K) number of files.
detailedJobMetrics boolean (optional) Requests detailed metrics collection for jobs if set to true. The default value for this flag is the project billTo's detailedJobMetricsCollectDefault policy setting, or false if the org default is not set. This flag can be specified for root executions and applies to all jobs in the root execution. The list of detailed metrics, collected every 60 seconds and viewable for 15 days from the start of a job, is available using dx watch --metrics.
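To illustrate how these options combine, here is a sketch of a /workflow-xxxx/run request body using several of the fields above (all IDs, folder names, and values are hypothetical placeholders):
{
  "project": "project-xxxx",
  "folder": "/results",
  "input": {"stage_0.reads": {"$dnanexus_link": "file-xxxx"}},
  "editVersion": 5,
  "tags": ["nightly"],
  "properties": {"run_group": "batch_42"},
  "rank": 10,
  "costLimit": 25.0,
  "preserveJobOutputs": {"folder": "/pjo_folder"},
  "detailedJobMetrics": true
}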
Outputs
id string ID of the created analysis object, such as "analysis-xxxx".
stages array of strings List of job IDs that are created for each stage, as ordered in the workflow.
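A successful call returns a body of the following shape (IDs hypothetical):
{
  "id": "analysis-xxxx",
  "stages": ["job-aaaa", "job-bbbb", "job-cccc"]
}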
Errors
ResourceNotFound
The specified workflow object, any referenced apps or applets, or project context does not exist
PermissionDenied
VIEW access to the workflow and VIEW access to all applets are required, and any apps must be installed
CONTRIBUTE access to the project context required unless called by a job
When specifying the allowSSH or debug options, the user must have developer access to all apps in the workflow, or the apps must have the openSource field set to true
If preserveJobOutputs is not null, the billTo of the project where execution is attempted must have the preserveJobOutputs license
A detailedJobMetrics setting of true requires the project's billTo to have the detailedJobMetrics license feature set to true
app{let}-xxxx cannot run in project-xxxx because the executable's httpsApp.shared_access should be NONE to run with isolated browsing. This check applies to all workflow stages.
InvalidInput
The workflow spec is not complete
The project context must be in the same region as this workflow
All data object inputs that are specified directly must be in the same region as the project context.
All inputs that are job-based object references must refer to a job that was run in the same region as the project context.
allowSSH accepts only IP addresses or CIDR blocks up to /16
A nonce was reused in a request but other inputs had changed, signifying a new and different request
A nonce may not exceed 128 bytes
The billTo of the job's project must be licensed to start detached executions when invoked from a job with the detach: true argument
preserveJobOutputs is specified when launching a non-detached execution from a job
The preserveJobOutputs.folder value is a syntactically invalid path to a folder
detailedJobMetrics cannot be specified when launching a non-detached execution from a job
timeoutPolicyByExecutable for all executables should not be null
timeoutPolicyByExecutable for all entry points of all executables should not be null
timeoutPolicyByExecutable for all entry points of all executables should not exceed 30 days
Expected key timeoutPolicyByExecutable.* of input to match /^(app|applet)-[0-9A-Za-z]{24}$/
InvalidState
editVersion was provided and does not match the currently stored value
For InvalidInput errors that result from a mismatch with an applet or app's input specification, an additional field is provided in the error JSON (see the documentation for /applet-xxxx/run for more details).
API method: /workflow-xxxx/dryRun
Perform a dry run of the /workflow-xxxx/run API method.
No new jobs or analyses are created by this method. Any analysis and job IDs returned in the response (except for cached execution IDs) are placeholders and do not represent actual entities in the system.
This method can be used to determine which stages have previous results that would be used. In particular, a stage that would reuse a cached result has a parentAnalysis field (found at stages.N.execution.parentAnalysis where N is the index of the stage) that refers to a preexisting analysis and therefore does not match the top-level field id in the response.
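For example, a dry-run response in which the first stage would reuse a cached result might contain a fragment like the following (IDs hypothetical); note that stages.0.execution.parentAnalysis refers to a preexisting analysis and differs from the placeholder top-level id:
{
  "id": "analysis-xxxx",
  "stages": [
    {
      "id": "stage_0",
      "execution": {
        "id": "job-aaaa",
        "parentAnalysis": "analysis-yyyy"
      }
    }
  ]
}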
Inputs
Same as would be provided to /workflow-xxxx/run
Outputs
Same as the output if the resulting analysis had been described (see /analysis-xxxx/describe)
Errors
Same as would be thrown if /workflow-xxxx/run had been called with the same input
API method: /workflow-xxxx/validateBatch
This API call verifies that a set of input values for a particular workflow can be used to launch a batch of jobs in parallel.
Batch and common inputs:
batchInput: mapping of inputs corresponding to batches. The value at each position in each array corresponds to the execution of the workflow at that position. Including a null value in an array at a given position means that the corresponding workflow input field is optional and the default value, if defined, should be used. E.g.:
{
  "stage_0.a": [{"$dnanexus_link": "file-xxxx"}, {"$dnanexus_link": "file-yyyy"}, ...],
  "stage_1.b": [1, null, ...]
}
commonInput: mapping of non-batch, constant inputs common to all batch jobs, e.g.:
{
  "stage_0.c": "foo"
}
File references:
files: list of files (passed as $dnanexus_link references); must be a superset of the files included in batchInput and/or commonInput, e.g.:
[
  {"$dnanexus_link": "file-xxxx"},
  {"$dnanexus_link": "file-yyyy"}
]
Output: list of mappings, where each mapping corresponds to an expanded batch call. Each mapping contains the input values for the corresponding execution of the workflow, based on its position in the list. E.g.:
[
  {"stage_0.a": {"$dnanexus_link": "file-xxxx"}, "stage_1.b": 1, "stage_0.c": "foo"},
  {"stage_0.a": {"$dnanexus_link": "file-yyyy"}, "stage_1.b": null, "stage_0.c": "foo"}
]
The call performs the following validation:
the input types match the expected workflow input field types,
provided inputs are sufficient to run the workflow,
null values are only among values for inputs that are optional or have no specified default values,
all arrays in batchInput are of equal size,
every file referred to in batchInput exists in the files input.
If the workflow is locked, that is, workflow-level inputs are specified for the workflow, this input specification is used in place of the stage-level inputSpecs, and workflow input field names must be provided in batchInput and commonInput. This is because for locked workflows input values can only be passed to the workflow-level inputs, so input fields must be referred to by the names defined in inputs. To refer to a specific field in a stage of a non-locked workflow, use the <stage id>.<input field name defined in inputSpec> format.
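For instance, assuming a locked workflow that defines a workflow-level input named reads, and a non-locked workflow whose first stage (ID stage_0) has an input field also named reads, the batchInput keys would differ as follows (IDs hypothetical):
Locked workflow:
{
  "reads": [{"$dnanexus_link": "file-xxxx"}, {"$dnanexus_link": "file-yyyy"}]
}
Non-locked workflow:
{
  "stage_0.reads": [{"$dnanexus_link": "file-xxxx"}, {"$dnanexus_link": "file-yyyy"}]
}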
Inputs
batchInput mapping Input that the workflow is launched with
key Input field name. It must be one of the names of the inputs defined in the workflow input specification.
value Input field values. It must be an array of values, one per batch execution.
commonInput mapping (optional) Input that the workflow is launched with
key Input field name. It must be one of the names of the inputs defined in the workflow input specification.
value Input field value. It is a single value common to all batch executions.
files list (optional) Files that are needed to run the batch jobs; they must be provided as $dnanexus_links and must correspond to all the files included in commonInput or batchInput.
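Putting the pieces together, a complete /workflow-xxxx/validateBatch request for a non-locked workflow might look like this (IDs hypothetical):
{
  "batchInput": {
    "stage_0.a": [{"$dnanexus_link": "file-xxxx"}, {"$dnanexus_link": "file-yyyy"}],
    "stage_1.b": [1, null]
  },
  "commonInput": {"stage_0.c": "foo"},
  "files": [
    {"$dnanexus_link": "file-xxxx"},
    {"$dnanexus_link": "file-yyyy"}
  ]
}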
Outputs
expandedBatch list of mappings Each mapping contains the input values for one execution of the workflow in batch mode.
Errors
InvalidInput
Input specification must be specified for the workflow
Expected batchInput to be a JSON object
Expected commonInput to be a JSON object
Expected files to be an array of $dnanexus_link references to files
The batchInput field is required but an empty array was provided
Expected the value of batchInput for a workflow input field to be an array
Expected the length of all arrays in batchInput to be equal
The workflow input field value must be specified in batchInput
The workflow input field is not defined in the input specification of the workflow
All the values of a specific batchInput field must be provided (cannot be null) since the field is required and has no default value
Expected all the files in batchInput and commonInput to be referenced in the files input array
Analysis API Method Specifications
API method: /analysis-xxxx/describe
Describe the specified analysis object.
If the results from previously run jobs are used for any of the stages, they are still listed here. However, the stages' parentAnalysis field still reflects the original analyses in which they were run.
The description of an analysis may not be available if an upstream analysis has not finished running. Users with reorg apps that rely on describing the running analysis may want to check the output field dependsOn before the full analysis description becomes available, using dx describe analysis-xxx --json | jq -r .dependsOn or equivalent dxpy bindings. The output of the command is an empty array [] if the analysis no longer depends on anything (indicating a status like "done"), which is the signal to proceed. If it contains (sub)analysis IDs, the analysis is not ready, and the reorg script should wait.
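As a sketch, and assuming dependsOn can be requested via the fields input described below, a minimal describe request and a response for an analysis with no remaining dependencies might look like (IDs hypothetical):
{"fields": {"dependsOn": true}}
{
  "id": "analysis-xxxx",
  "dependsOn": []
}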
Inputs
defaultFields boolean (optional, default false if fields is supplied, true otherwise) Specifies whether to include the default set of fields in the output (the default fields are described in the "Outputs" section below). The selections are overridden by any fields explicitly named in fields.
fields mapping (optional) Include or exclude the specified fields from the output. These selections override the settings in defaultFields.
key string Desired output field. See the "Outputs" section below for valid values here
value boolean Whether to include the field
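For example, to retrieve only the state and output fields of an analysis (field names as listed under "Outputs" below):
{"fields": {"state": true, "output": true}}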
Outputs
id string The object ID, such as "analysis-xxxx".
The following fields are included by default (but can be disabled by setting defaultFields to false or by using the fields input):
class string The value "analysis"
name string Name of the analysis (either specified at creation time or given automatically by the system)
executable string ID of the workflow or the global workflow that was run
executableName string Name of the workflow or the global workflow that was run
created timestamp Time at which this object was created
modified timestamp Time at which this analysis was last updated
billTo string ID of the account to which any costs associated with this analysis are billed
project string ID of the project in which this analysis was run
folder string The output folder in which the outputs of this analysis are placed
rootExecution string ID of the job or analysis at the root of the execution tree (the job or analysis created by a user's API call rather than called by a job or as a stage in an analysis)
parentJob string or null ID of the job which created this analysis, or null if this analysis was not created by a job
parentJobTry non-negative integer or null. null is returned if this analysis was not created by a job, or if the parent job had a null try attribute. Otherwise, this analysis was created from the parentJobTry try of the parentJob.
parentAnalysis string or null If this is an analysis that was run as a stage in another analysis, then this is the ID of that analysis. Otherwise, it is null
detachedFrom string or null The ID of the job this analysis was detached from via the detach option, otherwise null
detachedFromTry non-negative integer or null. If this analysis was detached from a job, detachedFrom and detachedFromTry describe the specific try of the job this analysis was detached from. null is returned if this analysis was not detached from another job or if the detachedFrom job had a null try attribute
analysis string or null Is null if this analysis was not run as part of a stage in another analysis. Otherwise, the ID of the analysis this analysis is part of
stage string or null Is null if this analysis was not run as part of a stage in another analysis. Otherwise, the ID of the stage this analysis is part of
workflow mapping Metadata of the workflow that was run, including at least the following fields (analyses created after 8/2014 include the full describe output at the time that the analysis was created):
id string ID of the workflow
name string Name of the workflow
inputs array of mappings Input specification of the workflow
outputs array of mappings Output specification of the workflow
stages array of mappings List of metadata for each stage. See the description in /workflow-xxxx/describe for more details on what may be returned in each element of the list
editVersion int Edit version at the time of running the workflow
initializedFrom mapping If applicable, the initializedFrom mapping from the workflow
stages array of mappings List of metadata for each of the stages' executions
id string Stage ID
execution mapping with key id and value of the execution ID. Additional keys are present if the describe hash of the origin job or analysis of the stage has been requested and is available (the fields returned here can be limited by setting fields.stages in the input to the hash one would give to describe the execution)
state string The analysis state, one of in_progress, partially_failed, done, failed, terminating, and terminated
workspace string ID of the temporary workspace assigned to the analysis, such as "container-xxxx".
launchedBy string ID of the user who launched rootExecution. This is propagated to all jobs launched by the analysis
tags array of strings Tags associated with the analysis
properties mapping Properties associated with the analysis
key Property name
value string Property value
details array or mapping The JSON details that were stored with this analysis
runInput mapping The value given as input in the API call to run the workflow
originalInput mapping The effective input of the analysis, including all defaults as bound in the stages of the workflow, overridden with any values present in runInput, with all input field names translated to their canonical names, such as "<stage ID>.<field name>"
input mapping The same as originalInput
output mapping or null Is null if no stages have finished. Otherwise, contains key/value pairs for all outputs that are available (final only when state is one of done, terminated, and failed)
delayWorkspaceDestruction boolean Whether the analysis's temporary workspace is kept around for 3 days after the analysis either succeeds or fails
ignoreReuse array of strings, or null Analysis stage IDs (or "*" for all stages) that were configured to ignore job reuse
preserveJobOutputs null or a mapping with preserveJobOutputs.folder expanded to start with "/"
detailedJobMetrics boolean Set to true only if detailed job metrics collection was enabled for this analysis
costLimit float If the analysis is a root execution and has a root execution cost limit, this is the cost limit for the root execution
rank int The rank of the analysis, in the range [-1024, 1023]
If this job is a root execution, the following fields are included by default (but can be disabled using fields):
selectedTreeTurnaroundTimeThreshold integer or null The selected turnaround time threshold (in seconds) for this root execution. When treeTurnaroundTime reaches the selectedTreeTurnaroundTimeThreshold, the system sends an email about this root execution to the launchedBy user and the billTo profile.
selectedTreeTurnaroundTimeThresholdFrom string or null Where selectedTreeTurnaroundTimeThreshold is from. executable means that selectedTreeTurnaroundTimeThreshold is from this root execution's executable's treeTurnaroundTimeThreshold. system means that selectedTreeTurnaroundTimeThreshold is from the system's default threshold.
treeTurnaroundTime integer The turnaround time (in seconds) of this root execution, which is the time between its creation time and its terminal-state time (or the current time if it is not in a terminal state). Terminal states for an execution include done, terminated, and failed; see Job Lifecycle for information on them. If this root execution can be retried, the turnaround time begins at the creation time of the root execution's first try, so it includes the turnaround times of all tries.
If the requesting user has permissions to view the pricing model of the billTo of the analysis, and the price for the analysis has been finalized:
currency mapping Information about currency settings, such as dxCode, code, symbol, symbolPosition, decimalSymbol, and groupingSymbol.
totalPrice number Price (in currency) for how much this analysis (along with all its subjobs) costs.
priceComputedAt timestamp Time at which totalPrice was computed. For billing purposes, the cost of the analysis accrues to the invoice of the month that contains priceComputedAt (in UTC).
totalEgress mapping Amount of data (in bytes) that this analysis (along with all its subjobs) has egressed.
regionLocalEgress int Amount in bytes of data transfer between IPs in the same cloud region.
internetEgress int Amount in bytes of data transfer to IPs outside of the cloud provider.
interRegionEgress int Amount in bytes of data transfer to IPs in other regions of the cloud provider.
egressComputedAt timestamp Time at which totalEgress was computed. For billing purposes, the cost of the analysis accrues to the invoice of the month that contains egressComputedAt (in UTC).
The following field is only returned if the corresponding field in the fields input is set to true, the requesting user has permissions to view the pricing model of the billTo of the analysis, and the analysis is a root execution:
subtotalPriceInfo mapping Information about the current costs associated with all jobs in the tree rooted at this analysis
subtotalPrice number Current cost (in currency) of the job tree rooted at this analysis
priceComputedAt timestamp Time at which subtotalPrice was computed
subtotalEgressInfo mapping Information about the aggregated egress amount in bytes associated with all jobs in the tree rooted at this analysis
subtotalRegionLocalEgress int Amount in bytes of data transfer between IPs in the same cloud region.
subtotalInternetEgress int Amount in bytes of data transfer to IPs outside of the cloud provider.
subtotalInterRegionEgress int Amount in bytes of data transfer to IPs in other regions of the cloud provider.
egressComputedAt timestamp Time at which subtotalEgress was computed
The following fields are returned if the corresponding field in the fields input is set to true:
runSystemRequirements mapping or null A mapping with the systemRequirements values that were passed explicitly to /globalworkflow-xxxx/run or /workflow-xxxx/run when this analysis was created, or null if the systemRequirements input was not supplied to the API call that created this analysis.
runStageSystemRequirements mapping or null Similar to runSystemRequirements but for stageSystemRequirements.
runSystemRequirementsByExecutable mapping or null Similar to runSystemRequirements but for systemRequirementsByExecutable.
mergedSystemRequirementsByExecutable mapping or null A mapping with the values of systemRequirementsByExecutable supplied to all the ancestors of this analysis and the value supplied to create this analysis, merged as described in the Requesting Instance Types section. If neither the ancestors of this analysis nor this analysis itself were created with the systemRequirementsByExecutable input, a mergedSystemRequirementsByExecutable value of null is returned.
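A heavily trimmed example of a describe response showing a few of the default fields might look like the following (all IDs and values hypothetical):
{
  "id": "analysis-xxxx",
  "class": "analysis",
  "name": "exome_pipeline_run",
  "executable": "workflow-xxxx",
  "project": "project-xxxx",
  "state": "done",
  "launchedBy": "user-alice",
  "output": {"stage_1.variants": {"$dnanexus_link": "file-zzzz"}}
}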
Errors
ResourceNotFound
The specified object does not exist
PermissionDenied
User does not have VIEW access to the analysis's project context
InvalidInput
Input is not a hash
fields (if present) is not a hash or has a non-boolean value for a key (other than stages)
fields has the key stages whose value is neither a boolean nor a hash
API method: /analysis-xxxx/addTags
Adds the specified tags to the specified analysis. If any of the tags are already present, no action is taken for those tags.
Inputs
tags array of strings Tags to be added
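For example (tag values hypothetical):
{"tags": ["qc_passed", "release_2024"]}
The same request shape applies to /analysis-xxxx/removeTags below.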
Outputs
id string ID of the manipulated analysis
Errors
InvalidInput
The input is not a hash
The key tags is missing, or its value is not an array, or the array contains at least one invalid (not a string of nonzero length) tag
ResourceNotFound
The specified analysis does not exist
PermissionDenied
CONTRIBUTE access is required for the analysis's project context. Otherwise, the request can also be made by jobs sharing the same workspace as the parent job of the specified analysis
API method: /analysis-xxxx/removeTags
Removes the specified tags from the specified analysis, ensuring that the specified tags are no longer part of the analysis. If any of the tags are already missing, no action is taken for those tags.
Inputs
tags array of strings Tags to be removed
Outputs
id string ID of the manipulated analysis
Errors
InvalidInput
The input is not a hash
The key tags is missing, or its value is not an array, or the array contains at least one invalid (not a string of nonzero length) tag
ResourceNotFound
The specified analysis does not exist
PermissionDenied
CONTRIBUTE access is required for the analysis's project context. Otherwise, the request can also be made by jobs sharing the same workspace as the parent job of the specified analysis
API method: /analysis-xxxx/setProperties
Sets properties on the specified analysis. To remove a property altogether, its value needs to be set to the JSON null (instead of a string). This call updates the properties of the analysis by merging any previously existing properties with those provided in the input, with the newer ones taking precedence when the same key appears in both.
To reset properties, you need to remove all existing key/value pairs and replace them with new ones. First, issue a describe call to get the names of all properties. Then issue a setProperties request to set the values of those properties to null.
Inputs
properties mapping Properties to modify
key Name of property to modify
value string or null Either a new string value for the property, or null to unset the property
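For example, the following request sets one property and unsets another in a single call (names and values hypothetical):
{"properties": {"run_group": "batch_42", "obsolete_property": null}}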
Outputs
id string ID of the manipulated analysis
Errors
InvalidInput
There exists at least one value in properties which is neither a string nor the JSON null
For each property key-value pair, the size, encoded in UTF-8, of the property key may not exceed 100 bytes and the property value may not exceed 700 bytes
ResourceNotFound
The specified analysis does not exist
PermissionDenied
CONTRIBUTE access is required for the analysis's project context. Otherwise, the request can also be made by jobs sharing the same workspace as the parent job of the specified analysis
API method: /analysis-xxxx/terminate
Terminates an analysis and the stages' origin jobs and/or analyses. This call is only valid from outside the platform.
Analyses can only be terminated by the user who launched the analysis (who must have at least CONTRIBUTE access), or by any user with ADMINISTER access to the project context.
Inputs
None
Outputs
id string ID of the terminated analysis, such as "analysis-xxxx".
Errors
ResourceNotFound
The specified object does not exist
PermissionDenied
ADMINISTER access to the project context of the analysis is required, or else the user must match the launchedBy entry of the analysis object
InvalidState
The analysis is not in a state from which it can be terminated, for example, it is in a terminal state
API method: /analysis-xxxx/update
Updates an analysis and its stages' jobs and/or analyses. This call is only valid from outside the platform. You can only update the rank of root analyses.
A valid rank field must be provided. To update rank, the organization associated with this analysis must have the license feature executionRankEnabled active. The user must also be either the original launcher of the analysis or an administrator of the organization.
When supplying rank, the job or analysis being updated must be a rootExecution, and must be in a state capable of creating more jobs. rank cannot be supplied for terminal states like terminated, done, failed, or debug_hold.
Inputs
rank integer The rank to set the analysis and its child executions to.
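For example, to raise the priority of a root analysis and its child executions (value hypothetical):
{"rank": 100}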
Outputs
id string ID of the updated analysis, such as "analysis-xxxx".
Errors
InvalidInput
Input is not a hash
Expected input to have property rank
Expected key rank of input to be an integer
Expected key rank of input to be in range [-1024, 1023]
Not a root execution
PermissionDenied
billTo does not have license feature executionRankEnabled
Not permitted to change rank
ResourceNotFound
The specified object does not exist