Monitoring Executions

Learn how to get information on current and past executions via both the UI and the CLI.

Monitoring an Execution via the UI

Getting Basic Information on an Execution

To get basic information on a job (the execution of an app or applet) or an analysis (the execution of a workflow):

Click on Projects in the main Platform menu.
On the Projects list page, find and click on the name of the project within which the execution was launched.
Click on the Monitor tab to open the Monitor screen.
The Monitor screen shows a list of executions launched within the project. By default, executions appear in reverse chronological order, with the most recently launched execution at the top.
Find the row displaying information on the execution.
- For an analysis (the execution of a workflow), click the "+" icon to the left of the analysis name to expand the row and view information on its stages. For executions with further descendants, click the "+" icon next to the name to expand the row and show additional details.
To see additional information on an execution, click on its name to be taken to its details page.
- The following shortcuts allow you to view information from the details page directly on the list page, or relaunch an execution:
  - To view the Info pane, do either of the following:
    Click the Info icon, above the right edge of the executions list, if it's not already selected, and then select the execution by clicking on the row.
    Hover over the row and click on the "More Actions" button that looks like three vertical dots at the end of the row, then select View Info in the fly-out menu.
  - To view the log file for a job, do either of the following:
    Select the execution by clicking on the row. When a View Log button appears in the header, click it.
    Hover over the row and click on the "More Actions" button that looks like three vertical dots at the end of the row, then select View Log in the fly-out menu.
  - To re-launch a job, do either of the following:
    Select the execution by clicking on the row. When a Launch as New Job button appears in the header, click it.
    Hover over the row and click on the "More Actions" button that looks like three vertical dots at the end of the row, then select Launch as New Job in the menu.
  - To re-launch an analysis, do either of the following:
    Select the execution by clicking on the row. When a Launch as New Analysis button appears in the header, click it.
    Hover over the row and click on the "More Actions" button that looks like three vertical dots at the end of the row, then select Launch as New Analysis in the menu.

Available Basic Information on Executions

The list on the Monitor screen displays the following information for each execution that is running or has been run within the project:

Name - The default name for an execution is the name of the app, applet, or workflow being run. When configuring an execution, you can give it a custom name, either via the UI, or via the CLI. The execution's name is used in Platform email alerts related to the execution. Clicking on a name in the executions list opens the execution details page, giving in-depth information on the execution.
State - This is the execution's state. State values include:
- "Waiting" - The execution awaits Platform resource allocation or completion of dependent executions.
- "Running" - The job is actively executing.
- "In Progress" - The analysis is actively processing.
- "Done" - The execution completed successfully without errors.
- "Failed" - The execution encountered an error and could not complete. See Types of Errors for troubleshooting assistance.
- "Partially Failed" - An analysis reaches "Partially Failed" state if one or more workflow stages did not finish successfully, with at least one stage not in a terminal state (either "Done," "Failed," or "Terminated").
- "Terminating" - The worker has initiated but not completed the termination process.
- "Terminated" - The execution stopped before completion.
- "Debug Hold" - The execution, run with debugging options, encountered an applicable failure and entered debugging hold.
Executable - The executable or executables run during the execution. If the execution is an analysis, each stage appears in a separate row, including the name of the executable run during the stage. If an informational page exists with details about the executable's configuration and use, the executable name becomes clickable, and clicking displays that page.
Tags - Tags are strings associated with objects on the platform. They are a type of metadata that can be added to an execution.
Launched By - The name of the user who launched the execution.
Launched On - The time at which the execution was launched. This time often precedes the time in the Started Running column due to executions waiting for available resources before starting.
Started Running - The time at which the execution started running, if it has done so. This can be later than its launch time, because an execution may wait for available resources before starting.
Duration - For jobs, this figure represents the time elapsed since the job entered the running state. For analyses, it represents the time elapsed since the analysis was created.
Cost - A value is displayed in this column when the user has access to billing info for the execution. For a running execution, the figure is an estimate of the charges incurred so far. For a completed execution, it is the total cost incurred.
Priority - The priority assigned to the execution - either "low," "normal," or "high" - when it was configured, either via the CLI or via the UI. This setting determines the scheduling priority of the execution relative to other executions that are waiting to be launched.
Worker URL - If the execution runs an executable, such as DXJupyterLab, with direct web URL connection capability, the URL appears here. Clicking the URL opens a connection to the executable in a new browser tab.
Output Folder - For each execution, the value shows a path relative to the project's root folder. Click the value to open the folder containing the execution's outputs.

Additional Basic Information

Additional basic information can be displayed for each execution. To do this:

Click on the "table" icon at the right edge of the table header row.
Select one or more of the entries in the list, to display an additional column or columns.

Available additional columns include:

Stopped Running - The time at which the execution stopped running.
Custom properties columns - If custom properties have been assigned to any of the listed executions, you can add a column for each property, showing the value assigned to each execution.

Customizing the Executions List Display

To remove columns from the list, click on the "table" icon at the right edge of the table header row, then de-select one or more of the entries in the list, to hide the column or columns.

Filtering the Executions List

A filter menu above the executions list allows you to run a search that refines the list to display only executions meeting specific criteria.

By default, pills are available to set search criteria for filtering executions by one or more of these attributes:

Name - Execution name
State - Execution state
ID - An execution's job ID or analysis ID
Executable - A specific executable
Launched By - The user who launched an execution or executions
Launch Time - The time range within which executions were launched

Click the List icon, above the right edge of the executions list, to display pills that allow filtering by additional execution attributes.

Search Scope

By default, filters are set to display only root executions that meet the criteria defined in the filter. To include all executions, including those run during individual stages of workflows, click the button above the left edge of the executions list showing the default value "Root Executions Only," then click "All Executions."

Saving and Reusing Filters

To save a particular filter, click the Bookmark icon, above the right edge of the executions list, assign your filter a name, then click Save.

To apply a saved filter to the executions list, click the Bookmark icon, then select the filter from the list.

Terminating an Execution from the Monitor Screen

If you launched an execution or have contributor access to the project in which the execution is running, you can terminate the execution from the list on the Monitor screen when it is in a non-terminal state. You can also terminate executions launched by other project members if you have project admin status.

To terminate an execution:

Find the execution in the list, then do either of the following:
- Select the execution by clicking on the row. Click the red Terminate button that appears at the end of the header.
- Hover over the row and click on the "More Actions" button that looks like three vertical dots at the end of the row to select Terminate in the menu.
A modal window opens, asking you to confirm the termination. Click Terminate to confirm.
The execution's state changes to "Terminating" during termination, then to "Terminated" once complete.

Getting Detailed Information on an Execution via the UI

For additional information about an execution, click its name in the list on the Monitor screen to open its details page.

Available Detailed Information on Executions

The details page for an execution displays a range of information, including:

High-level details - The high-level information in this section includes:
- For a standalone execution - such as a job without children - the display shows a single entry with details about the execution state, start and stop times, and duration in the running state.
- For an execution with descendants - such as an analysis with multiple stages - the display shows a list with each row containing details about stage executions. For executions with descendants, click the "+" icon next to the name to expand the row and view descendant information. A page displaying detailed information on a stage appears when clicking on its name in the list. To navigate back to the workflow's details page, click its name in the "breadcrumb" navigation menu in the top right corner of the screen.
Execution state - In the Execution Tree section, each execution row includes a color bar that represents the execution's current state. For descendants within the same execution tree, the time visualizations are staggered, indicating their different start and stop times compared to each other. The colors include:
- Blue - A blue bar indicates that the execution is in the "Running" or "In Progress" state.
- Green - A green bar indicates that the execution is in the "Done" state.
- Red - A red bar indicates that the execution is in the "Failed" or "Partially Failed" state.
- Grey - A grey bar indicates that the execution is in the "Terminated" state.
Execution start and stop times - Times are displayed in the header bar at the top of the Execution Tree section. These times run, from left to right, from the time at which the job started running, or when the analysis was created, to either the current time, or the time at which the execution entered a terminal state ("Done," "Failed," or "Terminated").
Inputs - This section lists the execution inputs. Available input files appear as hyperlinks to their project locations. For inputs from other workflow executions, the source execution name appears as a hyperlink to its details page.
Outputs - This section lists the execution's outputs. Available output files appear as hyperlinks. Click a link to open the folder containing the output file.
Log files - An execution's log file is useful in understanding details about, for example, the resources used by an execution, the costs it incurred, and the source of any delays it encountered. To access log files, and, as needed, download them in .txt format:
- To access the log file for a job, click either the View Log button in the top right corner of the screen, or the View Log link in the Execution Tree section.
- To access the log file for each stage in an analysis, click the View Log link next to the row displaying information on the stage, in the Execution Tree section.
Basic info - The Info pane, on the right side of the screen, displays a range of basic information on the execution, along with additional detail such as the execution's unique ID, and custom properties and tags assigned to it.
Reused results - For executions reusing results from another execution, the information appears in a blue pane above the Execution Tree section. Click the source execution's name to see details about the execution that generated these results.

Getting Help with Failed Executions

For failed executions, a Cause of Failure pane appears above the Execution Tree section. The cause of failure is a system-generated error message. For assistance in diagnosing the failure and any related issues:

Click the button labeled Send Failure Report to DNAnexus Support.
A form opens in a modal window, with pre-populated Subject and Message fields containing diagnostic information for DNAnexus Support.
Click the button in the Grant Access section to grant DNAnexus Support "View" access to the project, enabling faster issue diagnosis and resolution.
Click Send Report to send the report.

Launching a New Execution

To re-launch a job from the execution details screen:

Click the Launch as New Job button in the upper right corner of the screen.
A new browser tab opens, displaying the Run App / Applet form.
Configure the run, then click Start Analysis.

To re-launch an analysis from the execution details screen:

Click the Launch as New Analysis button in the upper right corner of the screen.
A new browser tab opens, displaying the Run Analysis form.
Configure the run, then click Start Analysis.

Saving a Workflow as a New Workflow

To save a copy of a workflow along with its input configurations under a new name from the execution details screen:

Click the Save as New Workflow button in the upper right corner of the screen.
In the Save as New Workflow modal window, give the workflow a name, and select the project in which you'd like to save it.
Click Save.

Viewing Initial Tries for Restarted Jobs

As described in job states, jobs can be configured to restart automatically on certain types of failures.

If you want to view the execution details for the initial tries for a restarted job:

Click on the "Tries" link below the job name in the summary banner, or the "Tries" link next to the job name in the execution tree.
A modal window opens.
Click the name of the try for which you'd like to view execution details.

You can only send a failure report for the most recent try, not for any previous tries.

Monitoring a Job via the CLI

You can use dx watch to view the log of a running job or any past jobs, which may have finished successfully, failed, or been terminated.

Monitoring a Running Job

Use dx watch to view a job's log stream during execution. The log stream includes stdout, stderr, and additional worker output information.

$ dx watch job-xxxx
Watching job job-xxxx. Press Ctrl+C to stop.
* Sample Prints (sample_prints:main) (running) job-xxxx
  amy 2024-01-01 09:00:00 (running for 0:00:37)
2024-01-01 09:06:00 Sample Prints INFO Logging initialized (priority)
2024-01-01 09:06:37 Sample Prints INFO CPU: 4% (4 cores) * Memory: 547/7479MB * Storage: 74GB free * Net: 0↓/0↑MBps
2024-01-01 09:06:37 Sample Prints INFO Setting SSH public key
2024-01-01 09:06:37 Sample Prints STDOUT dxpy/0.365.0 (Linux-5.15.0-1050-aws-x86_64-with-glibc2.29) Python/3.8.10
2024-01-01 09:06:37 Sample Prints STDOUT Invoking main with {}
2024-01-01 09:06:37 Sample Prints STDOUT 0
...

Terminating a Job

To terminate a job before completion, use the command dx terminate.
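
As a minimal sketch, the command can be wrapped so that it only runs against jobs that have not already finished. The helper names is_terminal_state and terminate_if_active are hypothetical, job-xxxx is a placeholder, and the lowercase state names are assumed to match those that dx prints in its output.

```shell
# Hypothetical helper: succeeds if the state is a terminal one.
# Assumes the lowercase state names that dx prints (done, failed, terminated).
# "done" is quoted because it is also a shell reserved word.
is_terminal_state() {
  case "$1" in
    failed|terminated|"done") return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical helper: terminate a job unless it has already finished.
# Usage: terminate_if_active <job-id> <current-state>
terminate_if_active() {
  if is_terminal_state "$2"; then
    echo "$1 is already $2; nothing to terminate"
  else
    dx terminate "$1"
  fi
}
```

For example, terminate_if_active job-xxxx running calls dx terminate job-xxxx, while terminate_if_active job-xxxx done only prints a message. The current state can be read from the State line of the dx describe output for the job.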

Monitoring Past Jobs

Use the dx watch command to view completed jobs. The log stream includes stdout, stderr, and additional worker output information from the execution.

$ dx watch job-xxxx
Watching job job-xxxx. Press Ctrl+C to stop.
* Sample Prints (sample_prints:main) (running) job-xxxx
  amy 2024-01-01 09:00:00 (running for 0:00:37)
2024-01-01 09:06:00 Sample Prints INFO Logging initialized (priority)
2024-01-01 09:06:37 Sample Prints INFO CPU: 4% (4 cores) * Memory: 547/7479MB * Storage: 74GB free * Net: 0↓/0↑MBps
2024-01-01 09:06:37 Sample Prints INFO Setting SSH public key
2024-01-01 09:06:37 Sample Prints STDOUT dxpy/0.365.0 (Linux-5.15.0-1050-aws-x86_64-with-glibc2.29) Python/3.8.10
2024-01-01 09:06:37 Sample Prints STDOUT Invoking main with {}
2024-01-01 09:06:37 Sample Prints STDOUT 0
2024-01-01 09:06:37 Sample Prints STDOUT 1
2024-01-01 09:06:37 Sample Prints STDOUT 2
2024-01-01 09:06:37 Sample Prints STDOUT 3
* Sample Prints (sample_prints:main) (done) job-xxxx
  amy 2024-01-01 09:08:11 (runtime 0:02:11)
  Output: -

Finding Executions via the CLI

Use dx find executions to display the ten most recent executions in your current project. Specify a different number of executions by using dx find executions -n <specified number>. The output matches the information shown in the "Monitor" tab on the DNAnexus web UI.

Below is an example of dx find executions. In this case, only two executions have been run in the current project. An individual job, DeepVariant Germline Variant Caller, and a workflow consisting of two stages, Variant Calling Workflow, are shown. A stage is represented by either another analysis (if running a workflow) or a job (if running an app(let)).

The job running the DeepVariant Germline Variant Caller executable is running and has been running for 10 minutes and 28 seconds. The analysis running the Variant Calling Workflow consists of 2 stages, FreeBayes Variant Caller, which is waiting on input, and BWA-MEM FASTQ Read Mapper, which has been running for 10 minutes and 18 seconds.

$ dx find executions
* DeepVariant Germline Variant Caller (deepvariant_germline:main) (running) job-xxxx
  amy 2024-01-01 09:00:18 (running for 0:10:28)
* Variant Calling Workflow (in_progress) analysis-xxxx
│ amy 2024-01-01 09:00:18
├── * FreeBayes Variant Caller (freebayes:main) (waiting_on_input) job-yyyy
│     amy 2024-01-01 09:00:18
└── * BWA-MEM FASTQ Read Mapper (bwa_mem_fastq_read_mapper:main) (running) job-zzzz
      amy 2024-01-01 09:00:18 (running for 0:10:18)

Using `dx find executions`

The dx find executions operation searches for jobs and analyses, which are created when a user runs an app, applet, or workflow. For jobs that are part of an analysis, the results appear in a tree representation linking related jobs together.

By default, dx find executions displays up to ten of the most recent executions in your current project, ordered by creation time.

Filter executions by job type using command flags: --origin-jobs shows only original jobs, while --all-jobs includes both original jobs and subjobs.

Finding Analyses via the CLI

You can monitor analyses by using the command dx find analyses, which displays the top-level analyses, excluding contained jobs. Analyses are executions of workflows and consist of one or more app(let)s being run.

Below is an example of dx find analyses:

$ dx find analyses
* Variant Calling Workflow (in_progress) analysis-xxxx
  amy 2024-01-01 09:00:18

Finding Jobs via the CLI

Jobs are runs of an individual app(let) and compose analyses. Monitor jobs using the command dx find jobs to display a flat list of jobs. For jobs within an analysis, the command returns all jobs in that analysis.

Below is an example of dx find jobs:

$ dx find jobs
* DeepVariant Germline Variant Caller (deepvariant_germline:main) (running) job-xxxx
  amy 2024-01-01 09:10:00 (running for 0:00:28)
* FreeBayes Variant Caller (freebayes:main) (waiting_on_input) job-yyyy
  amy 2024-01-01 09:00:18
* BWA-MEM FASTQ Read Mapper (bwa_mem_fastq_read_mapper:main) (running) job-zzzz
  amy 2024-01-01 09:00:18 (running for 0:10:18)

Advanced CLI Monitoring Options

Searches for executions can be restricted to specific parameters.

Viewing `stdout` and/or `stderr` from a Job Log

To extract stdout only from this job, run the command dx watch job-xxxx --get-stdout.
To extract stderr only from this job, run the command dx watch job-xxxx --get-stderr.
To extract both stdout and stderr from this job, run the command dx watch job-xxxx --get-streams.

Below is an example of viewing stdout lines of a job log:

$ dx watch job-xxxx --get-stdout
Watching job job-xxxx. Press Ctrl+C to stop.
dxpy/0.365.0 (Linux-5.15.0-1050-aws-x86_64-with-glibc2.29) Python/3.8.10
Invoking main with {}
0
1
2
3
4
5
6
7
8
9
10
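
The stream flags combine naturally with shell redirection. The sketch below writes the stdout and stderr of a job to separate files. The helper name save_job_streams and the file naming scheme are hypothetical. Depending on where dx writes the "Watching job" banner, that line may also end up in the files and need filtering.

```shell
# Hypothetical helper: write a job's stdout and stderr to separate files.
# Usage: save_job_streams <job-id> [output-dir]
# The function body runs in a subshell so its variables do not leak.
save_job_streams() (
  job_id=$1
  out_dir=${2:-.}
  mkdir -p "$out_dir" || exit 1
  dx watch "$job_id" --get-stdout > "$out_dir/$job_id.stdout.log" || exit 1
  dx watch "$job_id" --get-stderr > "$out_dir/$job_id.stderr.log"
)
```

Calling save_job_streams job-xxxx logs would create logs/job-xxxx.stdout.log and logs/job-xxxx.stderr.log.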

Viewing Subjobs

To view the entire job tree, including both main jobs and subjobs, use the command dx watch job-xxxx --tree.

Viewing the First n Messages of a Job Log

To view only the first n messages of a job log, pass the -n option, as in dx watch job-xxxx -n 8. If the job already ran, the output is displayed as well.

In the example below, the app Sample Prints doesn't have any output.

$ dx watch job-xxxx -n 8
Watching job job-xxxx. Press Ctrl+C to stop.
* Sample Prints (sample_prints:main) (done) job-xxxx
  amy 2024-01-01 09:00:00 (runtime 0:02:11)
2024-01-01 09:06:00 Sample Prints INFO Logging initialized (priority)
2024-01-01 09:08:11 Sample Prints INFO CPU: 4% (4 cores) * Memory: 547/7479MB * Storage: 74GB free * Net: 0↓/0↑MBps
2024-01-01 09:08:11 Sample Prints INFO Setting SSH public key
2024-01-01 09:08:11 Sample Prints STDOUT dxpy/0.365.0 (Linux-5.15.0-1050-aws-x86_64-with-glibc2.29) Python/3.8.10
* Sample Prints (sample_prints:main) (done) job-xxxx
  amy 2024-01-01 09:00:00 (runtime 0:02:11)
  Output: -

Finding and Examining Initial Tries for Restarted Jobs

Jobs can be configured to restart automatically on certain types of failures as described in the Restartable Jobs section. To view initial tries of the restarted jobs along with execution subtrees rooted in those initial tries, use dx find executions --include-restarted. To examine job logs for initial tries, use dx watch job-xxxx --try X. An example of these commands is shown below.

$ dx run swiss-army-knife -icmd="exit 1" \
    --extra-args '{"executionPolicy": { "restartOn":{"*":2}}}'

$ dx find executions --include-restarted
* Swiss Army Knife (swiss-army-knife:main) (failed) job-xxxx tries
├── * Swiss Army Knife (swiss-army-knife:main) (failed) job-xxxx try 2
│     amy 2023-08-02 16:33:40 (runtime 0:01:45)
├── * Swiss Army Knife (swiss-army-knife:main) (restarted) job-xxxx try 1
│     amy 2023-08-02 16:33:40
└── * Swiss Army Knife (swiss-army-knife:main) (restarted) job-xxxx try 0
      amy 2023-08-02 16:33:40

$ dx watch job-xxxx --try 0
Watching job job-xxxx try 0. Press Ctrl+C to stop watching.
* Swiss Army Knife (swiss-army-knife:main) (restarted) job-xxxx try 0
  amy 2023-08-02 16:33:40
2023-08-02 16:35:26 Swiss Army Knife INFO Logging initialized (priority)
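
To review every try of a restarted job in one pass, the --try option can be driven from a loop. This is a sketch only: the helper name watch_all_tries is hypothetical, and it assumes tries are numbered from 0 as in the listing above, with the total number of tries supplied by the caller.

```shell
# Hypothetical helper: print the log of each try of a restarted job.
# Usage: watch_all_tries <job-id> <number-of-tries>
watch_all_tries() (
  job_id=$1
  total=$2
  try=0
  while [ "$try" -lt "$total" ]; do
    echo "=== $job_id try $try ==="
    # stdin is redirected so dx cannot consume input meant for the caller
    dx watch "$job_id" --try "$try" < /dev/null
    try=$((try + 1))
  done
)
```

For the job shown above, which has tries 0 through 2, the call would be watch_all_tries job-xxxx 3.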

Searching Across All Projects

By default, dx find restricts searches to your current project context. Use the --all-projects flag to search across all accessible projects.

$ dx find executions -n 3 --all-projects
* Sample Prints (sample_prints:main) (done) job-xxxx
  amy 2024-01-01 09:15:00 (runtime 0:02:11)
* Sample Applet (sample_applet:main) (done) job-yyyy
  ben 2024-01-01 09:10:00 (runtime 0:00:28)
* Sample Applet (sample_applet:main) (failed) job-zzzz
  amy 2024-01-01 09:00:00 (runtime 0:19:02)

Returning More Than Ten Results

By default, dx find returns up to ten of the most recently launched executions matching your search query. Use the -n option to change the number of executions returned.

# Find the 100 most recently launched executions in your project
$ dx find executions -n 100

Searching by Executable

You can restrict a search to executions of a specific app(let) or workflow by specifying its entity ID.

# Find most recent executions running app-deepvariant_germline in the current project
$ dx find executions --executable app-deepvariant_germline
* DeepVariant Germline Variant Caller (deepvariant_germline:main) (running) job-xxxx
  amy 2024-01-01 09:00:18 (running for 0:10:18)

Searching by Execution Start Time

Use the --created-after and --created-before options to search based on when the execution was created.

Searching by Date

# Find executions created on January 1 or January 2, 2024
$ dx find executions --created-after=2024-01-01 --created-before=2024-01-03

Searching by Time

# Find executions created in the last 2 hours
$ dx find executions --created-after=-2h

# Find analyses created in the last 5 days
$ dx find analyses --created-after=-5d

Searching by Execution State

You can also restrict the search to executions in a specific state, such as "done", "failed", or "terminated".

# Find failed jobs in the current project
$ dx find jobs --state failed

Scripting

Delimiters

The --delim flag produces tab-delimited output, suitable for processing by other shell commands.

$ dx find jobs --delim
* Cloud Workstation (cloud_workstation:main)    done    job-xxxx    amy    2024-01-07 09:00:00 (runtime 1:00:00)
* GATK3 Human Exome Pipeline (gatk3_human_exome_pipeline:main)    done    job-yyyy    amy    2024-01-07 09:00:00 (runtime 0:21:16)
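
Delimited output can be filtered with standard tools such as awk. The sketch below assumes the columns are separated by tabs and appear in the order suggested by the sample above (name, state, ID, user, time). That column order is an assumption and should be checked against real output. The helper name ids_in_state is hypothetical.

```shell
# Hypothetical helper: read `dx find jobs --delim` output on stdin and print
# the IDs of jobs in the requested state.
# Assumes tab-separated columns ordered as: name, state, ID, user, time.
ids_in_state() {
  awk -F '\t' -v want="$1" '$2 == want { print $3 }'
}
```

A typical call would be dx find jobs --delim | ids_in_state failed.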

Returning Only IDs

Use the --brief flag to display only the object IDs for objects returned by your search query. The --origin-jobs flag excludes subjob information.

Below is an example usage of the --brief flag:

$ dx find jobs -n 3 --brief
job-xxxx
job-yyyy
job-zzzz

Below is an example of using the flags --origin-jobs and --brief. In the example below, the last job run in the current default project is described.

$ dx describe $(dx find jobs -n 1 --origin-jobs --brief)
Result 1:
ID                  job-xxxx
Class               job
Job name            BWA-MEM FASTQ Read Mapper
Executable name     bwa_mem_fastq_read_mapper
Project context     project-xxxx
Billed to           amy
Workspace           container-xxxx
Cache workspace     container-yyyy
Resources           container-zzzz
App                 app-xxxx
Instance Type       mem1_ssd1_x8
Priority            high
State               done
Root execution      job-zzzz
Origin job          job-zzzz
Parent job          -
Function            main
Input               genomeindex_targz = file-xxxx
                reads_fastqgz = file-xxxx
                [read_group_library = "1"]
                [mark_as_secondary = true]
                [read_group_platform = "ILLUMINA"]
                [read_group_sample = "1"]
                [add_read_group = true]
                [read_group_id = {"$dnanexus_link": {"input": "reads_fastqgz", "metadata": "name"}}]
                [read_group_platform_unit = "None"]
Output              -
Output folder       /
Launched by         amy
Created             Sun Jan  1 09:00:17 2024
Started running     Sun Jan  1 09:00:10 2024
Stopped running     Sun Jan  1 09:00:27 2024 (Runtime: 0:00:16)
Last modified       Sun Jan  1 09:00:28 2024
Depends on          -
Sys Requirements    {"main": {"instanceType": "mem1_ssd1_x8"}}
Tags                -
Properties          -
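
Because --brief prints one ID per line, its output is easy to count or loop over. As an illustrative sketch, the hypothetical helper count_jobs_in_state below reports how many of the most recent jobs are in a given state. The default of 10 mirrors the default result limit of dx find.

```shell
# Hypothetical helper: count jobs in a given state among the most recent N.
# Usage: count_jobs_in_state <state> [N]
count_jobs_in_state() {
  dx find jobs --state "$1" -n "${2:-10}" --brief |
    awk '/^job-/ { n++ } END { print n + 0 }'
}
```

For example, count_jobs_in_state failed 100 prints the number of failed jobs among the 100 most recent jobs in the current project.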

Rerunning Time-Specific Failed Jobs With Updated Instance Types

# Find failed jobs in the current project from a time period
$ dx find jobs --state failed --created-after=2024-01-01 --created-before=2024-02-01
* BWA-MEM FASTQ Read Mapper (bwa_mem_fastq_read_mapper:main) (failed) job-xxxx
  amy 2024-01-22 09:00:00 (runtime 0:02:12)
* BWA-MEM FASTQ Read Mapper (bwa_mem_fastq_read_mapper:main) (failed) job-yyyy
  amy 2024-01-07 06:00:00 (runtime 0:11:22)
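
Once the failed jobs are identified, each one can be relaunched with a different instance type. The following is a sketch under stated assumptions rather than a definitive procedure: the helper name rerun_failed_jobs is hypothetical, and the instance type passed in is whatever value suits your workload. It relies on dx run --clone to reuse the settings of the failed job and on --instance-type to override the instance type.

```shell
# Hypothetical helper: rerun every failed origin job created in a time window,
# overriding the instance type.
# Usage: rerun_failed_jobs <executable> <instance-type> <after> <before>
rerun_failed_jobs() (
  executable=$1
  instance_type=$2
  dx find jobs --state failed --origin-jobs --brief \
      --created-after="$3" --created-before="$4" |
  while read -r job_id; do
    [ -n "$job_id" ] || continue
    # --clone reuses the failed job's inputs; -y skips the confirmation prompt.
    # stdin is redirected so dx run does not consume the list of job IDs.
    dx run "$executable" --clone "$job_id" \
        --instance-type "$instance_type" -y --brief < /dev/null
  done
)
```

An example call, with mem1_ssd1_x16 used purely as an illustrative instance type, is rerun_failed_jobs bwa_mem_fastq_read_mapper mem1_ssd1_x16 2024-01-01 2024-02-01.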

Rerunning Failed Executions With an Updated Executable

# Find all failed executions of specified executable
$ dx find executions --state failed --executable app-bwa_mem_fastq_read_mapper
* BWA-MEM FASTQ Read Mapper (bwa_mem_fastq_read_mapper:main) (failed) job-xxxx
  amy 2024-01-01 09:00:00 (runtime 0:02:12)

# From within the app directory, build the updated app
$ dx build -a
INFO:dxpy:Archived app app-xxxx to project-xxxx:"/.App_archive/bwa_mem_fastq_read_mapper (Sun Jan  1 09:00:00 2024)"
{"id": "app-yyyy"}

# Rerun job with updated app
$ dx run bwa_mem_fastq_read_mapper --clone job-xxxx

dx find jobs --tag TAG

See more on using dx find jobs.

Forwarding Job Logs to Splunk for Analysis

A license is required to use this feature. Contact DNAnexus Sales for more information.

Job logs can be automatically forwarded to a customer's Splunk instance for analysis.
