Running Analyses

In this section, learn about the API for creating and running analyses on the DNAnexus Platform.

Jobs

The job is the unit of execution on the DNAnexus Platform. For every job, a worker is spun up in the cloud, then the job's code is downloaded to that worker and executed. The job may make API calls, perform computations, or spawn other jobs.

For more on the lifecycle of executions on the DNAnexus Platform, see the Job and Analysis Lifecycles page.

Types of Executables

Three types of executables can be run in the course of a job: applets, apps, and workflows.

Applets are data objects that reside in projects, and are the fundamental building blocks of all executables on the Platform. Applets contain all the data and metadata required to run a job.

Each app is an applet that's been packaged to facilitate versioning and easy sharing with other users. Like applets, each app produces a job when run. Unlike applets, apps are not data objects, and do not reside in projects.

Workflows are data objects that contain the necessary metadata for creating a pipeline of one or more apps or applets, so these can be run in a specific sequence, as a single analysis. Unlike apps and applets, each workflow produces not a single job, but rather a series of jobs that are run in the course of executing the full pipeline.
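
To make the distinction concrete, here is a minimal sketch using the dxpy Python client library; the IDs, the app name, and the "reads" input field are hypothetical placeholders:

    import dxpy

    # Running an applet or an app creates a single job; running a
    # workflow creates an analysis comprising one job per stage.
    applet_job = dxpy.DXApplet("applet-xxxx").run({"reads": dxpy.dxlink("file-xxxx")})
    app_job = dxpy.DXApp(name="my_app").run({"reads": dxpy.dxlink("file-xxxx")})
    analysis = dxpy.DXWorkflow("workflow-xxxx").run({})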

Components of an Applet

Whenever a job runs on a worker, it is running an applet - either directly, or packaged into an app.

An applet has some or all of the following components:

  • Input specification: If included, input specifications detail the characteristics of named inputs to be provided to the applet. For example, it might be specified that, for an input called "reads," a file be provided.

  • Output specification: If included, output specifications detail the characteristics of outputs generated by the applet. For example, it might be specified that, for an output field called "mappings," the applet will generate a file.

  • Code: This is the code that is actually run on the worker. The code must be bash or Python 3. The code can consist of multiple functions, or entry points. See Code Interpreters for more information on writing entry points.

  • Bundled files: If an applet requires additional files or programs that a developer has compiled - perhaps written in a different language, such as C++ - these can be bundled with the applet and made available when it is run.

  • Additional resource requirements: If the applet requires specific additional resources to run, these can be specified. These might include additional computational power or memory, software packages, additional network access, and special project permissions. For details on how to write these specifications, see the Run Specification and Access Requirements sections of the I/O and Run Specifications page.
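
As a sketch of how these components fit together, an applet's dxapp.json can declare the input specification, output specification, and code interpreter; the applet name and file paths below are hypothetical:

    {
      "name": "my_applet",
      "inputSpec": [{"name": "reads", "class": "file"}],
      "outputSpec": [{"name": "mappings", "class": "file"}],
      "runSpec": {
        "interpreter": "python3",
        "file": "src/code.py",
        "distribution": "Ubuntu",
        "release": "24.04"
      }
    }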

Jobs

When an applet or app is run, a job is created, and the main function, or entry point, of the applet's code is executed on a worker node on the DNAnexus Platform. This code must be bash or Python 3, though it can spawn other Linux processes - by, for example, running executables written in other languages - to perform tasks.

The job runs in the Execution Environment, a fully capable, isolated Linux environment. The DNAnexus Platform API server is always available to the job.

Like data objects, jobs can be tagged with, and searched by, metadata.
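
For instance, a minimal Python 3 applet script declares its entry points with dxpy decorators; the "reads" input and the empty output dict are hypothetical placeholders:

    import dxpy

    @dxpy.entry_point("main")
    def main(reads):
        # ... do the work, then return a dict matching the output specification ...
        return {}

    dxpy.run()  # dispatches to the entry point this job was created to run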

Job Hierarchy

A job can launch other jobs by running an executable directly - for example, via the API calls /applet-xxxx/run or /app-xxxx/run - or by calling another entry point in its own executable, via the API call /job/new. Jobs launched by another job are called child jobs; the job that launched them is called the parent job. The original job created, when the user runs an applet or app, is called an origin job. A job created when a job runs an applet or app is called a master job.

For a list of job-related terms, and their definitions, see this Glossary.

Jobs can depend on each other or on data objects, so that, for example, a job might not start until other jobs are finished or certain data objects are closed. These dependencies can be implicit, via Job-based Object References provided in the input, or explicit, via the dependsOn field in the API call made to create the new job.
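
As a hedged sketch, a running Python job can spawn a child job with dxpy, assuming the same executable defines a "process" entry point that takes a "chunk" input (both hypothetical):

    import dxpy

    # Wraps the /job/new API call: the child job runs the "process" entry
    # point of this same applet or app, sharing its temporary workspace.
    subjob = dxpy.new_dxjob(fn_input={"chunk": 1}, fn_name="process")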

Project Context and Temporary Workspace

An executable is always launched from a particular project. Any child jobs descended from the resulting origin job inherit its project context. The project context is significant in several ways.

Usage Charges

The project is billed for all usage charges resulting from the execution of both the origin job and all its child jobs.

Project Permissions

When launching an executable from within a project, a user must have "CONTRIBUTE" access to that project. This enables the origin job, when outputting data objects, to place them in the project.

By default, an applet has "VIEW" permission to the project in which it resides. An app, meanwhile, has no default project permission setting.

Applets and apps can require, in their access requirements, that they be given specific permissions to the projects of any user launching them.
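
In an applet's dxapp.json, such requirements are declared in the access field; a minimal sketch, with illustrative values only:

    {
      "access": {
        "project": "UPLOAD",
        "allProjects": "VIEW",
        "network": ["*"]
      }
    }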

Temporary Workspaces

Jobs running as part of the same executable - either an origin or master job and all its descendants - always share the same temporary workspace. This workspace is a container for objects that the executable can read from and write to, on the Platform.

Temporary workspaces behave like projects, except they cannot be explicitly created or destroyed, and their permissions are fixed. See Data Containers for more about Platform data containers. See Containers for Execution for specifics on the types of containers involved in app and applet execution.

Note that these temporary workspaces are distinct from the local disk that each job receives on its worker node. Jobs must explicitly upload data to the Platform in order to share it with other jobs, or deliver it as output.
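
For example, inside a running job, the dxpy calls below upload a local file into the job's temporary workspace so it can be shared or delivered as output; "counts.txt" and the "counts" output field are hypothetical names:

    import dxpy

    # Upload to the temporary workspace and wait for the file to close.
    result = dxpy.upload_local_file("counts.txt", wait_on_close=True)
    output = {"counts": dxpy.dxlink(result)}  # returned from the entry point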

Jobs always receive "CONTRIBUTE" permission to their temporary workspace. When provided as inputs to an executable, data objects, and all hidden objects to which they link, are cloned into the workspace before the executable begins running. If any of these objects reside in projects other than the one in which the executable is being run, the user or job launching the executable must have "VIEW" access to those other projects, and those other projects must not have the "RESTRICTED" flag set. Upon completion of the job, output objects are cloned into the project from which the executable was launched, and the workspace is destroyed.

Data Object State and Job Input and Output

A job cannot start until the data objects it uses as input are ready. Since the system must clone these data objects into the job's temporary workspace, the job will not start until the state of these objects is "closed." Likewise, at the conclusion of a run, if an origin or master job is to output any objects, its state will be "waiting_on_output" until all of its output objects have transitioned to the "closed" state.
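
A sketch of making such a dependency explicit with dxpy, using hypothetical IDs; the new job is held until the upstream job finishes and the input file is closed:

    import dxpy

    applet = dxpy.DXApplet("applet-xxxx")
    upstream = dxpy.DXJob("job-xxxx")
    job = applet.run(
        {"reads": dxpy.dxlink("file-xxxx")},
        depends_on=[upstream],  # populates the dependsOn field of the new job
    )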

Example: Inputs from Different Projects

Example: Chained Execution

If an applet, while running, launches another applet, the project context is carried forward, but a new temporary workspace is created for the launched applet. The launched applet has "VIEW" access to the original project, and "CONTRIBUTE" access to its own workspace - but no access to the workspace of the applet that launched it. When the launched applet is done, any objects it outputs are cloned back into the workspace of the applet that launched it.

The figure below illustrates an example where Applet1 produces Object C as output, and then provides Object C as an input, when launching Applet2.

Because Applet1 was launched from Project A, both Applet1's jobs and Applet2's jobs have "VIEW" access - indicated by the black arrows - to Project A. But Applet1's jobs do not have any permissions to the temporary workspace used by Applet2's jobs; nor do Applet2's jobs have any permissions to the temporary workspace used by Applet1's jobs.

Note that if Applet2 were an app rather than an applet, its jobs would have no access to Project A. If an app launches an applet, meanwhile, the app's permissions define the maximum access level at which the applet can be run. Thus the applet, in this scenario, would have no access to the project context.
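
As a sketch of the chained execution above, code inside Applet1's job might produce Object C and pass it to Applet2; the file name, applet ID, and "input_c" field are hypothetical:

    import dxpy

    # Inside Applet1's job: upload Object C to this job's temporary
    # workspace, then launch Applet2 with it as input.
    object_c = dxpy.upload_local_file("object_c.txt", wait_on_close=True)
    job2 = dxpy.DXApplet("applet-bbbb").run({"input_c": dxpy.dxlink(object_c)})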

