Running Analyses

In this section, learn about the API for creating and running analyses on the DNAnexus Platform.

Jobs and their Role

The job is the unit of execution on the DNAnexus Platform. For every job, a worker is spun up in the cloud, then the job's code is downloaded to that worker and executed. The job may make API calls, perform computations, or spawn other jobs.
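
For instance, using the dxpy Python bindings, running an applet creates a job that can be tracked and described through the API. A minimal sketch, with placeholder IDs:

```python
import dxpy

# Running an executable creates a job; a worker is provisioned for it in the cloud.
applet = dxpy.DXApplet("applet-xxxx")
job = applet.run(
    applet_input={"reads": dxpy.dxlink("file-xxxx")},  # named input from the applet's input spec
    project="project-xxxx",                            # the project context
)

job.wait_on_done()                 # block until the job finishes
print(job.describe()["state"])     # "done"
```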

For more on the lifecycle of executions on the DNAnexus Platform, see the Job and Analysis Lifecycles page.

Types of Executables

Three types of executables can be run on the DNAnexus Platform: applets, apps, and workflows.

Applets

Applets are data objects that reside in projects, and are the fundamental building block of all executables on the Platform. Applets contain all the data and metadata required to run a job.

Apps

Each app is an applet that's been packaged to facilitate versioning and easy sharing with other users. Like applets, each app produces a job when run. Unlike applets, apps are not data objects, and do not reside in projects.

Workflows

Workflows are data objects that contain the necessary metadata for creating a pipeline of one or more apps or applets, so these can be run in a specific sequence, as a single analysis. Unlike apps and applets, each workflow produces not a single job, but rather a series of jobs that are run in the course of executing the full pipeline.
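
The distinction is visible in the dxpy Python bindings: running an applet or app returns a single job, while running a workflow returns an analysis that wraps one job per stage. A minimal sketch, with placeholder IDs and input names:

```python
import dxpy

# Running an applet (or an app) produces a single job.
job = dxpy.DXApplet("applet-xxxx").run({"reads": dxpy.dxlink("file-xxxx")})

# Running a workflow produces an analysis: a series of jobs, one per stage.
# Workflow inputs are keyed by stage, e.g. "<stage ID>.<input name>".
analysis = dxpy.DXWorkflow("workflow-xxxx").run(
    {"stage-xxxx.reads": dxpy.dxlink("file-xxxx")}
)
print(analysis.describe()["stages"])   # per-stage execution information
```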

Components of an Applet

Whenever a job runs on a worker, it is running an applet, either directly or packaged as an app.

An applet has some or all of the following components; a sketch of how they fit into an applet specification follows the list:

  • Input specification: If included, input specifications detail the characteristics of named inputs to be provided to the applet. For example, it might be specified that, for an input called "reads," a file be provided.

  • Output specification: If included, output specifications detail the characteristics of outputs generated by the applet. For example, it might be specified that, for an output field called "mappings," the applet will generate a file.

  • Code: This is the code that is actually run on the worker, and it must be written in bash or Python 3. The code can consist of multiple functions, or entry points. See Code Interpreters for more information on writing entry points.

  • Bundled files: If an applet requires additional files or programs that a developer has compiled - perhaps written in a different language, such as C++ - these can be bundled with the applet and made available when it is run.

  • Additional resource requirements: If the applet requires specific additional resources to run, these can be specified. These might include additional computational power or memory, software packages, additional network access, and special project permissions. For details on how to write these specifications, see the Run Specification and Access Requirements sections of the I/O and Run Specifications page.
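
For illustration, here is how these components might be assembled into the specification passed to the /applet/new API call, using the dxpy bindings. This is a minimal sketch: the names, instance type, and Ubuntu release are hypothetical, and real applets usually carry fuller specifications.

```python
import dxpy

# A sketch of an applet specification; each field corresponds to one of the
# components listed above.
applet_id = dxpy.api.applet_new({
    "project": "project-xxxx",          # applets are data objects in a project
    "name": "example_applet",
    "dxapi": "1.0.0",
    "inputSpec": [{"name": "reads", "class": "file"}],      # input specification
    "outputSpec": [{"name": "mappings", "class": "file"}],  # output specification
    "runSpec": {                        # the code actually run on the worker
        "interpreter": "bash",
        "distribution": "Ubuntu",
        "release": "24.04",
        "code": "main() { echo \"placeholder\"; }",
        "bundledDepends": [],           # bundled files would be listed here
        "systemRequirements": {"*": {"instanceType": "mem1_ssd1_v2_x2"}},
    },
    "access": {"network": []},          # additional permissions, if any
})["id"]
print(applet_id)                        # e.g. "applet-xxxx"
```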

Jobs

When an applet or app is run, a job is created, and the main function, or entry point, of the applet's code is executed on a worker node on the DNAnexus Platform. This code must be bash or Python 3, though it can spawn other Linux processes - by, for example, running executables written in other languages - to perform tasks.

The job runs in the Execution Environment, a fully capable, isolated Linux environment. The DNAnexus Platform API server is always available to the job.

Like data objects, jobs can be tagged with, and searched by, metadata.
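
As a sketch, here is how a Python applet's code defines and runs its main entry point with the dxpy bindings (the input and output field names are illustrative):

```python
import dxpy

@dxpy.entry_point("main")
def main(reads):
    # "main" is executed on the worker when the applet or app is run;
    # "reads" arrives as a link to a file object, per the input specification.
    dxpy.download_dxfile(dxpy.DXFile(reads).get_id(), "reads.txt")
    with open("reads.txt") as fh:
        count = sum(1 for _ in fh)
    return {"line_count": count}

dxpy.run()  # dispatches to whichever entry point this job was created to run
```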

Job Hierarchy

A job can launch other jobs by running an executable directly - for example, via the API calls /applet-xxxx/run or /app-xxxx/run - or by calling another entry point in its own executable, via the API call /job/new. Jobs launched by another job are called child jobs; the job that launched them is called the parent job. The original job created when the user runs an applet or app is called an origin job. A job created when a running job launches an applet or app is called a master job.

For a list of job-related terms, and their definitions, see this Glossary.

Jobs can depend on each other or on data objects, so that, for example, a job might not start until other jobs are finished or certain data objects are closed. These dependencies can be implicit, via Job-based Object References provided in the input, or explicit, via the dependsOn field in the API call made to create the new job.
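
Both mechanisms can be sketched with the dxpy bindings; the entry-point and field names here are illustrative:

```python
import dxpy

@dxpy.entry_point("chunk")
def chunk(index):
    # Runs as a child job on its own worker.
    return {"part": index * 10}

@dxpy.entry_point("main")
def main():
    # Child job via /job/new, targeting another entry point of this executable.
    child = dxpy.new_dxjob(fn_input={"index": 1}, fn_name="chunk")

    # Implicit dependency: a Job-based Object Reference (JBOR) to the child's
    # output, so "consumer" cannot start until "child" has finished.
    consumer = dxpy.new_dxjob(
        fn_input={"index": child.get_output_ref("part")}, fn_name="chunk"
    )

    # Explicit dependency via depends_on (the API's dependsOn field).
    final = dxpy.new_dxjob(
        fn_input={"index": 2}, fn_name="chunk", depends_on=[child, consumer]
    )
    return {"total": final.get_output_ref("part")}

dxpy.run()
```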

Project Context and Temporary Workspace

An executable is always launched from a particular project. All jobs descended from the resulting origin job inherit its project context. Project context is significant in several ways.

Usage Charges

The project is billed for all usage charges resulting from the execution of both the origin job and all its child jobs.

Project Permissions

When launching an executable from within a project, a user must have "CONTRIBUTE" access to that project. This enables the origin job, when outputting data objects, to place them in the project.

By default, an applet has "VIEW" permission to the project in which it resides. An app, meanwhile, has no default project permission setting.

Applets and apps can require, in their access requirements, that they be given specific permissions to the projects of any user launching them.
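
For illustration, such requirements are declared in the "access" block of the executable's specification. A hypothetical example, expressed as the Python dictionary that would be included when building the executable:

```python
# Hypothetical access requirements: UPLOAD access to the project context,
# VIEW access to the launching user's other projects, and full network access.
access_spec = {
    "project": "UPLOAD",
    "allProjects": "VIEW",
    "network": ["*"],
}
```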

Temporary Workspaces

Jobs running as part of the same execution - an origin or master job and all its descendants - always share the same temporary workspace. This workspace is a container on the Platform for objects that the executable can read from and write to.

Note that these temporary workspaces are distinct from the local disk that each job receives on its worker node. Jobs must explicitly upload data to the Platform in order to share it with other jobs, or deliver it as output.

Temporary workspaces behave like projects, except they cannot be explicitly created or destroyed, and their permissions are fixed. See Data Containers for more about Platform data containers. See Containers for Execution for specifics on the types of containers involved in app and applet execution.

Jobs always receive "CONTRIBUTE" permission to their temporary workspace. When provided as inputs to an executable, data objects, and all hidden objects to which they link, are cloned into the workspace before the executable begins running. If any of these objects reside in projects other than the one in which the executable is being run, the user or job launching the executable must have "VIEW" access to those other projects, and those other projects must not have the "RESTRICTED" flag set. Upon completion of the job, output objects are cloned into the project from which the executable was launched, and the workspace is destroyed.
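
A minimal sketch of this pattern inside a Python applet (the output field name is illustrative):

```python
import dxpy

@dxpy.entry_point("main")
def main():
    # Files on the worker's local disk are invisible to other jobs until
    # uploaded; upload_local_file places them in the temporary workspace.
    with open("result.txt", "w") as fh:
        fh.write("done\n")
    result = dxpy.upload_local_file("result.txt")

    # Returning a link marks the object as job output; when the origin job
    # finishes, it is cloned from the workspace into the project context.
    return {"report": dxpy.dxlink(result)}

dxpy.run()
```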

Data Object State and Job Input and Output

A job cannot start until the data objects it takes as input are ready. Because the system must clone these objects into the job's temporary workspace, the job will not start until each of them is in the "closed" state. Likewise, at the conclusion of a run, if an origin or master job is to output any objects, its state remains "waiting_on_output" until all of its output objects have transitioned to the "closed" state.
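
For example, with the dxpy bindings, an object's state can be inspected or waited on before it is used as input (the file ID is a placeholder):

```python
import dxpy

f = dxpy.DXFile("file-xxxx")
print(f.describe()["state"])   # "open", "closing", or "closed"

# Block until the file reaches the "closed" state and can serve as job input.
f.wait_on_close(timeout=600)
```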

Example: Inputs from Different Projects

If an executable's inputs reside in projects other than the one from which it was launched, the user or job launching it must have "VIEW" access to those projects, and the inputs are cloned into the temporary workspace before the job starts, as described above.

Example: Chained Execution

If an applet, while running, launches another applet, the project context is carried forward, but a new workspace is created for the launched applet. The launched applet has "VIEW" access to the original project, and "CONTRIBUTE" access to its own workspace - but no access to the workspace of the applet that launched it. When the launched applet is done, any objects it outputs are cloned back into the workspace of the parent applet.

Consider an example in which Applet1 produces Object C as output, and then provides Object C as an input when launching Applet2.

Because Applet1 was launched from Project A, both Applet1's jobs and Applet2's jobs have "VIEW" access to Project A. But Applet1's jobs have no permissions to the temporary workspace used by Applet2's jobs, nor do Applet2's jobs have any permissions to the temporary workspace used by Applet1's jobs.

Note that if Applet2 were an app rather than an applet, its jobs would have no access to Project A. If an app launches an applet, meanwhile, the app's permissions define the maximum access level at which the applet can be run. Thus the applet, in this scenario, would have no access to the project context.
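
In dxpy terms, Applet1's code might look like the following sketch (the applet ID and field names are hypothetical):

```python
import dxpy

@dxpy.entry_point("main")
def main():
    # Applet1 produces Object C and uploads it to its temporary workspace.
    with open("object_c.txt", "w") as fh:
        fh.write("intermediate result\n")
    object_c = dxpy.upload_local_file("object_c.txt")

    # Launching Applet2 carries the project context forward, but Applet2's
    # jobs receive a fresh temporary workspace of their own.
    child = dxpy.DXApplet("applet-xxxx").run({"input_c": dxpy.dxlink(object_c)})

    # Applet2's output is cloned back into this job's workspace on completion.
    return {"final": child.get_output_ref("output_c")}

dxpy.run()
```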
