DNAnexus Documentation

Copyright 2025 DNAnexus

Describing Data Objects

Last updated 2 years ago

You can describe objects (files, app(let)s, and workflows) on the DNAnexus platform using the command dx describe.

Describing an Object by Name

From the command line interface (CLI), you can describe an object by its DNAnexus platform name, given as a path.

Describe an Object With a Relative Path

Objects can be described relative to the user's current directory on the DNAnexus platform. In the following example, we describe the indexed reference genome file human_g1k_v37.bwa-index.tar.gz.

$ dx describe "Original files/human_g1k_v37.bwa-index.tar.gz"
Result 1:
ID                file-xxxx
Class             file
Project           project-xxxx
Folder            /Original files
Name              human_g1k_v37.bwa-index.tar.gz
State             closed
Visibility        visible
Types             -
Properties        -
Tags              -
Outgoing links    -
Created           ----
Created by        Amy
 via the job      job-xxxx
Last modified     ----
archivalState     "live"
Size              3.21 GB

NOTE: The entire path is enclosed in quotes due to the space in the folder name Original files. Instead of quotes, you can escape special characters with the \ character: dx describe Original\ files/human_g1k_v37.bwa-index.tar.gz.
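
To see why the quoted and the backslash-escaped forms in the note are interchangeable, the short sketch below splits both command lines with Python's shlex module, which follows POSIX shell word-splitting rules. It is only an illustration of shell quoting, not part of the dx toolkit; the variable names are made up for this example.

```python
import shlex

# The two spellings from the note above: quoted path vs. backslash-escaped space.
quoted = 'dx describe "Original files/human_g1k_v37.bwa-index.tar.gz"'
escaped = r'dx describe Original\ files/human_g1k_v37.bwa-index.tar.gz'

# shlex.split applies POSIX-style word splitting, like the shell does
# before handing the arguments to dx.
quoted_args = shlex.split(quoted)
escaped_args = shlex.split(escaped)

print(quoted_args)
print("same arguments:", quoted_args == escaped_args)
```

In both cases dx receives the path as one argument with a literal space in it, which is why either spelling works.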

Describe an Object in a Different Project Using an Absolute Path

$ dx select "My Research Project"
$ dx describe Reference\ Genome\ Files:H.\ Sapiens\ -\ GRCh37\ -\ b37\ \(1000\ Genomes\ Phase\ I\)/human_g1k_v37.fa.gz
Result 1:
ID                file-xxxx
Class             file
Project           project-xxxx
Folder            /H. Sapiens - GRCh37 - b37 (1000 Genomes Phase I)
Name              human_g1k_v37.fa.gz
State             closed
Visibility        visible
Types             -
Properties        -
Tags              -
Outgoing links    -
Created           ----
Created by        Amy
 via the job      job-xxxx
Last modified     ----
archivalState     "live"
Size              810.45 MB
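
The command above addresses the file as project name, colon, then the folder and file name. As a rough sketch, assuming you build such commands in a script, the hypothetical helper absolute_path below assembles that string and lets shlex.join take care of shell quoting, so spaces and parentheses need no manual escaping. The helper is an illustration only and is not provided by the dx toolkit.

```python
import shlex

def absolute_path(project, folder, name):
    """Hypothetical helper: build a 'project:folder/name' path for dx."""
    inner = folder.strip("/")  # drop the leading slash shown in the Folder field
    return f"{project}:{inner}/{name}" if inner else f"{project}:{name}"

path = absolute_path(
    "Reference Genome Files",
    "/H. Sapiens - GRCh37 - b37 (1000 Genomes Phase I)",
    "human_g1k_v37.fa.gz",
)

# shlex.join quotes each argument so the result is safe to paste into a shell.
command = shlex.join(["dx", "describe", path])
print(command)
```

The printed command wraps the whole path in single quotes, which the shell treats the same way as the backslash-escaped form.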

Describe an Object Using Object ID

Objects can be described using a unique object ID.

In this example, we describe workflow object "Exome Analysis Workflow" using its ID. This workflow is publicly available in the "Exome Analysis Demo" project.

$ dx describe "Exome Analysis Demo":workflow-G409jQQ0bZ46x5GF4GXqKxZ0
Result 1:
ID                  workflow-G409jQQ0bZ46x5GF4GXqKxZ0
Class               workflow
Project             project-BQfgzV80bZ46kf6pBGy00J38
Folder              /
Name                Exome Analysis Workflow
....
Stage 0             bwa_mem_fastq_read_mapper
  Executable        app-bwa_mem_fastq_read_mapper/2.0.1
Stage 1             fastqc
  Executable        app-fastqc/3.0.1
Stage 2             gatk4_bqsr
  Executable        app-gatk4_bqsr_parallel/2.0.1
Stage 3             gatk4_haplotypecaller
  Executable        app-gatk4_haplotypecaller_parallel/2.0.1
Stage 4             gatk4_genotypegvcfs
  Executable        app-gatk4_genotypegvcfs_single_sample_parallel/2.0.0

Because a workflow bundles a lot of information (multiple app(let)s, inputs and outputs, and default parameters), its dx describe output can be long and hard to scan.

Manipulating Outputs

The output from a dx describe command can be used for various purposes. The optional argument --json will convert the output from dx describe into JSON format for advanced scripting and command line use.

In this example, we will describe the publicly available workflow object "Exome Analysis Workflow" and return the output in JSON format.

$ dx describe "Exome Analysis Demo":workflow-G409jQQ0bZ46x5GF4GXqKxZ0 --json
  {
    "project": "project-BQfgzV80bZ46kf6pBGy00J38",
    "name": "Exome Analysis Workflow",
    "inputSpec": [
      {
        "name": "bwa_mem_fastq_read_mapper.reads_fastqgzs",
        "class": "array:file",
        "help": "An array of files, in gzipped FASTQ format, with the first read mates to be mapped.",
        "patterns": [ "*.fq.gz", "*.fastq.gz" ],
        ...
      },
      ...
    ],
    "stages": [
      {
        "id": "bwa_mem_fastq_read_mapper",
        "executable": "app-bwa_mem_fastq_read_mapper/2.0.1",
        "input": {
          "genomeindex_targz": {
            "$dnanexus_link": {
              "project": "project-BQpp3Y804Y0xbyG4GJPQ01xv",
              "id": "file-FFJPKp0034KY8f20F6V9yYkk"
            }
          }
        },
        ...
      },
      {
        "id": "fastqc",
        "executable": "app-fastqc/3.0.1",
        ...
      }
      ...
    ]
  }
$ dx describe "Exome Analysis Demo":workflow-G409jQQ0bZ46x5GF4GXqKxZ0 --json | jq .stages
[{
    "id": "bwa_mem_fastq_read_mapper",
    "executable": "app-bwa_mem_fastq_read_mapper/2.0.1",
  ...
  }, {
    "id": "fastqc",
    "executable": "app-fastqc/3.0.1",
  ...
  }, {
    "id": "gatk4_bqsr",
    "executable": "app-gatk4_bqsr_parallel/2.0.1",
  ...
  }
  ...
}]
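
If you would rather process the JSON in a script, the same data can be read with Python's standard json module. The sketch below is an assumption-laden illustration: describe_json is a hypothetical wrapper that shells out to dx describe --json (it needs the dx CLI and is not called here), and SAMPLE is a trimmed copy of the output above with the elided parts left out.

```python
import json
import subprocess

def describe_json(path):
    """Hypothetical wrapper: run `dx describe PATH --json` and parse the result."""
    result = subprocess.run(
        ["dx", "describe", path, "--json"],  # path is one argv item, so no shell quoting
        capture_output=True, text=True, check=True,
    )
    return json.loads(result.stdout)

# Trimmed copy of the output shown above; elided ("...") parts are omitted.
SAMPLE = """
{
  "project": "project-BQfgzV80bZ46kf6pBGy00J38",
  "name": "Exome Analysis Workflow",
  "stages": [
    {
      "id": "bwa_mem_fastq_read_mapper",
      "executable": "app-bwa_mem_fastq_read_mapper/2.0.1",
      "input": {
        "genomeindex_targz": {
          "$dnanexus_link": {
            "project": "project-BQpp3Y804Y0xbyG4GJPQ01xv",
            "id": "file-FFJPKp0034KY8f20F6V9yYkk"
          }
        }
      }
    },
    {"id": "fastqc", "executable": "app-fastqc/3.0.1"}
  ]
}
"""

workflow = json.loads(SAMPLE)
stage_ids = [stage["id"] for stage in workflow["stages"]]

def linked_files(stage):
    """Map each stage input name to the object ID behind its $dnanexus_link."""
    links = {}
    for input_name, value in stage.get("input", {}).items():
        if isinstance(value, dict) and "$dnanexus_link" in value:
            link = value["$dnanexus_link"]
            # Defensive: accept the nested form shown above or a bare ID string.
            links[input_name] = link["id"] if isinstance(link, dict) else link
    return links

print(stage_ids)
print(linked_files(workflow["stages"][0]))
```

With the dx CLI installed and a logged-in session, describe_json could replace json.loads(SAMPLE); the rest of the code would stay the same.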

We can output the "executable" value of each stage present in the "stages" value of the dx describe output above using the command below.

$ dx describe "Exome Analysis Demo":workflow-G409jQQ0bZ46x5GF4GXqKxZ0 --json | jq '.stages | map(.executable) | .[]'
  "app-bwa_mem_fastq_read_mapper/2.0.1"
  "app-fastqc/3.0.1"
  "app-gatk4_bqsr_parallel/2.0.1"
  "app-gatk4_haplotypecaller_parallel/2.0.1"
  "app-gatk4_genotypegvcfs_single_sample_parallel/2.0.0"

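The jq filter above has a direct counterpart in Python: a list comprehension over the stages. The sketch below uses the five executable values printed above as sample data and additionally splits each one into app name and version at the slash. The split_executable helper is hypothetical and assumes the name/version layout seen in this output.

```python
import json

# The "executable" values printed above, wrapped as a minimal stages array.
stages_json = """[
  {"executable": "app-bwa_mem_fastq_read_mapper/2.0.1"},
  {"executable": "app-fastqc/3.0.1"},
  {"executable": "app-gatk4_bqsr_parallel/2.0.1"},
  {"executable": "app-gatk4_haplotypecaller_parallel/2.0.1"},
  {"executable": "app-gatk4_genotypegvcfs_single_sample_parallel/2.0.0"}
]"""

stages = json.loads(stages_json)

# Counterpart of: jq '.stages | map(.executable) | .[]'
executables = [stage["executable"] for stage in stages]

def split_executable(executable):
    """Hypothetical helper: split 'app-name/version' into (name, version).

    If there is no slash, the version comes back as an empty string.
    """
    name, _, version = executable.partition("/")
    return name, version

versions = dict(split_executable(e) for e in executables)
for name, version in versions.items():
    print(f"{name}\t{version}")
```
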
General Response Fields Overview

Field name       Objects                   Description
ID               All                       Unique ID assigned to a DNAnexus object.
Class            All                       DNAnexus object type.
Project          All                       Container where the object is stored.
Folder           All                       Objects inside a container (project) can be organized into folders. An object can exist at only one path within a project.
Name             All                       Object name on the platform.
State            All                       Status of the object on the platform.
Visibility       All                       Whether the object is visible to the user through the platform web interface.
Tags             All                       Set of tags associated with an object. Tags are strings used to organize or annotate objects.
Properties       All                       Key/value pairs attached to the object.
Outgoing links   All                       JSON reference to another object on the platform. Linked objects are copied along with the object if the object is cloned to another project.
Created          All                       Date and time the object was created.
Created by       All                       DNAnexus user who created the object. Contains the subfield "via the job" if the object was created as a result of an app or applet.
Last modified    All                       Date and time the object was last modified.
Input Spec       App(let)s and Workflows   App(let) or workflow input names and classes. With workflows, the corresponding applet stage ID is also provided.
Output Spec      App(let)s and Workflows   App(let) or workflow output names and classes. With workflows, the corresponding applet stage ID is also provided.

Note on absolute paths: Objects can be described using an absolute path, which prefixes the path with the project name and a colon. This allows us to describe objects outside the current project context. In the absolute-path example above, we dx select the project "My Research Project" and then dx describe the file human_g1k_v37.fa.gz in the "Reference Genome Files" project.

Note on jq: We can parse, process, and query the JSON output using jq. In the Manipulating Outputs examples above, we pipe the dx describe --json output through jq .stages to generate a list of all stages in the exome analysis pipeline.
