DNAnexus Documentation

Copyright 2025 DNAnexus


Command Line Quickstart

Learn to use the dx client for command-line access to the full range of DNAnexus Platform features.

Last updated 3 months ago


You must set up billing for your account before you can perform an analysis, or upload or egress data. Follow the instructions for setting up billing.

The dx command-line client is included in the DNAnexus SDK (dx-toolkit). You can use the dx client to log into the Platform; to upload, browse, and organize data; and to launch analyses.

All the projects and data referenced in this Quickstart are publicly available, so you can follow along step-by-step.

Getting Help

As you work, you can use the index of dx commands as a reference.

At the command line, you can also enter dx help to see a list of commands, broken down by category. To see a list of commands from a particular category, enter dx help <category>.

To learn what a particular command does, enter dx help <command>, dx <command> -h, or dx <command> --help. For example, enter dx help ls to learn about the command dx ls:

$ dx help ls
usage: dx ls [-h] [--color {off,on,auto}] [--delimiter [DELIMITER]]
[--env-help] [--brief | --summary | --verbose] [-a] [-l] [--obj]
[--folders] [--full]
[path]

List folders and/or objects in a folder
... # output truncated for brevity

Before You Begin

To use the command-line interface (CLI), make sure you've installed the DNAnexus Software Development Kit (SDK).

Upgrading the SDK

Step 1: Log In

$ dx login
Acquiring credentials from https://auth.dnanexus.com
Username: <your username>
Password: <your password>

No projects to choose from.  You can create one with the command "dx new project".  To pick from projects for which you only have VIEW permissions, use "dx select --level VIEW" or "dx select --public".
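
When an interactive prompt is inconvenient, for example in a script, dx login can also take an authentication token generated on the Platform. The sketch below wraps this in a small function. The function name and the MY_DX_TOKEN environment variable are assumptions made for this example and are not defined by the SDK.

```shell
#!/usr/bin/env bash
# Log in without a username/password prompt, using a Platform-generated token.
# MY_DX_TOKEN is a hypothetical variable name used only in this sketch.
login_with_token() {
    local token="${MY_DX_TOKEN:-}"
    if [ -z "$token" ]; then
        echo "MY_DX_TOKEN is not set" >&2
        return 1
    fi
    # --noprojects skips the project picker that normally follows a login.
    dx login --token "$token" --noprojects
}
```

After exporting the token into MY_DX_TOKEN, calling login_with_token should leave you logged in with no project selected; you can then pick one with dx select.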

Step 2: Explore

Public Projects

Let's look inside some of the public projects that have already been set up. From the command line, enter the command:

$ dx select --public --name "Reference Genome Files*"

You will never be charged for DNAnexus-sponsored data, so you can copy data from this project however many times you'd like, free of charge.

$ dx ls
C. Elegans - Ce10/
D. melanogaster - Dm3/
H. Sapiens - GRCh37 - b37 (1000 Genomes Phase I)/
H. Sapiens - GRCh37 - hs37d5 (1000 Genomes Phase II)/
H. Sapiens - GRCh38/
H. Sapiens - hg19 (Ion Torrent)/
H. Sapiens - hg19 (UCSC)/
M. musculus - mm10/
M. musculus - mm9/
$ dx ls "C. Elegans - Ce10/"
ce10.bt2-index.tar.gz
ce10.bwa-index.tar.gz
... # output truncated for brevity

You can avoid typing out the full name of the folder by typing in dx ls C and then pressing <TAB>. The folder name will auto-complete from there.

You don't have to be in a project to inspect its contents. You can also look into another project, and a folder within the project, by giving the project name or ID, followed by a colon (:) and the folder path. Here, we list the contents of the publicly available project "Demo Data" using both its name and ID.

$ dx ls "Demo Data:/SRR100022/"
SRR100022_1.filt.fastq.gz
SRR100022_2.filt.fastq.gz
$ dx ls -l "project-BQbJpBj0bvygyQxgQ1800Jkk:/SRR100022/"
Project: Demo Data (project-BQbJpBj0bvygyQxgQ1800Jkk)
Folder : /SRR100022
State   Last modified       Size     Name (ID)
... # output truncated for brevity

As shown above, you can use the -l flag in conjunction with dx ls to list more details about files, such as the time a file was last modified, its size (if applicable), and its full DNAnexus ID.

Describing DNAnexus Objects

Besides describing data and projects (examples of which are shown below), you can also describe apps, jobs, and users.

Describing a File

Below, we describe the reference genome file for C. elegans located in the "Reference Genome Files: AWS US (East)" project that we've been using (which should be accessible from other regions as well). Note that the project name must be followed by a colon (:), and the colon inside the name itself is escaped with a backslash; here, that gives Reference Genome Files\: AWS US (East):.

$ dx describe "Reference Genome Files\: AWS US (East):/C. Elegans - Ce10/ce10.fasta.gz"
Result 1:
ID                  file-BQbY9Bj015pB7JJVX0vQ7vj5
Class               file
Project             project-BQpp3Y804Y0xbyG4GJPQ01xv
Folder              /C. Elegans - Ce10
Name                ce10.fasta.gz
State               closed
Visibility          visible
Types               -
Properties          Assembly=UCSC ce10,
                    Origin=http://hgdownload.cse.ucsc.edu/goldenPath/ce10/bigZip
                    s/ce10.2bit, Species=Caenorhabditis elegans, Taxonomy
                    ID=6239
Tags                -
Outgoing links      -
Created             Tue Sep 30 18:54:35 2014
Created by          bhannigan
 via the job        job-BQbY8y80KKgP380QVQY000qz
Last modified       Thu Mar  2 12:17:27 2017
Media type          application/x-gzip
archivalState       "live"
Size                29.21 MB, sponsored by DNAnexus
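
Escaping colons by hand is easy to get wrong when project names are stored in variables. As a rough sketch, the helper below builds the project:/path form used above by escaping every colon inside the project name. The function name dx_project_path is hypothetical and not part of dx-toolkit.

```shell
#!/usr/bin/env bash
# Build a "project:/path" argument, escaping colons inside the project name.
# dx_project_path is a hypothetical helper, not part of dx-toolkit.
dx_project_path() {
    local project="$1" path="$2"
    # Replace every ":" in the project name with "\:".
    local escaped="${project//:/\\:}"
    printf '%s:%s\n' "$escaped" "$path"
}

dx_project_path "Reference Genome Files: AWS US (East)" "/C. Elegans - Ce10/ce10.fasta.gz"
```

The printed string can be passed directly to a command, for example dx describe "$(dx_project_path "$project" "$path")".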

Describing a Project

Below, we describe the publicly available Reference Genome Files project that we've been using.

$ dx describe "Reference Genome Files\: AWS US (East):"
Result 1:
ID                  project-BQpp3Y804Y0xbyG4GJPQ01xv
Class               project
Name                Reference Genome Files: AWS US (East)
Summary             
Billed to           org-dnanexus
Access level        VIEW
Region              aws:us-east-1
Protected           true
Restricted          false
Contains PHI        false
Created             Wed Oct  8 16:42:53 2014
Created by          tnguyen
Last modified       Tue Oct 23 14:15:59 2018
Data usage          0.00 GB
Sponsored data      519.77 GB
Sponsored egress    0.00 GB used of 0.00 GB total
Tags                -
Properties          -
downloadRestricted  false
defaultInstanceType "mem2_hdd2_x2"

Step 3: Create Your Own Project

$ dx new project "My First Project"
Created new project called "My First Project"
(project-xxxx)
Switch to new project now? [y/N]: y

You're now ready to start uploading your data and running your own analyses.
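
In a script, you usually want the new project's ID rather than the interactive prompt shown above. The following is a minimal sketch that relies on the --brief flag to print only the ID and then selects the project by that ID. The function name is an assumption made for this example.

```shell
#!/usr/bin/env bash
# Create a project, make it the current project, and print its ID.
# new_project_and_select is a hypothetical helper name.
new_project_and_select() {
    local name="$1" project_id
    # --brief prints only the ID of the newly created project.
    project_id=$(dx new project "$name" --brief) || return 1
    # Select by ID so that similarly named projects cannot be confused.
    dx select "$project_id" >/dev/null || return 1
    printf '%s\n' "$project_id"
}
```

For example, project_id=$(new_project_and_select "My First Project") stores the project-xxxx ID for use in later commands.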

Step 4: Upload and Manage Your Data

$ dx upload --wait small-celegans-sample.fastq
[===========================================================>] Uploaded (16801690 of 16801690 bytes) 100% small-celegans-sample.fastq
ID              file-xxxx
Class           file
Project         project-xxxx
Folder          /
Name            small-celegans-sample.fastq
State           closed
Visibility      visible
Types           -
Properties      -
Tags            -
Details         {}
Outgoing links  -
Created         Sun Jan  1 09:00:00 2017
Created by      amy
Last modified   Sun Jan  1 09:00:00 2017
Media type      text/plain
Size            16.02 MB

If you run the same command but add the flag --brief, only the file ID (in the form of file-xxxx) will be printed to the terminal. Other dx commands will also accept the --brief flag and will also report only object IDs.
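
To illustrate how --brief helps in scripts, the sketch below uploads each file given as an argument and prints a tab-separated line with the local name and the resulting file ID. The function name upload_all is an assumption made for this example.

```shell
#!/usr/bin/env bash
# Upload each local file and print "<local name><TAB><file ID>".
# upload_all is a hypothetical helper name.
upload_all() {
    local f id
    for f in "$@"; do
        # --wait blocks until the file is closed; --brief prints only its ID.
        id=$(dx upload "$f" --wait --brief) || return 1
        printf '%s\t%s\n' "$f" "$id"
    done
}
```

A call such as upload_all small-celegans-sample.fastq prints one line per file, which is convenient to save as a manifest.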

Examining Data

Let's run dx head on the file we just uploaded and use the -n flag to ask for the first 12 lines (the first 3 reads) of the FASTQ file.

$ dx head -n 12 small-celegans-sample.fastq
@SRR070372.1 FV5358E02GLGSF length=78
TTTTTTTTTTTTTTTTTTTTTTTTTTTNTTTNTTTNTTTNTTTATTTATTTATTTATTATTATATATATATATATATA
+SRR070372.1 FV5358E02GLGSF length=78
...000//////999999<<<=<<666!602!777!922!688:669A9=<=122569AAA?>@BBBBAA?=<96632
@SRR070372.2 FV5358E02FQJUJ length=177
TTTCTTGTAATTTGTTGGAATACGAGAACATCGTCAATAATATATCGTATGAATTGAACCACACGGCACATATTTGAACTTGTTCGTGAAATTTAGCGAACCTGGCAGGACTCGAACCTCCAATCTTCGGATCCGAAGTCCGACGCCCCCGCGTCGGATGCGTTGTTACCACTGCTT
+SRR070372.2 FV5358E02FQJUJ length=177
222@99912088>C<?7779@<GIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIC;6666IIIIIIIIIIII;;;HHIIE>944=>=;22499;CIIIIIIIIIIIIHHHIIIIIIIIIIIIIIIH?;;;?IIEEEEEEEEIIII77777I7EEIIEEHHHHHIIIIIIIIIIIIII
@SRR070372.3 FV5358E02GYL4S length=70
TTGGTATCATTGATATTCATTCTGGAGAACGATGGAACATACAAGAATTGTGTTAAGACCTGCATAAGGG
+SRR070372.3 FV5358E02GYL4S length=70
@@@@@DFFFFFHHHHHHHFBB@FDDBBBB=?::5555BBBBD??@?DFFHHFDDDDFFFDDBBBB<<410

Downloading Data

$ dx download small-celegans-sample.fastq
[                                                            ] Downloaded 0 byte
[===========================================================>] Downloaded 16.02 of
[===========================================================>] Completed 16.02 of 16.02 bytes (100%) small-celegans-sample.fastq

About Metadata

Files have different available fields for metadata, such as "properties" (key-value pairs) and "tags".
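
As a rough sketch of how tags and properties can be set from the command line, the helper below applies one tag and any number of key=value properties to a file using dx tag and dx set_properties. The function name, the tag, and the property values in the usage note are illustrative assumptions.

```shell
#!/usr/bin/env bash
# Attach one tag and optional key=value properties to a file.
# annotate_file is a hypothetical helper name.
annotate_file() {
    local file="$1" tag="$2"
    shift 2
    dx tag "$file" "$tag" || return 1
    # Remaining arguments are passed through as key=value properties.
    if [ "$#" -gt 0 ]; then
        dx set_properties "$file" "$@" || return 1
    fi
}
```

For example, annotate_file small-celegans-sample.fastq quickstart species=celegans tags the file and sets one property. You should then be able to find it again with dx find data --tag quickstart.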

Step 5: Analyze a Sample

Uploading Reads

If you have not yet done so, you can upload a FASTQ file for analysis.

$ dx upload small-celegans-sample.fastq --wait

Mapping Reads

Finding the App Name

If you don't know the command-line name of the app you would like to run, you have two options:

  1. Look up the app on its page on the platform, which tells you how to run it from the command line.

  2. Search for apps from the command line by running the command dx find apps. The name to use on the command line appears in parentheses.

$ dx find apps
...
x BWA-MEM FASTQ Read Mapper (bwa_mem_fastq_read_mapper), v1.4.0
...
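
If you want only the command-line names, you can filter the listing. The sketch below assumes each result line ends with "(name), vX.Y.Z" as in the sample output above; the exact output format may differ between SDK versions. The function name app_names is hypothetical.

```shell
#!/usr/bin/env bash
# Print only the parenthesized command-line names from "dx find apps" output.
# Assumes lines shaped like: Title (name), v1.4.0
app_names() {
    sed -n 's/^.*(\([^()]*\)), v[0-9].*$/\1/p'
}

# Demonstration on the sample line shown above:
printf '%s\n' 'x BWA-MEM FASTQ Read Mapper (bwa_mem_fastq_read_mapper), v1.4.0' | app_names
```

In practice you would pipe the real listing through it: dx find apps | app_names.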

Installing and Running the App

$ dx install bwa_mem_fastq_read_mapper
Installed the bwa_mem_fastq_read_mapper app
$ dx find apps --installed
BWA-MEM FASTQ Read Mapper (bwa_mem_fastq_read_mapper), v1.4.0
$ dx run bwa_mem_fastq_read_mapper
Entering interactive mode for input selection.

Input:   Reads (reads_fastqgz)
Class:   file
Enter file ID or path (<TAB> twice for compatible files in current directory,'?' for help)
reads_fastqgz[0]: <small-celegans-sample.fastq.gz>

Input:   BWA reference genome index (genomeindex_targz)
Class:   file

Suggestions:
project-BQpp3Y804Y0xbyG4GJPQ01xv://file-\* (DNAnexus Reference Genomes)
Enter file ID or path (<TAB> twice for compatible files in current
directory,'?' for more options)
genomeindex_targz: <"Reference Genome Files\: <REGION_OF_PROJECT>:/C. Elegans - Ce10/ce10.bwa-index.tar.gz">

Select an optional parameter to set by its # (^D or <ENTER> to finish):

[0] Reads (right mates) (reads2_fastqgz)
[1] Add read group information to the mappings (required by downstream GATK)? (add_read_group) [default=true]
[2] Read group id (read_group_id) [default={"$dnanexus_link": {"input": "reads_fastqgz", "metadata": "name"}}]
[3] Read group platform (read_group_platform) [default="ILLUMINA"]
[4] Read group platform unit (read_group_platform_unit) [default="None"]
[5] Read group library (read_group_library) [default="1"]
[6] Read group sample (read_group_sample) [default="1"]
[7] Output all alignments for single/unpaired reads? (all_alignments)
[8] Mark shorter split hits as secondary? (mark_as_secondary) [default=true]
[9] Advanced command line options (advanced_options)

Optional param #: <ENTER>

Using input JSON:
{
    "reads_fastqgz": {
        "$dnanexus_link": {
            "project": "project-B3X8bjBqqBk1y7bVPkvQ0001",
            "id": "file-B3P6v02KZbFFkQ2xj0JQ005Y"
        }
    },
    "genomeindex_targz": {
        "$dnanexus_link": {
            "project": "project-xxxx(project ID for the reference genome in your region)",
            "id": "file-BQbYJpQ09j3x9Fj30kf003JG"
        }
    }
}

Confirm running the applet/app with this input [Y/n]: <ENTER>
Calling app-BP2xVx80fVy0z92VYVXQ009j with output destination
     project-xxxx:/

Job ID: job-xxxx

Monitoring Your Job

$ dx find jobs
* BWA-MEM FASTQ Read Mapper (bwa_mem_fastq_read_mapper:main)(done) job-xxxx
user-amy 20xx-xx-xx 0x:00:00 (runtime 0:00:xx)
$ dx describe job-xxxx
...

There are also additional options that you can use to restrict your search of previous jobs, such as by their names or when they were run.
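
For instance, the following sketch combines a name pattern, a job state, and a creation-time window. The function name and the default window of one day are assumptions made for this example; run dx find jobs -h to see the full set of filters in your SDK version.

```shell
#!/usr/bin/env bash
# List IDs of finished jobs whose name matches a pattern and that were
# created within a recent time window (default: the last day).
# recent_done_jobs is a hypothetical helper name.
recent_done_jobs() {
    local name_pattern="$1" since="${2:--1d}"
    # A negative value with a unit suffix (for example -1d) is relative to now.
    dx find jobs --name "$name_pattern" --state done \
        --created-after="$since" --brief
}
```

For example, recent_done_jobs "BWA*" -2d should print the IDs of matching jobs that finished successfully and were created in the last two days.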

Terminating Your Job

After Your Job Finishes

You should now see two new files in your project: the mapped reads in a BAM file, and an index of that BAM file with a .bai extension. You can refer to the output file by name or by the job that produced it using the syntax job-xxxx:<output field>. Try it yourself with the job ID you got from calling the BWA-MEM app!

$ dx ls
small-celegans-sample.bam
small-celegans-sample.bam.bai
small-celegans-sample.fastq
$ dx describe small-celegans-sample.bam
...
$ dx describe job-xxxx:sorted_bam
...

Variant Calling

This time, we won't rely on the interactive mode to enter our inputs. Instead, we will provide them directly. But first, let's look up the app's spec so we know what the inputs are called. For this, let's run the command dx run freebayes -h.

$ dx run freebayes -h
usage: dx run freebayes [-iINPUT_NAME=VALUE ...]

App: FreeBayes Variant Caller

Calls variants (SNPs, indels, and other events) using FreeBayes

See the app page for more information:
https://platform.dnanexus.com/app/freebayes

Inputs:
    Sorted mappings: -isorted_bams=(file) [-isorted_bams=... [...]]
        One or more coordinate-sorted BAM files containing mappings to call
        variants for.

    Genome: -igenome_fastagz=(file)
        A file, in gzipped FASTA format, with the reference genome that the
        reads were mapped against.
...

Optional inputs are shown using square brackets ([]) around the command-line syntax for each input. You'll notice that there are two required inputs that must be specified:

  1. Sorted mappings (sorted_bams): A list of files with a .bam extension.

  2. Genome (genome_fastagz): A reference genome in FASTA format that has been gzipped.

You can also run dx describe freebayes for a more compact view of the input and output specifications. By default, it will hide the advanced input options, but you can view them using the --verbose flag.

Running the App with a One-Liner Using a Job-Based Object Reference

It is sometimes more convenient to run apps using a single one-line command. You can do this by specifying all the necessary inputs either via the command line or in a prepared file. We will use the -i flag to specify inputs as suggested by the output of dx run freebayes -h:

  • genome_fastagz: The ce10 genome in the Reference Genomes project.

To specify new job input using the output of a previous job, we'll use a job-based object reference (JBOR) via the job-xxxx:<output field> syntax we used earlier.

You can use job-based object references as input even before the referenced jobs have finished. The system will simply wait until the input is ready to begin the new job.

Replace the job ID below with that generated by the BWA app you ran earlier. The -y flag skips the input confirmation.

$ dx run freebayes -y \
 -igenome_fastagz=Reference\ Genome\ Files:/C.\ Elegans\ -\ Ce10/ce10.fasta.gz \
 -isorted_bams=job-xxxx:sorted_bam

Using input JSON:
{
  "genome_fastagz": {
    "$dnanexus_link": {
      "project": "project-xxxx",
      "id": "file-xxxx"
    }
  },
  "sorted_bams": {
    "field": "sorted_bam", 
    "job": "job-xxxx"
  } 
}

Calling app-BFG5k2009PxyvYXBBJY00BK1 with output destination
project-xxxx:/

Job ID: job-xxxx
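
The same inputs can also be supplied in a prepared file instead of -i flags. The sketch below writes such a file for the FreeBayes inputs described above, using a file ID for the genome and a job-based object reference for the BAM. The function name and the output file name are assumptions made for this example.

```shell
#!/usr/bin/env bash
# Write an input JSON file for the freebayes app.
# write_freebayes_input is a hypothetical helper name.
#   $1: file ID of the gzipped reference FASTA
#   $2: ID of the BWA job whose sorted_bam output should be used
#   $3: path of the JSON file to write
write_freebayes_input() {
    local genome_file_id="$1" bwa_job_id="$2" out_file="$3"
    cat > "$out_file" <<EOF
{
  "genome_fastagz": {"\$dnanexus_link": "$genome_file_id"},
  "sorted_bams": [
    {"\$dnanexus_link": {"job": "$bwa_job_id", "field": "sorted_bam"}}
  ]
}
EOF
}
```

After calling write_freebayes_input file-xxxx job-xxxx input.json, you can launch the app with dx run freebayes -f input.json -y.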

Automatically Running a Command After a Job Finishes

$ dx wait job-xxxx && dx find jobs
Waiting for job-xxxx to finish running...
Done
* FreeBayes Variant Caller (done) job-xxxx
user-amy 2017-01-01 09:00:00 (runtime 0:05:24)
...

Congratulations! You have now called variants on a reads sample, and you did it all on the command line. Now let's look at how you can automate this process.

Automation

The beauty of the CLI is the ability to automate processes. In fact, we can automate everything we just did. The following script assumes that you've already logged in. It is hardcoded to use the ce10 genome and takes a local gzipped FASTQ file as its command-line argument.

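#!/usr/bin/env bash
# Usage: <script_name.sh> local_fastq_filename.fastq.gz
# Uploads the reads, maps them with BWA-MEM, calls variants with FreeBayes,
# and downloads the resulting VCF. Assumes you have already run dx login.
# Input names can be confirmed with dx run <app name> -h.
set -euo pipefail

reference="Reference Genome Files\: AWS US (East):/C. Elegans - Ce10/ce10.fasta.gz"
bwa_indexed_reference="Reference Genome Files\: AWS US (East):/C. Elegans - Ce10/ce10.bwa-index.tar.gz"
local_reads_file="$1"

# --brief makes each command print only the ID of the object or job it creates.
reads_file_id=$(dx upload "$local_reads_file" --brief)
bwa_job=$(dx run bwa_mem_fastq_read_mapper -ireads_fastqgzs="$reads_file_id" -igenomeindex_targz="$bwa_indexed_reference" -y --brief)
# The job-based object reference lets FreeBayes be queued before BWA finishes.
freebayes_job=$(dx run freebayes -isorted_bams="$bwa_job":sorted_bam -igenome_fastagz="$reference" -y --brief)

dx wait "$freebayes_job"

dx download "$freebayes_job":variants_vcfgz -o "$local_reads_file".vcf.gz
gunzip "$local_reads_file".vcf.gz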

Learn More

To update your version of the command-line tool, you can run the command dx upgrade.

The first thing you'll need to do is to log in. If you haven't created a DNAnexus account yet, visit the website and sign up. User signup is not supported on the command line.

Your authentication token and your current project settings have now been saved in a local configuration file, and you're ready to start accessing your project.

You can generate an authentication token from the online DNAnexus Platform using the UI.

By running the command dx select and picking a project, you've now done the command-line equivalent of going to the project page for Reference Genome Files: AWS US (East) (platform login required) on the website. This is a DNAnexus-sponsored project containing popular genomes for you to use when running analyses with your own data.

For more information about the dx select command, please see the Changing Your Current Project page.

Now you can list all of the data in the top-level directory of the project you've just selected by running the command dx ls. You can also see the contents of a folder by running the command dx ls <folder_name>.

You can use the command dx describe to learn more about files and other objects on the platform. Given a DNAnexus object ID or name, dx describe will return detailed information about the object in question. dx describe will only return results for data objects to which you have access.

Now, we'll use the command dx new project to create a new project.

The text project-xxxx denotes a placeholder for a unique, immutable project ID. For more information about object IDs, see the Entity IDs page.

The dx new command can also create other new objects, including new orgs or users. Use the command dx help new to see additional information. The full list of dx commands is provided in the Index of dx Commands.

If you have a sample you would like to analyze, you can use the command dx upload, or the Upload Agent if you have installed it. For the purposes of this tutorial, you can also download the file small-celegans-sample.fastq, which represents the first 25,000 C. elegans reads from SRR070372. We will use this file again later to run through a sample analysis.

For uploading multiple or large files, we strongly recommend that you use the Upload Agent; it compresses your files, uploads them in parallel over multiple HTTP connections, and supports other features such as resumable uploads.

The command dx upload --wait small-celegans-sample.fastq uploads the small-celegans-sample.fastq file into the current directory of the current project. The --wait flag tells dx upload to wait until it has finished uploading the data before returning the prompt and describing the result.

To take a quick look at the first few lines of the file you just uploaded, use the dx head command. By default, it prints the first 10 lines of the given file.

If you'd like to download a file from the platform, just use the dx download command. This command will use the name of the file for the filename unless you specify your own with the -o/--output flag. In this Quickstart's example, we download the same C. elegans file that we uploaded previously.

For the next few steps, if you would like to follow along, you will need a C. elegans FASTQ file. We will map the reads against the ce10 genome. If you haven't already, you can download and use the following FASTQ file, which contains the first 25,000 reads from SRR070372: small-celegans-sample.fastq.

You can also substitute your own reads file for a different species (though it may take longer to run through the example). For your convenience, DNAnexus has already imported a variety of reference genomes to the platform. If you have your own FASTA file that you would like to use, you can upload the file and create genome indices for BWA using the BWA FASTA Indexer app (platform login required).

The following walkthrough is helpful if you would like to understand what all the commands do and take a look at what apps you're running. If you're just interested in converting a gzipped FASTQ file to a VCF file via BWA and the FreeBayes variant caller, you can skip ahead to the Automation section, where you can see all the commands necessary for running apps.

For more information about using the command dx upload, please see the dx upload page.

Next, use the BWA-MEM app (platform login required) to map the uploaded reads file to a reference genome.

You can navigate to its web page from the Apps page (platform login required) on the platform. The app's page will tell you how to run it from the command line. You can find more information about the app we're running on the BWA-MEM FASTQ Read Mapper page (platform login required).

Now install the app using dx install and check that it has been installed. While you do not always need to install an app to run it, you may find it useful as a bookmarking tool.

We can now run the app using dx run. We will run it without any arguments; it will then prompt us for required and then optional arguments. Note that the reference file genomeindex_targz for the C. elegans sample we are using is in .tar.gz format and can be found in the Reference Genome Files project for the region your project is in.

You can use the command dx watch to monitor jobs. The command prints the log of the job, including the STDOUT, STDERR, and INFO printouts.

You can also use the command dx describe job-xxxx to learn more about your job. If you don't know the job's ID, you can use the command dx find jobs to list all the jobs run in the current project, along with the user who ran them, their status, and when they began.

If for some reason you need to terminate your job before it completes, use the command dx terminate.

You can use the FreeBayes Variant Caller app (platform login required) to call variants on your BAM file.

sorted_bams: The output of the previous BWA step (see the Mapping Reads section for more information).

You can use the command dx wait to wait for a job to finish. For example, running dx wait job-xxxx && dx find jobs right after launching the FreeBayes app shows the recent jobs only after the job has finished.

You're now ready to start scripting using dx. As shown in some of the examples above, the --brief flag can come in handy for scripting. A list of all dx commands and flags is on the Index of dx Commands page.

For more detailed information about running apps and applets from the command line, see the Running Apps and Applets page.

For a comprehensive guide to the DNAnexus SDK, see the SDK documentation.

Want to start writing your own apps? Check out the Developer Portal for some useful tutorials.
