Symlinks

Use Symlinks to access, work with, and modify files that are stored on an external cloud service.

A license is required to use Symlinks. Contact DNAnexus Sales for more information.

Overview

The DNAnexus Symlinks feature enables users to link external data files stored in AWS S3 or Azure blob storage as objects on the Platform, and to use those objects as though they were native DNAnexus file objects.

No storage costs are incurred when using symlinked files on the Platform. When used by jobs, symlinked files are downloaded to the Platform at runtime.

DNAnexus validates the integrity of symlinked files on the DNAnexus Platform using recorded MD5 checksums, but it cannot control or monitor changes made to these files in a customer's cloud storage. It is each customer's responsibility to safeguard files from modification, removal, and security breaches while they are in the customer's cloud storage.

Quickstart

Symlinked files stored in AWS S3 or Azure blob storage are made accessible on DNAnexus via a Symlink Drive. The drive contains the necessary cloud storage credentials and can be created by following Step 1 below.

Step 1. Create a Symlink Drive

To set up Symlink Drives, use the CLI to provide the following information:

  • A name for the Symlink Drive

  • The cloud service (AWS or Azure) where your files are stored

  • The access credentials required by the service

AWS

dx api drive new '{
    "name" : "<drive_name>",
    "cloud" : "aws",
    "credentials" : {
        "accessKeyId" : "<my_aws_access_key>",
        "secretAccessKey" : "<my_aws_secret_access_key>"
    }
}'

Azure

dx api drive new '{
    "name" : "<drive_name>",
    "cloud" : "azure",
    "credentials" : {
        "account" : "<my_azure_storage_account_name>",
        "key" : "<my_azure_storage_access_key>"
    }
}'

After you've entered the appropriate command, a new drive object will be created. You'll see a confirmation message that includes the id of the new Symlink Drive in the format drive-xxxx.
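
The dx api command prints the raw JSON response returned by the API. As an illustrative sketch (the ID shown is a placeholder, not a real drive), the response for a successful drive creation should look similar to:

{"id": "drive-xxxx"}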

Step 2. Linking a Project with a Symlink Drive

By associating a DNAnexus Platform project with a Symlink Drive, you can both:

  • Have all new project files automatically uploaded to the AWS S3 bucket or Azure blob to which the Drive links

  • Enable project members to work with those files

Note that "new project files" includes all of the following:

  • Newly created files

  • File outputs from jobs

  • Files uploaded to the project

Note that non-symlinked files cloned into a symlinked project will not be uploaded to the linked AWS S3 bucket or Azure blob.

Linking a New Project with a Symlink Drive via the UI

When creating a new project via the UI, you can link it with an existing Symlink Drive by toggling the Enable Auto-Symlink in This Project setting to "On".

Next:

  • In the Symlink Drive field, select the drive with which the project should be linked

  • In the Container field, enter the name of the AWS S3 bucket or Azure blob where newly created files should be stored

  • Optionally, in the Prefix field, enter the name of a folder within the AWS S3 bucket or Azure blob where these files should be stored

Linking a New Project with a Symlink Drive via the CLI

When creating a new project via the CLI, you can link it to a Symlink Drive by using the optional argument --default-symlink with dx new project. See the dx new project documentation for details on inputs and input format.

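The sketch below is a minimal example; the project name, drive ID, container name, and prefix are placeholders, and the exact JSON keys should be checked against the dx new project documentation:

dx new project "My Symlinked Project" --default-symlink '{
    "drive" : "drive-xxxx",
    "container" : "<my_s3_bucket_or_azure_blob>",
    "prefix" : "/uploads"
}'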
Step 3. Enable CORS

In order to ensure that files can be saved to your AWS S3 bucket or Azure blob, you must enable CORS (cross-origin resource sharing) for that remote storage container.

Enabling CORS for an AWS S3 bucket

Use the following JSON object when configuring CORS for the bucket:

[
    {
        "AllowedHeaders": [
            "Content-Length",
            "Origin",
            "Content-MD5",
            "accept",
            "content-type"
        ],
        "AllowedMethods": [
            "PUT",
            "POST"
        ],
        "AllowedOrigins": [
            "https://*"
        ],
        "ExposeHeaders": [
            "Retry-After"
        ],
        "MaxAgeSeconds": 3600
    }
]
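
One way to apply this configuration, assuming you use the AWS CLI and a placeholder bucket name, is to save the rules to a file wrapped in a CORSRules object and pass that file to aws s3api put-bucket-cors:

# cors.json contains: {"CORSRules": [ ...the JSON array above... ]}
aws s3api put-bucket-cors --bucket <my_s3_bucket> --cors-configuration file://cors.json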

Refer to Amazon documentation for guidance on enabling CORS for an S3 bucket.

Enabling CORS for Azure Blob Storage

Refer to Microsoft documentation for guidance on enabling CORS for Azure Storage.
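
As a sketch, assuming the Azure CLI and placeholder account credentials, a comparable rule set can be added with az storage cors add:

az storage cors add --services b \
    --methods PUT POST \
    --origins 'https://*' \
    --allowed-headers 'Content-Length' 'Origin' 'Content-MD5' 'accept' 'content-type' \
    --exposed-headers 'Retry-After' \
    --max-age 3600 \
    --account-name <my_azure_storage_account_name> \
    --account-key <my_azure_storage_access_key>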

Working with Symlinked Files

Working with symlinked files is largely the same as working with files that are stored natively on the Platform. These files can, for example, be used as inputs to apps, applets, or workflows.
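
For instance, a symlinked file can be passed to an app like any other file input. The sketch below assumes the Swiss Army Knife app and a placeholder file path:

dx run app-swiss-army-knife \
    -iin="/symlinked/file.txt" \
    -icmd="wc -l file.txt"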

Renaming Symlinks

If you rename a symlink on DNAnexus, this does not change the name of the file in S3 or Azure blob storage. For example, if a symlink originally named file.txt is renamed to Example File, the remote filename, as shown in the Remote Path field in the right-side info pane, remains file.txt.
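
On the CLI, renaming a symlink works like renaming any data object; a minimal sketch with placeholder names:

# Renames the symlink within the project only; the remote object keeps its name
dx mv file.txt "Example File"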

Deleting Symlinks

If you delete a symlink on the Platform, the file to which it points is not deleted.

Working with Symlink Drives

Note that when your cloud service access credentials change, you must update the definition of each Symlink Drive that links to the cloud service in question. See Updating Cloud Service Access Credentials.

Updating Cloud Service Access Credentials

If your cloud access credentials change, you must update the definition of every Symlink Drive that uses them in order to keep using the files to which those Drives provide access.

AWS

To update a drive definition with new AWS access credentials, use the following command:

dx api <driveID> update '{
    "credentials" : {
        "accessKeyId" : "<my_new_aws_access_key>",
        "secretAccessKey" : "<my_new_aws_secret_access_key>"
    }
}'

Azure

To update a drive definition with new Azure access credentials, use the following command:

dx api <driveID> update '{
    "credentials" : {
        "account" : "<my_azure_storage_account_name>",
        "key" : "<my_azure_storage_access_key>"
    }
}'

Learn More

For more information, see the API endpoints for working with Symlink Drives.

FAQ

What happens if I move a symlinked file from one folder to another, within a DNAnexus project? Will the file also mirror that move within the AWS S3 bucket or Azure blob?

No, the symlinked file will only move within the project. The change will not be mirrored in the linked S3 or Azure blob container.

What happens if I delete a symlinked file directly on S3 or Azure blob storage, and a job tries to access the symlinked object on DNAnexus?

The job will fail after it is unable to retrieve the source file.

Can I copy a symlinked file from one project to another and still retain access?

Yes, you can copy a symlinked file from one project to another. This includes copying symlinked files from a symlink-enabled project to a project without this feature enabled.
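
A minimal sketch of such a copy, with placeholder project names and file path:

dx cp "project-xxxx:/file.txt" "project-yyyy:/"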

Can I create a symlink in another region relative to my project’s region?

Yes, but egress charges will be incurred.

What if I upload a file to my auto-symlink-enabled project with a filename that already matches the name of a file in the S3 bucket or Azure blob linked to the project?

In this scenario, the uploaded file will overwrite, or "clobber," the file that shares its name, and only the newly uploaded file will be stored in the AWS S3 bucket or Azure blob.

This is true even if, within your project, you first renamed the symlinked file and uploaded a new file with the prior name. For example, if you upload a file named file.txt to your DNAnexus project, the file will be automatically uploaded to your S3 or Azure blob to the specified directory. If you then rename the file on DNAnexus from file.txt to file.old.txt, and upload a new file to the project called file.txt, the original file.txt that was uploaded to S3 or Azure blob will be overwritten. However, you will still be left with file.txt and file.old.txt symlinks in your DNAnexus project. Trying to access the original file.old.txt symlink will likely result in a checksum error.

What happens if I try to transfer billing responsibility of an auto-symlink-enabled project to someone else?

If the auto-symlink feature has been enabled for a project, billing responsibility for the project cannot be transferred. Attempting to do so via API call will return a PermissionDenied error.
