DXJupyterLab Reference

This page is a reference for the most useful operations and features in the DNAnexus JupyterLab environment.

DXJupyterLab is accessible to all users of the UK Biobank Research Analysis Platform and the Our Future Health Trusted Research Environment.

A license is required to access DXJupyterLab on the DNAnexus Platform. Contact DNAnexus Sales for more information.

Download Files from the Project to the Local Execution Environment

Bash

You can download input data from a project using dx download in a notebook cell:

%%bash
dx download input_data/reads.fastq

The %%bash keyword converts the whole cell into a magic cell, which allows you to run bash code in that cell without exiting the Python kernel. See more examples of magic commands in the IPython documentation. Prefixing the command with ! achieves the same result:

! dx download input_data/reads.fastq
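
A whole folder can be fetched recursively in the same way; a minimal sketch, where input_data/ is an assumed project folder:

%%bash
# Recursively download an (assumed) project folder into the session
dx download -r input_data/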

Alternatively, the dx command can be executed from the terminal.

Python

To download data with Python in the notebook, you can use the download_dxfile function:

import dxpy
dxpy.download_dxfile(dxid='file-xxxx',
                     filename='unique_name.txt')

Check the dxpy helper functions for details on how to download files and folders.
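
For example, an entire folder can be retrieved with download_folder; a minimal sketch, assuming the project contains an /input_data folder:

import dxpy

# Download everything under the (assumed) /input_data project folder
# into a local directory of the same name.
dxpy.download_folder(dxpy.PROJECT_CONTEXT_ID,
                     destdir='input_data',
                     folder='/input_data')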

Upload Data from the Session to the Project

Bash

Any files from the execution environment can be uploaded to the project using dx upload:

%%bash
dx upload Readme.ipynb

Python

To upload data using Python in the notebook, you can use the upload_local_file function:

import dxpy
dxpy.upload_local_file('variants.vcf')

Check the dxpy helper functions for details on how to upload files and folders.
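
Optional arguments can control the destination and whether the call waits for the file to be closed; a minimal sketch, assuming a /results folder exists in the project:

import dxpy

# Upload into the (assumed) /results folder and block until the
# uploaded file has been closed and is ready to use.
dxpy.upload_local_file('variants.vcf',
                       folder='/results',
                       wait_on_close=True)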

Download and Upload Data to Your Local Machine

By selecting a notebook or any other file on your computer and dragging it into the DNAnexus project file browser, you can upload files directly to the project. To download a file, right-click on it and click Download (to local computer).

You may upload and download data to the local execution environment in a similar way, i.e. by dragging and dropping files into the execution environment's file browser or by right-clicking on the files there and clicking Download.

Use the Terminal

It is useful to have the terminal provided by JupyterLab at hand. It uses the bash shell by default and lets you execute shell scripts or interact with the Platform via the dx toolkit. For example, the command:

$ dx pwd
MyProject:/

will confirm what the current project context is.

Running pwd will show you that the working directory of the execution environment is /opt/notebooks. The JupyterLab server is launched from this directory, which is also the default location of the output files generated in the notebooks.
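
A typical terminal session mixes ordinary shell commands with dx commands; a minimal sketch, in which the folder and file names are hypothetical:

$ pwd
/opt/notebooks
$ dx ls                     # list the current project folder
input_data/
Readme.ipynb
$ dx cd input_data          # change the project working directory
$ dx download reads.fastq   # fetch a file into /opt/notebooks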

To open a terminal window, go to File > New > Terminal or open it from the Launcher (using the "Terminal" box at the bottom). To open a Launcher, select File > New Launcher.

Install Custom Packages in the Session Environment

You can install packages with pip, conda, apt-get, and other package managers in the execution environment from the notebook:

%%bash
pip install torch
pip install torchvision
conda install -c conda-forge opencv

By creating a snapshot, you can start subsequent sessions with these packages pre-installed by providing the snapshot as input.
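
For example, a previously saved environment snapshot might be supplied when launching the app; the snapshot path below is a hypothetical placeholder:

dx run dxjupyterlab -isnapshot="/.Notebook_snapshots/my_environment.tar.gz"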

Access Public and Private Github Repositories from the JupyterLab Terminal

You can access public GitHub repositories from the JupyterLab terminal using the git clone command. By placing a private SSH key that's registered with your GitHub account in /root/.ssh/id_rsa, you can clone private GitHub repositories using git clone and push any changes back to GitHub using git push from the JupyterLab terminal.

Below is a screenshot of a JupyterLab session with a terminal displaying a script (sketched below) that:

  • sets up an SSH key to access a private GitHub repository and clones it,

  • clones a public repository,

  • downloads a JSON file from the DNAnexus project,

  • modifies an open-source notebook to convert the JSON file to CSV format,

  • saves the modified notebook to the private GitHub repository,

  • and uploads the result of the JSON-to-CSV conversion back to the DNAnexus project.

This animation shows the first part of the script in action:
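
A minimal sketch of such a script is shown below; the repository addresses, the key file, and the data file names are hypothetical placeholders, and it assumes the private SSH key was stored in the project as id_rsa:

# Set up the SSH key and clone a private repository (placeholder names)
mkdir -p /root/.ssh
dx download id_rsa -o /root/.ssh/id_rsa
chmod 600 /root/.ssh/id_rsa
ssh-keyscan github.com >> /root/.ssh/known_hosts
git clone git@github.com:my-org/private-notebooks.git

# Clone a public repository
git clone https://github.com/my-org/public-notebooks.git

# Download a JSON file from the project, run the modified notebook that
# converts it to CSV, and upload the result back to the project
dx download records.json
papermill private-notebooks/json_to_csv.ipynb output_notebook.ipynb
dx upload records.csv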

Run Notebooks Non-Interactively

A command can be run in the JupyterLab Docker container without starting an interactive JupyterLab server. To do that, provide the cmd input and additional input files using the in input file array. The command will run in the directory where the JupyterLab server is started and notebooks are run, i.e. /opt/notebooks/. Any output files generated in this directory will be uploaded to the project and returned in the out output.

The cmd input makes it possible to use the papermill tool, pre-installed in the JupyterLab environment, which executes notebooks non-interactively. For example, to execute all the cells in a notebook and produce an output notebook:

my_cmd="papermill notebook.ipynb output_notebook.ipynb"
dx run dxjupyterlab -icmd="$my_cmd" -iin="notebook.ipynb"

where notebook.ipynb is the input notebook to "papermill", which needs to be passed in the "in" input, and output_notebook.ipynb is the name of the output notebook, which will store the result of the cells' execution. The output will be uploaded to the project at the end of the app execution.
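
papermill can also inject parameter values into the notebook before execution; a minimal sketch, assuming notebook.ipynb defines an alpha parameter in its parameters cell:

my_cmd="papermill notebook.ipynb output_notebook.ipynb -p alpha 0.05"
dx run dxjupyterlab -icmd="$my_cmd" -iin="notebook.ipynb"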

If the snapshot parameter is specified, execution of cmd will take place in the specified Docker container. The duration argument will be ignored when running the app with cmd. To limit the runtime, the app can be run from the command line with the --extra-args flag, e.g. dx run dxjupyterlab --extra-args '{"timeoutPolicyByExecutable": {"app-xxxx": {"*": {"hours": 1}}}}'.

If cmd is not specified, the in parameter will be ignored and the output of an app will consist of an empty array.

Use newer NVIDIA GPU-accelerated software

If you are trying to use newer NVIDIA GPU-accelerated software, you may find that the NVIDIA kernel-mode driver (nvidia.ko), which is installed outside of the DXJupyterLab environment, does not support the newer CUDA version your application requires. Using NVIDIA Forward Compatibility, you can install packages that provide the newer CUDA version by following the steps below in a DXJupyterLab terminal.

# nvidia-smi
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.129.06   Driver Version: 470.129.06   CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+
// Let's upgrade CUDA 11.4 to 12.5
# apt-get update
# apt-get -y install cuda-toolkit-12-5 cuda-compat-12-5
# echo /usr/local/cuda/compat > /etc/ld.so.conf.d/nvidia-compat.conf
# ldconfig
# nvidia-smi
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.256.02   Driver Version: 470.256.02   CUDA Version: 12.5     |
|-------------------------------+----------------------+----------------------+
// CUDA 12.5 is now usable from terminal and notebooks
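
As a quick check, the newly installed toolkit can be queried from the same terminal; the path assumes the 12.5 toolkit was installed under /usr/local/cuda-12.5:

# /usr/local/cuda-12.5/bin/nvcc --version
// Reports "Cuda compilation tools, release 12.5" when the installation succeeded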

Session Inactivity

If you are away from the JupyterLab browser tabs for 15 to 30 minutes, you will be automatically logged out of the JupyterLab session, and the JupyterLab tabs will display a "Server Connection Error" message. You can re-enter the session by reloading the JupyterLab webpage and logging in to the Platform, which will redirect you back to the JupyterLab session.

