DXJupyterLab Reference
This page is a reference for the most useful operations and features in the DNAnexus JupyterLab environment.
%%bash
dx download input_data/reads.fastq
The %%bash keyword converts the whole cell to a magic cell, which allows us to run bash code in that cell without exiting the Python kernel. See more examples of magic commands in the IPython documentation. The ! prefix achieves the same result:
! dx download input_data/reads.fastq
import dxpy
dxpy.download_dxfile(dxid='file-xxxx',
filename='unique_name.txt')
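When a notebook is re-run within the same session, it can be convenient to skip files that were already downloaded. The following is a minimal sketch of that idea, not part of the platform API: the helper name fetch_inputs and its download parameter are hypothetical, and the only dxpy call it relies on is dxpy.download_dxfile as used above.

```python
from pathlib import Path


def fetch_inputs(file_ids, dest_dir=".", download=None):
    """Download platform files into dest_dir, skipping files already present.

    file_ids maps a local file name to a DNAnexus file ID (e.g. "file-xxxx").
    download is the function that performs the transfer. It defaults to
    dxpy.download_dxfile and is a parameter only so that the helper can be
    tried out without platform access (hypothetical convenience).
    """
    if download is None:
        import dxpy  # imported lazily; assumed available in the session
        download = dxpy.download_dxfile

    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)

    fetched = []
    for name, file_id in file_ids.items():
        target = dest / name
        if target.exists():
            continue  # already downloaded earlier in this session
        download(file_id, str(target))
        fetched.append(str(target))
    return fetched
```

Calling fetch_inputs({"unique_name.txt": "file-xxxx"}) in a notebook would download the file on the first run and return an empty list on later runs, as long as the local copy is still present.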
%%bash
dx upload Readme.ipynb
import dxpy
dxpy.upload_local_file('variants.vcf')
By selecting a notebook or any other file on your computer and dragging it into the DNAnexus project file browser, you can upload the files directly to the project. To download a file, right-click on it and click Download (to local computer).
You may upload and download data to the local execution environment in a similar way, i.e. by dragging and dropping files to the execution file browser or by right-clicking on the files there and clicking Download.
It is useful to have a terminal provided by JupyterLab at hand. It uses the bash shell by default and lets you execute shell scripts or interact with the platform via the dx toolkit. For example, the command:
$ dx pwd
MyProject:/
will confirm what the current project context is.
Running pwd will show you that the working directory of the execution environment is /opt/notebooks. The JupyterLab server is launched from this directory, which is also the default location of the output files generated in the notebooks.
To open a terminal window, go to File > New > Terminal or open it from the Launcher (using the "Terminal" box at the bottom). To open a Launcher, select File > New Launcher.
You can install pip, conda, apt-get, and other packages in the execution environment from the notebook:
%%bash
pip install torch
pip install torchvision
conda install -c conda-forge opencv
By creating a snapshot, you can start subsequent sessions with these packages pre-installed by providing the snapshot as input.
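When a session may or may not have been started from a snapshot, a notebook can check which modules are already importable before installing anything. This is a minimal sketch using only the Python standard library. The helper names missing_modules and pip_install are hypothetical, and the check uses import names, which can differ from package names (for example, the opencv package is imported as cv2).

```python
import importlib.util
import subprocess
import sys


def missing_modules(module_names):
    """Return the names from module_names that cannot be imported."""
    return [
        name for name in module_names
        if importlib.util.find_spec(name) is None
    ]


def pip_install(packages):
    """Install packages with the pip that belongs to the running kernel."""
    subprocess.check_call([sys.executable, "-m", "pip", "install", *packages])
```

In a notebook cell, missing_modules(["torch", "torchvision"]) returns an empty list when both modules are already present, for instance after starting from a snapshot. Otherwise the returned names can be passed to pip_install.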
You can access public GitHub repositories from the JupyterLab terminal using the git clone command. By placing a private SSH key that's registered with your GitHub account in /root/.ssh/id_rsa, you can clone private GitHub repositories using git clone and push any changes back to GitHub using git push from the JupyterLab terminal.
Below is a screenshot of a JupyterLab session with a terminal displaying a script that:
- sets up an SSH key to access a private GitHub repository and clones it,
- clones a public repository,
- downloads a JSON file from the DNAnexus project,
- modifies an open-source notebook to convert the JSON file to CSV format,
- saves the modified notebook to the private GitHub repository,
- and uploads the results of the JSON to CSV conversion back to the DNAnexus project.

This animation shows the first part of the script in action:

A command can be run in the JupyterLab Docker container without starting an interactive JupyterLab server. To do that, provide the cmd input and additional input files using the in input file array. The command will run in the directory where the JupyterLab server is started and notebooks are run, i.e. /opt/notebooks/. Any output files generated in this directory will be uploaded to the project and returned in the out output.
The cmd input makes it possible to use papermill, a tool pre-installed in the JupyterLab environment that executes notebooks non-interactively. For example, to execute all the cells in a notebook and produce an output notebook:
my_cmd="papermill notebook.ipynb output_notebook.ipynb"
dx run dxjupyterlab -icmd="$my_cmd" -iin="notebook.ipynb"
where notebook.ipynb is the input notebook to "papermill", which needs to be passed in the "in" input, and output_notebook.ipynb is the name of the output notebook, which will store the result of the cells' execution. The output will be uploaded to the project at the end of the app execution.
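If such runs are launched from a script, the same invocation can be assembled in Python so that the quoting of the cmd value is handled by the standard library. This is only a sketch: the helper name build_papermill_run is hypothetical, and it builds the argument list without running it.

```python
import shlex


def build_papermill_run(notebook, output_notebook, app="dxjupyterlab"):
    """Return the argument list for a non-interactive papermill run of the app."""
    # The whole papermill command line is passed as a single value of cmd.
    papermill_cmd = shlex.join(["papermill", notebook, output_notebook])
    return ["dx", "run", app, f"-icmd={papermill_cmd}", f"-iin={notebook}"]


argv = build_papermill_run("notebook.ipynb", "output_notebook.ipynb")
# Print a copy-pasteable shell command equivalent to the argument list.
print(shlex.join(argv))
```

Passing argv to subprocess.run would launch the job, assuming the dx toolkit is on the PATH and a project is selected.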
If the snapshot parameter is specified, execution of cmd will take place in the specified Docker container. The duration argument will be ignored when running the app with cmd. To limit the runtime, run the app from the command line with the --extra-args flag, e.g.:
dx run dxjupyterlab --extra-args '{"timeoutPolicyByExecutable": {"app-xxxx": {"*": {"hours": 1}}}}'
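The nested JSON passed to --extra-args is easy to mistype by hand. As a sketch, it can be generated with the standard json module instead. The helper name timeout_extra_args is hypothetical, and "app-xxxx" is the same placeholder as in the command above, to be replaced with the real app ID.

```python
import json
import shlex


def timeout_extra_args(executable_id, hours):
    """Return the JSON string for --extra-args that limits runtime to `hours`."""
    policy = {
        "timeoutPolicyByExecutable": {
            executable_id: {"*": {"hours": hours}}
        }
    }
    return json.dumps(policy)


extra_args = timeout_extra_args("app-xxxx", 1)
argv = ["dx", "run", "dxjupyterlab", "--extra-args", extra_args]
# Print a shell-quoted version of the command for inspection.
print(shlex.join(argv))
```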
If cmd is not specified, the in parameter will be ignored and the output of the app will consist of an empty array.
If you are away from the JupyterLab browser tabs for 15 to 30 minutes, you will be automatically logged out of the JupyterLab session and the JupyterLab tabs will display a "Server Connection Error" message. You can re-enter the JupyterLab session by reloading the JupyterLab webpage and logging into the platform, which will redirect you back to the JupyterLab session.