Running DXJupyterLab

Learn how to launch a JupyterLab session on the DNAnexus Platform using the DXJupyterLab app.

DXJupyterLab is accessible to all users of the UK Biobank Research Analysis Platform and the Our Future Health Trusted Research Environment.

For DNAnexus Platform users, a license is required to access DXJupyterLab. Contact DNAnexus Sales for more information.

Running from the UI

1. Select Tools > JupyterLab from the Main Menu

If you have used DXJupyterLab before, the page will display a list of your previous sessions run across different projects.

2. Click on the New JupyterLab Button in the Top Right Corner

This will open a window from which you can start a new JupyterLab environment. In this window, you can configure your session, e.g. specify its name, select an instance type, and choose the project in which JupyterLab should be started.

If a snapshot file is provided, a DXJupyterLab environment saved previously will be loaded from that file. A snapshot tarball file can be created when running a JupyterLab session.

Snapshots created using older versions of DXJupyterLab are incompatible with the current version. See these guidelines if you need to use an older DXJupyterLab snapshot.

You can adjust the duration of the session, after which the environment will automatically shut down. Based on this duration and the instance type, an estimated price is shown in the bottom-left corner (if you have access to the billing information for the selected project).

If you select Enable Spark Cluster, a JupyterLab environment with a standalone Spark cluster will be started. With this option, you can also set the number of nodes in the cluster. This number includes the master (one node) and the worker nodes.

The available feature options are PYTHON_R, ML, IMAGE_PROCESSING, and STATA. Selecting the PYTHON_R feature (the default option) loads the environment with Python 3 and R kernels and interpreters. Selecting the ML feature loads the environment with Python 3 and machine learning packages such as TensorFlow, PyTorch, and CNTK, as well as the image processing package Nipype, but it does not contain R. Selecting the IMAGE_PROCESSING feature loads the environment with Python 3 and image processing packages such as Nipype, FreeSurfer, and FSL, but it does not contain R.

The FreeSurfer package requires a license to run. Details about license creation and usage can be found here. The STATA feature also requires a license to run. For a detailed list of libraries included in each of these feature options, see the in-product documentation.

3. Initiate the Session by Clicking Start Environment

The JupyterLab session first enters the "Initializing" state, in which it waits for the worker to spin up and for the JupyterLab server to start. Clicking the row corresponding to your session and then the i icon in the top right corner displays more information about the JupyterLab job.

4. Open a JupyterLab Environment in Your Browser When the State is Set to "Ready"

Once the JupyterLab server is running, the session state changes to Ready and the name of the session turns into a link. Clicking this link opens the JupyterLab environment page in your browser. You can also access the session directly via the URL https://job-xxxx.dnanexus.cloud, where job-xxxx is the ID of the DXJupyterLab job.

Running DXJupyterLab from the CLI

You can start the JupyterLab environment directly from the command line by running the app:

$ dx run app-dxjupyterlab

Once the app starts, you can check whether the JupyterLab server is ready to serve connections. This is indicated by the job's httpsAppState property being set to running. Once it is running, open your browser and go to:

https://job-xxxx.dnanexus.cloud

where job-xxxx is the ID of the job running the app.
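
The readiness check described above could be scripted roughly as follows. This is a minimal sketch rather than an official recipe. It assumes that jq is installed and that the job's properties appear under .properties in the output of dx describe --json. The helper names jupyterlab_url and wait_for_jupyterlab are hypothetical, as are the default polling interval and retry count.

```shell
# Build the session URL from a job ID, following the URL pattern given above.
jupyterlab_url() {
  printf 'https://%s.dnanexus.cloud\n' "$1"
}

# Poll a job until its httpsAppState property is "running", then print the URL.
# Arguments: job ID, seconds between polls (default 30), max polls (default 40).
# Assumption: properties are exposed under .properties in `dx describe --json`.
wait_for_jupyterlab() {
  local job_id="$1"
  local interval="${2:-30}"
  local max_tries="${3:-40}"
  local i=0
  local state=""
  while [ "$i" -lt "$max_tries" ]; do
    state="$(dx describe "$job_id" --json | jq -r '.properties.httpsAppState // empty')"
    if [ "$state" = "running" ]; then
      jupyterlab_url "$job_id"
      return 0
    fi
    i=$((i + 1))
    sleep "$interval"
  done
  echo "Timed out waiting for $job_id to report httpsAppState=running" >&2
  return 1
}

# Example usage (requires a logged-in dx session):
#   job_id="$(dx run app-dxjupyterlab -y --brief)"
#   wait_for_jupyterlab "$job_id"
```

The loop gives up after a fixed number of polls so that a failed or terminated job does not leave the script waiting indefinitely.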

In order to run the Spark version of the app, use the command:

$ dx run app-dxjupyterlab_spark_cluster

You can check the optional input parameters for the apps on the DNAnexus platform (platform login required to access the links):

From the CLI, you can learn more about dx run with the following command:

$ dx run -h APP_NAME

where APP_NAME is either app-dxjupyterlab or app-dxjupyterlab_spark_cluster.
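
When scripting against both variants, the choice between the two app names could be wrapped in a small helper. The sketch below is illustrative only: the function name dxjupyterlab_app_name and its "spark" argument are hypothetical conveniences, not part of the dx toolkit.

```shell
# Return the app name documented above for the requested variant.
# Pass "spark" for the Spark cluster version; anything else selects the standard app.
dxjupyterlab_app_name() {
  if [ "${1:-}" = "spark" ]; then
    echo "app-dxjupyterlab_spark_cluster"
  else
    echo "app-dxjupyterlab"
  fi
}

# Example usage (requires a logged-in dx session):
#   dx run -h "$(dxjupyterlab_app_name spark)"
```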
