Cloud Workstation
Running the Cloud Workstation App using SSH.
The Cloud Workstation App (platform login required to access this link) sets up a job where you can SSH into the worker running the job and use the worker as a "Workstation" to explore and manipulate data stored on DNAnexus as you would on a local Linux machine.
The benefit of using this Workstation App as opposed to running on your local machine is that in the workstation, you will be able to access data stored on DNAnexus without downloading the files to your local machine and being constrained by your local internet bandwidth. In addition, you can configure the applet to launch more powerful instance types (virtual computer configurations) available to DNAnexus users. Any files or results you may want to save from your workstation session can simply be uploaded back into the project from which you launched your app.
The Cloud Workstation App can be run as is and provides basic functionality, including access to all your data and network access for downloading public tools.
If you would like to customize your Cloud Workstation experience, we also provide the source code of the app so you can build your own version of the workstation.
You can only give SSH access permissions and access the interactive worker via the command-line client. Download and install it if you have not done so already.
If you haven't already, you will need to configure your account to allow SSH connections using dx ssh_config. For more information on configuring your account and connecting to jobs, click here.
To run the Cloud Workstation App and SSH into the worker, navigate to the project you would like to work in. You will need CONTRIBUTE or ADMINISTER access to run the app in that project.
$ dx select "my-working-project"
Run the dx command shown below. The --ssh flag automatically configures the job to allow SSH access and connects to it after launching. The app takes a maximum session length as input, given as a duration with a unit suffix such as 3h.
$ dx run app-cloud_workstation --ssh
Select an optional parameter to set by its # (^D or <ENTER> to finish):
[0] Maximum Session Length (suffixes allowed: s, m, h, d, w, M, y) (max_session_length) [default="1h"]
[1] Files (fids)
Optional param #: 0
Input: Maximum Session Length (suffixes allowed: s, m, h, d, w, M, y) (max_session_length)
Class: string
Enter string value ('?' for more options)
max_session_length: 3h
Select an optional parameter to set by its # (^D or <ENTER> to finish):
[0] Maximum Session Length (suffixes allowed: s, m, h, d, w, M, y) (max_session_length) [="3h"]
[1] Files (fids)
Optional param #: <ENTER>
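If you already know the session length you want, the interactive prompt can be skipped by passing the input on the command line. The following is a minimal sketch: the function name launch_workstation is hypothetical, while -i (set an app input by name) and -y (skip the confirmation prompt) are standard dx run options.

```shell
# Launch the Cloud Workstation without the interactive input prompt.
# Usage: launch_workstation [SESSION_LENGTH]   (for example 3h; defaults to 1h)
# launch_workstation is a hypothetical helper name, not part of dx.
launch_workstation() {
  session_length="${1:-1h}"   # same default as shown in the prompt above
  # -i sets the max_session_length input; -y skips the confirmation step.
  dx run app-cloud_workstation --ssh -imax_session_length="$session_length" -y
}
```

For example, launch_workstation 3h requests the same three-hour session as the interactive walkthrough above.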
Upon confirmation of input, you will be connected to the worker running the cloud workstation app and shown the following message:
Calling app-cloud_workstation with output destination project-xxxx:/
Job ID: job-xxxx
Waiting for job-xxxx to start......
Resolving job hostname and SSH host key...........................
Checking connectivity to ec2-xx-xxx-xxx-xxx.compute-1.amazonaws.com...OK
Connecting to ec2-xx-xxx-xxx-xxx.compute-1.amazonaws.com
Welcome to DNAnexus!
This is the DNAnexus Execution Environment, running job-xxxx.
Job: Cloud Workstation
App: cloud_workstation:main
Instance type: mem1_ssd1_v2_x8
Project: emiai (project-xxxx)
Workspace: container-xxxx
Running since: Mon Nov 1 17:40:51 UTC 2021
Running for: 0:00:38
The public address of this instance is ec2-xx-xxx-xxx-xxx.compute-1.amazonaws.com.
You are running byobu, a terminal session manager.
If you get disconnected from this instance, you can log in again; your work will be saved as long as the job is running.
For more information on byobu, press F1.
The job is running in terminal 1. To switch to it, use the F4 key (fn+F4 on Macs; press F4 again to switch back to this terminal).
Use sudo to run administrative commands.
From this window, you can:
- Use the DNAnexus API with dx
- Monitor processes on the worker with htop
- Install packages with apt-get install or pip3 install
- Use this instance as a general-purpose Linux workstation
OS version: Ubuntu 20.04.3 LTS (GNU/Linux 5.4.0-1056-aws x86_64)
Due to differences in the execution environment, in order to upload or download files from your parent projects, you must first run the following two commands in your workstation:
$ unset DX_WORKSPACE_ID
$ dx cd $DX_PROJECT_CONTEXT_ID:
The first command unsets an environment variable that is set when the app is launched; unsetting it allows you to navigate into any of the projects you have access to. The second command uses dx cd to change the working directory of your workstation to the parent project (the only project your workstation has CONTRIBUTE access to). For more information about the environment variables in the job container, please visit the Execution Environment Reference.
The workstation should now be ready to use.
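Because both setup commands are needed in every new session, they can be wrapped in a small shell function. This is only a sketch: the name use_parent_project is hypothetical, and it relies on DX_PROJECT_CONTEXT_ID being set in the workstation, as the commands above do. It has to be defined as a function in your current shell rather than run as a separate script, since unset only affects the shell it runs in.

```shell
# Point dx at the parent project so uploads and downloads resolve there.
# use_parent_project is a hypothetical helper name, not part of dx.
use_parent_project() {
  if [ -z "${DX_PROJECT_CONTEXT_ID:-}" ]; then
    echo "DX_PROJECT_CONTEXT_ID is not set; is this shell inside the workstation?" >&2
    return 1
  fi
  # Step 1: drop the job workspace so dx can resolve paths in projects.
  unset DX_WORKSPACE_ID
  # Step 2: make the parent project the current working location.
  dx cd "${DX_PROJECT_CONTEXT_ID}:"
}
```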
For more information about how to use snapshots on the Cloud Workstation, see the app documentation here.
This app is configured to have VIEW access to all projects that the user running the app can access. This means that you can download any file you have access to on DNAnexus using the dx download command.
To download a file named my-file.txt from the parent project:
$ dx download my-file.txt
To download one set of reads from the SRR100022 exome from the public Demo Data project:
$ dx download project-BQbJpBj0bvygyQxgQ1800Jkk:/SRR100022/SRR100022_1.filt.fastq.gz
To download a file from a project other than the parent project, first navigate to that project and then download the file:
$ dx select --level=VIEW
Available projects (VIEW or higher):
0) Working Project (CONTRIBUTE)
1) Research Project (VIEW)
2) Production Project (VIEW)
[...]
Pick a numbered choice or "m" for more options [0]: 1
Setting current project to: Research Project
$ dx ls
my-file-1.txt
$ dx download my-file-1.txt
This app has network access, so you will be able to download any tool you may need during your session as you would on a Linux workstation. After downloading your tools, you can use the worker as a general-purpose workstation to manipulate and explore your data as needed.
If you would like to have your tools packaged into your workstation as it is launched, you can customize your own version of the Cloud Workstation Applet.
If you wish to save any files or results from your workstation session, you must upload the files back into the project from which the Cloud Workstation App was launched (the "parent project"). To allow you to do this, the Cloud Workstation App is given CONTRIBUTE access to the parent project.
If you have been navigating between projects and downloading files, use the --path option with dx upload to ensure that the files you created are uploaded to the correct project.
$ dx upload --path "$DX_PROJECT_CONTEXT_ID:" <FILE>
To perform a test upload, do the following:
$ dx ls
$ echo "This is a test file" > test_file_from_workstation.txt
$ dx upload --path "$DX_PROJECT_CONTEXT_ID:" test_file_from_workstation.txt
$ dx ls
You should see the contents of your project change between the first and second invocations of dx ls.
By default, your workstation will automatically shut down after the maximum session length. However, if you wish to terminate the workstation before the end of the session, use the dx terminate command with the job ID of this instance of the Cloud Workstation App, or terminate the job from the web platform.
$ dx terminate $DX_JOB_ID
The contents of your workstation will be destroyed upon termination (either manual termination or after the workstation has run for the maximum session length). Remember to upload any files you wish to save before the end of your session.
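Since terminating the job destroys anything that has not been uploaded, it can help to tie the two steps together. The sketch below uploads the named files to the parent project and terminates the job only if every upload succeeded. The function name save_and_terminate is hypothetical; the dx upload and dx terminate invocations are the ones shown above.

```shell
# Upload files to the parent project, then terminate this workstation job.
# save_and_terminate is a hypothetical helper name, not part of dx.
save_and_terminate() {
  # Check every file first so a typo does not leave a half-finished save.
  for f in "$@"; do
    if [ ! -f "$f" ]; then
      echo "Not a file, nothing uploaded or terminated: $f" >&2
      return 1
    fi
  done
  # Stop at the first failed upload and leave the job running.
  for f in "$@"; do
    dx upload --path "${DX_PROJECT_CONTEXT_ID}:" "$f" || return 1
  done
  dx terminate "$DX_JOB_ID"
}
```

For example, save_and_terminate results.txt notes.txt uploads both files (hypothetical names) before ending the session.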
By default, the Cloud Workstation App will launch on a mem1_ssd1_v2_x8 instance type, which has 8 cores, 16 GB memory, and 180 GB storage. To run the app on a different instance type, use the --instance-type flag for dx run.
$ dx run --instance-type mem1_ssd1_v2_x36 --ssh app-cloud_workstation
The Cloud Workstation App is set up to use Ubuntu 20.04.
When connecting to the execution environment, you are using the job's credentials to interact with the DNAnexus API. The job has a limited subset of your user's permissions. By default, jobs running the Cloud Workstation App have VIEW permissions to all projects in which you have VIEW permissions or greater.
The dx select command by default hides projects to which you only have VIEW permissions, so you will want to run dx select --level=VIEW in the execution environment to see those projects.
The Cloud Workstation App as provided offers the minimum functionality for an interactive workstation.
To make your own version of the applet, you can use dx-app-wizard to set up a source code template for your applet. To find the original source code for the app, run dx get app-cloud_workstation.
Some example customizations:
- Specify different inputs
- Prepackage external utilities for use within the worker
- Change the instance type of the worker
- Change the access permissions
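One possible workflow for such a customization is to fetch the app source, edit it, and build the result as an applet in your current project. The sketch below assumes that dx get writes the source to a directory named cloud_workstation in the current directory and that the settings you want to change live in that directory's dxapp.json; both are assumptions to verify against your own download. The function name build_custom_workstation is hypothetical.

```shell
# Fetch the Cloud Workstation source on the first call, build it on the next.
# build_custom_workstation is a hypothetical helper name, not part of dx.
# Assumption: dx get places the source in ./cloud_workstation.
build_custom_workstation() {
  src_dir="cloud_workstation"
  if [ ! -d "$src_dir" ]; then
    dx get app-cloud_workstation || return 1
    echo "Edit $src_dir/dxapp.json, then run this function again to build." >&2
    return 0
  fi
  # Build the edited source as an applet in the currently selected project.
  dx build "$src_dir"
}
```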