# Cloud Workstation

## Why Use Cloud Workstation

[Cloud Workstation](https://platform.dnanexus.com/app/cloud_workstation) sets up a virtual workstation that lets you access and work with data stored on the DNAnexus Platform as you would on a local Linux machine, without having to download files to your own computer.

You can configure this workstation to use any of the [wide range of powerful instance types available on the Platform](https://documentation.dnanexus.com/developer/api/running-analyses/instance-types). You can save files or analysis results from your workstation session to the project in which you launched it. You can also create a snapshot of the session if you want to resume work later.

## Launching Cloud Workstation

### Before You Begin

You need the `dx` command-line client to access the virtual workstation. [Download and install it](https://documentation.dnanexus.com/downloads) if you haven't already done so.

You also need to [set up SSH](https://documentation.dnanexus.com/apps/execution-environment/connecting-to-jobs#setting-up-your-environment-for-ssh-access) if you haven't already done so.

### Launching the App

From the command line, use the `dx` client to [navigate to the project](https://documentation.dnanexus.com/user/projects/project-navigation#changing-directly-to-a-specific-project) you'd like to work in.

{% hint style="info" %}
To run the Cloud Workstation app from within a project, you must have either CONTRIBUTE or ADMINISTER access to that project.
{% endhint %}

Next, use `dx run` to launch Cloud Workstation, taking care to add the `--ssh` flag:

```shell
dx run app-cloud_workstation --ssh
```

Once the app launches, you see a list of optional parameters:

```
Select an optional parameter to set by its # (^D or <ENTER> to finish):

 [0] Maximum Session Length (suffixes allowed: s, m, h, d, w, M, y) (max_session_length) [default="1h"]
 [1] Files (fids)
 [2] Snapshot (snapshot)
```

See the [in-product app documentation for more on setting these parameters](https://platform.dnanexus.com/app/cloud_workstation).
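If you prefer to skip the interactive prompt, you can pass inputs on the command line instead. The following is a minimal sketch, assuming that `dx run` accepts inputs through `-i<name>=<value>` and that `-y` skips the confirmation step. `launch_workstation` is a hypothetical helper name, and `4h` is only an illustrative value for the `max_session_length` parameter listed above.

```shell
# launch_workstation is a hypothetical helper, not part of the dx client.
# It passes max_session_length on the command line and uses -y to skip
# the confirmation prompt. The 1h fallback mirrors the app's default.
launch_workstation() {
  session_length="${1:-1h}"
  dx run app-cloud_workstation \
    --ssh \
    -imax_session_length="$session_length" \
    -y
}

# Example: request a four-hour session.
# launch_workstation 4h
```

Wrapping the command in a function keeps the session length as the only value you change between launches.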

Once you've either set or skipped the optional parameters, you are connected to the worker running Cloud Workstation. You see the following in your terminal:

```
Calling app-cloud_workstation with output destination
project-xxxx:/

Job ID: job-xxxx
Waiting for job-xxxx to start......
Resolving job hostname and SSH host key...........................
Checking connectivity to ec2-xx-xxx-xxx-xxx.compute-1.amazonaws.com...OK
Connecting to ec2-xx-xxx-xxx-xxx.compute-1.amazonaws.com

Welcome to DNAnexus!

This is the DNAnexus Execution Environment, running job-xxxx.
Job: Cloud Workstation
App: cloud_workstation:main
Instance type: mem1_ssd1_v2_x8
Project: emiai (project-xxxx)
Workspace: container-xxxx
Running since: Mon Nov  1 17:40:51 UTC 2024
Running for: 0:00:38
The public address of this instance is ec2-xx-xxx-xxx-xxx.compute-1.amazonaws.com.
You are running byobu, a terminal session manager.
If you get disconnected from this instance, you can log in again; your work will be saved as long as the job is running.
For more information on byobu, press F1.
The job is running in terminal 1. To switch to it, use the F4 key (fn+F4 on Macs; press F4 again to switch back to this terminal).
Use sudo to run administrative commands.
From this window, you can:
 - Use the DNAnexus API with dx
 - Monitor processes on the worker with htop
 - Install packages with apt-get install or pip3 install
 - Use this instance as a general-purpose Linux workstation
OS version: Ubuntu 24.04.2 LTS (GNU/Linux 5.15.0-1076-aws x86_64)
dnanexus@job-xxxx:~$
```

### Preparing Your Workstation

To access a file within your virtual workstation, you must download it from a Platform project to the workstation. If you want to save a file from your workstation session, you must upload it to the workstation's parent project.

To do either, you must prepare your workstation by running the following two commands:

```shell
unset DX_WORKSPACE_ID
dx cd $DX_PROJECT_CONTEXT_ID:
```

The first command unsets an environment variable that is set when the workstation launches. Unsetting it lets you navigate, from within the workstation, to any project to which you have access. The second command invokes [`dx cd`](https://documentation.dnanexus.com/user/helpstrings-of-sdk-command-line-utilities#select) to change your workstation's working directory to the root of the parent project.

{% hint style="info" %}
For more information about these and other environment variables used within the execution environment, see [Execution Environment Reference](https://documentation.dnanexus.com/apps/execution-environment#environment-variables-in-the-container).
{% endhint %}

Your virtual workstation is ready to use.
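The two preparation commands can be wrapped in a small function that also confirms the result. This is a sketch rather than part of the app: `prepare_workstation` is a hypothetical helper name, and it assumes `dx pwd` is available to print the current project and folder.

```shell
# prepare_workstation is a hypothetical helper wrapping the two commands above.
# Call it directly in your current shell (not in a subshell or pipeline)
# so that the unset takes effect for the rest of the session.
prepare_workstation() {
  if [ -z "${DX_PROJECT_CONTEXT_ID:-}" ]; then
    echo "DX_PROJECT_CONTEXT_ID is not set; are you inside the workstation?" >&2
    return 1
  fi
  unset DX_WORKSPACE_ID
  # Move to the root of the parent project, then print where we ended up.
  dx cd "$DX_PROJECT_CONTEXT_ID:" && dx pwd
}

# Example:
# prepare_workstation
```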

## Accessing Files

To access a file within your virtual workstation, you must download it to the workstation, using the [`dx download`](https://documentation.dnanexus.com/user/helpstrings-of-sdk-command-line-utilities#download) command.

This includes files stored in the workstation's parent project.

You can download any file from any project to which you have access.

To download a file named `my-file.txt` from the parent project, use the command:

```shell
dx download my-file.txt
```

To download a set of reads from the SRR100022 exome from the public [Demo Data project](https://platform.dnanexus.com/panx/projects/BQbJpBj0bvygyQxgQ1800Jkk/data/):

```shell
dx download project-BQbJpBj0bvygyQxgQ1800Jkk:/SRR100022/SRR100022_1.filt.fastq.gz
```

When downloading a file or files from a project other than the parent project, you might want to avoid having to enter the project ID. To do this, start by getting a list of projects to which you have access. Then choose a project:

```shell
$ dx select --level=VIEW

Available projects (VIEW or higher):
0) Working Project (CONTRIBUTE)
1) Research Project (VIEW)
2) Production Project (VIEW)

Pick a numbered choice or "m" for more options [0]: 1
Setting current project to: Research Project
```

You can then download a file from the selected project to your virtual workstation by entering only the filename:

```shell
dx download my-file-1.txt
```
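If you would rather keep downloads organized in a local directory without changing project context, a helper along these lines may be useful. It is a sketch under assumptions: `fetch_to_dir` is a hypothetical name, and it relies on the `-o` option of `dx download` accepting a directory as the output location.

```shell
# fetch_to_dir is a hypothetical helper: download one file, addressed by its
# full project path, into a local directory on the workstation.
fetch_to_dir() {
  remote_path="$1"   # for example project-xxxx:/folder/file.txt
  local_dir="$2"     # local directory; created if it does not exist
  mkdir -p "$local_dir" || return 1
  dx download "$remote_path" -o "$local_dir/"
}

# Example, using the Demo Data file shown earlier:
# fetch_to_dir project-BQbJpBj0bvygyQxgQ1800Jkk:/SRR100022/SRR100022_1.filt.fastq.gz reads
```

Because the remote path includes the project ID, the current project context stays on the parent project, which keeps later uploads simple.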

## Getting Additional Tools

Your virtual workstation has network access, so you can download and use any tool you need during your session, as you would on a local Linux workstation.

If you would like a tool or tools to be available within your workstation when it launches, you can [customize the Cloud Workstation app to enable this](#customizing-cloud-workstation).

## Saving Files

If you want to save any files from your workstation session, you must upload these files to the workstation's parent project, using [`dx upload`](https://documentation.dnanexus.com/user/helpstrings-of-sdk-command-line-utilities#upload).

If you changed project context while downloading files to your virtual workstation, you must use the `--path` flag with [`dx upload`](https://documentation.dnanexus.com/user/helpstrings-of-sdk-command-line-utilities#upload) to ensure files are uploaded to the correct project:

```shell
dx upload --path "$DX_PROJECT_CONTEXT_ID:" <FILE>
```

Use the following commands to perform a test upload:

```shell
dx ls
echo "This is a test file" > test_file_from_workstation.txt
dx upload --path "$DX_PROJECT_CONTEXT_ID:" test_file_from_workstation.txt
dx ls
```

The second `dx ls` should list the new file `test_file_from_workstation.txt`, which was absent from the first listing.
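To save an entire directory of results rather than a single file, a recursive upload is one option. The sketch below assumes the `-r` (recursive) and `-p` (create parent folders) options of `dx upload`. `save_results` and the `/results/` folder are hypothetical names used only for illustration.

```shell
# save_results is a hypothetical helper: upload a local directory recursively
# to a folder in the workstation's parent project.
save_results() {
  results_dir="$1"
  dest_folder="${2:-/}"   # destination folder in the parent project, ending in "/"
  if [ ! -d "$results_dir" ]; then
    echo "No such directory: $results_dir" >&2
    return 1
  fi
  dx upload -r -p "$results_dir" --path "$DX_PROJECT_CONTEXT_ID:$dest_folder"
}

# Example:
# save_results ./output /results/
```

Naming the parent project explicitly through `$DX_PROJECT_CONTEXT_ID` means the upload goes to the right place even if you switched project context earlier in the session.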

## Terminating the Session

By default, your virtual workstation automatically shuts down once the maximum session length is reached. If you want to shut it down earlier, use the [`dx terminate`](https://documentation.dnanexus.com/user/helpstrings-of-sdk-command-line-utilities#terminate) command with the workstation's job ID, which is available inside the session as `$DX_JOB_ID`:

```shell
dx terminate $DX_JOB_ID
```

You can also shut down the workstation by finding the job in your project's **Monitor** tab and clicking **Terminate** at the right end of the job's row.

{% hint style="info" %}
The contents of your virtual workstation are destroyed on termination. Before the end of your session, be sure to upload any files you want to save.
{% endhint %}
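Because termination destroys the workstation's contents, it can help to make termination conditional on a successful upload. The following sketch shows one way to do that. `save_and_terminate` is a hypothetical helper, and the use of `--wait` (to block until each uploaded file has finished closing) is an assumption about what you want, not a requirement of the app.

```shell
# save_and_terminate is a hypothetical helper: upload each named file to the
# parent project, and terminate the workstation only if every upload succeeds.
save_and_terminate() {
  for f in "$@"; do
    dx upload --wait --path "$DX_PROJECT_CONTEXT_ID:" "$f" || {
      echo "Upload of $f failed; leaving the workstation running." >&2
      return 1
    }
  done
  dx terminate "$DX_JOB_ID"
}

# Example:
# save_and_terminate results.tsv report.html
```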

## Execution Environment

### Instance Type

By default, your virtual workstation launches on a `mem1_ssd1_v2_x8` instance type, which has 8 cores, 16 GB of memory, and 180 GB of storage. To run the app on a different instance type, use the `--instance-type` flag with `dx run`, as in:

```shell
dx run \
  --instance-type mem1_ssd1_v2_x36 \
  --ssh app-cloud_workstation
```

See the [Instance Types page](https://documentation.dnanexus.com/developer/api/running-analyses/instance-types) for a full list of available instance types.

### Operating System

The Cloud Workstation app is set up to use Ubuntu 24.04.

### Job Execution Environment vs. Local Environment

When connecting to the execution environment, you are using the job's credentials to interact with the DNAnexus API. The job has a limited subset of your Platform user permissions. By default, a job running the Cloud Workstation app has VIEW permissions to all projects to which you have VIEW or greater permissions.

By default, the [`dx select`](https://documentation.dnanexus.com/user/helpstrings-of-sdk-command-line-utilities#select) command hides projects to which you have only VIEW permissions. To see a list of those projects, use the command `dx select --level=VIEW`.

## Customizing Cloud Workstation

The Cloud Workstation app provides the minimum functionality necessary to support an interactive workstation.

To make your own version with enhanced functionality, you can create a custom applet based on the Cloud Workstation app. To get the original source code for the Cloud Workstation app, run `dx get app-cloud_workstation`. See [Introduction to Building Apps](https://documentation.dnanexus.com/developer/apps/intro-to-building-apps) to learn how to build a custom applet that incorporates an existing executable.

Possible customizations include:

* Specifying different inputs
* Prepackaging external utilities for use within the worker
* Changing the instance type of the worker
* Changing access permissions
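The overall workflow for a custom applet might look like the sketch below. It assumes that `dx get` accepts `-o` to choose the local output directory and that `dx build` builds an applet from that directory into the current project. The helper names and the `my_cloud_workstation` directory are hypothetical.

```shell
# Hypothetical helpers sketching the customize-and-rebuild workflow.

# Step 1: fetch the app's source into a local directory.
get_workstation_source() {
  src_dir="${1:-my_cloud_workstation}"
  dx get -o "$src_dir" app-cloud_workstation
}

# Step 2 (manual): edit the files in the source directory, for example
# dxapp.json, to change inputs, the instance type, or access permissions.

# Step 3: build the edited source as an applet in the current project.
build_workstation_applet() {
  src_dir="${1:-my_cloud_workstation}"
  dx build "$src_dir"
}

# Example:
# get_workstation_source my_cloud_workstation
# build_workstation_applet my_cloud_workstation
```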
