# Connecting to Jobs

{% hint style="info" %}
With `dx` v0.319.2 or later, `dx ssh` can access a job started without the `--allow-ssh` flag by adding the client IP address to the job's list of allowed IP addresses. With `dx run --allow-ssh` / `--ssh`, SSH access is also more secure: by default, only connections from your client IP address are allowed. To use these features, upgrade to `dx` v0.319.2 or later with `pip3 install --upgrade dxpy`.
{% endhint %}

## Connecting to Jobs via SSH

Jobs running on the DNAnexus Platform can be optionally configured to allow SSH connections to the DNAnexus worker executing the job. This can be used to monitor the job underway, to troubleshoot a failed job, or to use the worker as a workstation in the cloud.

By default, jobs only have network access to the DNAnexus API, not the Internet. Outbound access can be configured using [network access permissions](https://documentation.dnanexus.com/developer/apps/app-permissions). Inbound access is restricted to SSH connectivity only, and must be enabled explicitly by the user at run time.

### Logging and Reproducibility Guarantees

The DNAnexus Platform is designed to support reproducible analyses. When you run an app with the same inputs, you can expect to get the same outputs each time. However, enabling features like network access (other than to the DNAnexus API) or connecting to a job via SSH can affect reproducibility. Once you connect to a job, the platform can no longer guarantee that results are exactly the same if you rerun the analysis.

### Setting Up Your Environment for SSH Access

One-time setup of the user's account is required to allow use of SSH connections. Open a command shell and use `dx ssh_config` to perform this setup.

* If you have not previously set up a public SSH key on the machine you are using, use option 0 ("Generate a new SSH key pair using ssh-keygen").
* If you have previously set up a public SSH key on the machine you are using, select one of the listed SSH key pairs, or option 0 if you want to generate a new key pair.
* If you have previously set up a public SSH key on *another* computer, you need to copy the keys into your current environment. The keys are found (by default) in `~/.dnanexus_config` as `ssh_id` and `ssh_id.pub`.

If you already have a key and you generate a new one, the old one is overwritten. You cannot SSH into jobs from any computer that has an older key.
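
Because generating a new key overwrites the old one, it can help to check whether a key pair is already present before running `dx ssh_config`. The following is a minimal sketch, assuming the default key location named above; `dx_ssh_keys_present` is a hypothetical helper name and is not part of dx-toolkit.

```shell
# Hypothetical helper (not part of dx-toolkit): succeeds only when both key
# files exist in the given directory. Defaults to the location named above.
dx_ssh_keys_present() {
  key_dir="${1:-$HOME/.dnanexus_config}"
  [ -f "$key_dir/ssh_id" ] && [ -f "$key_dir/ssh_id.pub" ]
}

# Example use: only start the interactive setup when no key pair is found.
# dx_ssh_keys_present || dx ssh_config
```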

### Connecting to Jobs via SSH

Once you have set up your SSH key pair, you can connect to individual jobs you launch on DNAnexus.

SSH connections to jobs can be established by running `dx ssh job-xxxx`.

Alternatively, you can launch a job and immediately SSH into that job by running: `dx run executable --ssh`.

SSH is also useful for debugging failing jobs. With `dx run --debug-on`, you specify the types of failure for which a job should be held for debugging. If the job encounters such an error, the worker is not terminated as usual but remains available for 72 hours, giving you time to SSH into the worker and debug the underlying issue. See the [Debug Hold](#debug-hold) section below.
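
The entry points described in this section can be wrapped in small shell functions. This is only a sketch: the function names are hypothetical, and `All` is used as the `--debug-on` value because it appears in the Debug Hold section of this page.

```shell
# Hypothetical wrappers around the dx commands described in this section.

# Connect to a job that is already running. "$1" must be a job ID (job-xxxx).
connect_to_job() {
  case "$1" in
    job-*) dx ssh "$1" ;;
    *) echo "usage: connect_to_job job-xxxx" >&2; return 2 ;;
  esac
}

# Launch an executable and connect to the resulting job via SSH.
run_with_ssh() {
  dx run "$1" --ssh
}

# Launch an executable so that a failure holds the worker for debugging.
run_with_debug_hold() {
  dx run "$1" --debug-on All
}
```

For example, `connect_to_job job-xxxx` runs `dx ssh job-xxxx`, while passing anything that does not start with `job-` prints a usage message and returns status 2.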

Finally, a common use of SSH features on the platform is to launch a Cloud Workstation. For information about configuring a job to use a worker as a workstation in the cloud, see the [tutorial on Cloud Workstations](https://documentation.dnanexus.com/developer/cloud-workstation).

### Connecting to Spark Cluster Workers via SSH

Running `dx ssh job-xxxx` connects you to the driver node of a [Spark cluster](https://documentation.dnanexus.com/developer/apps/developing-spark-apps) job.

To connect to the worker nodes in a cluster job, first connect via SSH to the driver node, which ensures that your client IP is in the `allowSSH` list of the cluster job. You can then connect to the first cluster worker node by running the following from your local machine (not from the driver node).

```shell
# Connect directly from your local machine to the first Spark worker node in the cluster
# (i.e., clusterSlaves[0]). Replace "job-xxxx" with your actual job ID.
# The jq filter is quoted so the shell does not treat "[0]" as a glob pattern.
ssh -i ~/.dnanexus_config/ssh_id dnanexus@$(dx describe --json --verbose job-xxxx \
   | jq -r '.clusterSlaves[0].host')
```
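
To see which worker nodes a cluster job has before picking one, the same `dx describe` output can be listed in full. This is a sketch under the assumption that every entry of `clusterSlaves` carries a `host` field like the first one does; `list_cluster_worker_hosts` is a hypothetical helper name.

```shell
# Hypothetical helper: print the host of every worker node in a cluster job,
# one per line. Requires jq, like the command above.
# Assumption: each clusterSlaves entry has a "host" field, as entry 0 does.
list_cluster_worker_hosts() {
  dx describe --json --verbose "$1" | jq -r '.clusterSlaves[].host'
}

# Example use (replace "job-xxxx" with your actual job ID):
# list_cluster_worker_hosts job-xxxx
```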

### Setting Up SSH on Windows

If you are using Windows, install [OpenSSH](https://learn.microsoft.com/en-us/windows-server/administration/openssh/openssh_install_firstuse), an open source connectivity tool for remote login with the SSH protocol.

Once **OpenSSH** is installed, start dx-toolkit's CLI (`Start menu` | `DNAnexus CLI (folder)` | `DNAnexus CLI (shortcut)`), then log in to the platform with `dx login`, or with `dx login --token <token>` if you previously created a token in your profile settings in the web interface. After logging in, run `dx ssh_config` and choose `0` to generate a new SSH key pair. This creates the private key file `ssh_id` and the public key file `ssh_id.pub` in the `.dnanexus_config` folder within your home directory.

You should then be able to follow the instructions above.

### Using PuTTY

If you want to use [**PuTTY**](https://www.putty.org/) as your SSH terminal, first generate your public/private key pair as described above. Then do the following:

1. Use PuTTYgen to import your private key. You can do this by:
   1. Clicking on "Load" in PuTTYgen, changing the file type from "PuTTY Private Key Files" to "All Files", and selecting the `ssh_id` file found in the `.dnanexus_config` folder inside of your Windows User folder.
   2. Then click on "Save private key" and save as `ssh_id.ppk`.
2. Make sure the `allowSSH` field of your job includes your client IP address, as seen by the SSH daemon on the worker.
   1. If you are not using an SSH proxy, you can find your client IP by running `dx api system whoami '{"fields":{"clientIp":true}}' | jq .clientIp`
   2. If you are using an SSH proxy, the client IP address as seen by the SSH daemon on the worker is the external IP address of the SSH proxy.
   3. If the output of the command:\
      `dx describe job-xxxx --json |jq .allowSSH`\
      does not contain your client IP address, add it to the job's `allowSSH` field with\
      `dx api job-xxxx update '{"allowSSH":["<clientIP>"]}'`
3. To get the URL of the job you want to SSH into, run `dx describe job-xxxx` and look for the URL in the `host` field.
4. In PuTTY:
   1. Place the URL found above in the "Host Name" field.
   2. In Connection > Data, place `dnanexus` in the "Auto-login username" field.
   3. In Connection > SSH > Auth, click "Browse" to select your `ssh_id.ppk` file you generated in Step 1.
   4. In Window > Translation > Remote character set: change the setting from UTF-8 to ISO-8859-1: 1998 (Latin-1, West Europe).
   5. In Terminal, clear the "Auto wrap mode initially on" checkbox (the first setting).
   6. In Terminal > Keyboard > The Function keys and keypad, select the *Xterm R6* radio button.
   7. Save the session by entering a name of your choice in the Session > Saved Sessions field, then click the "Save" button on the right.
5. Click the `Open` button at the bottom of the screen to log in to your job.
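
The IP address check in step 2 can be scripted. The sketch below uses only the `dx` commands shown in that step and assumes that no SSH proxy is in use; the function names are hypothetical. It sends the update without testing for a match first, because entries in the job's SSH allow list are hostmasks and a plain string comparison could miss a match.

```shell
# Hypothetical helpers for step 2 above. They assume no SSH proxy is in use
# and that jq is installed.

# Print the client IP address as reported by the DNAnexus API.
my_client_ip() {
  dx api system whoami '{"fields":{"clientIp":true}}' | jq -r .clientIp
}

# Build the JSON payload for the job update from a single IP address.
allow_ssh_payload() {
  printf '{"allowSSH":["%s"]}\n' "$1"
}

# Show the job's current allowSSH value, then send the update from step 2.
allow_my_ip() {
  dx describe "$1" --json | jq .allowSSH
  dx api "$1" update "$(allow_ssh_payload "$(my_client_ip)")"
}
```

Running `allow_my_ip job-xxxx` prints the current list first, so you can see what was set before the update.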

#### Setting Up SSH Access via Proxy

If you want to use a proxy for the SSH connection, run `dx ssh --ssh-proxy <proxy_address>:<proxy_port>` or `dx run --ssh --ssh-proxy <proxy_address>:<proxy_port>`. For `dx run` with SSH access, you must specify both `--ssh` and `--ssh-proxy` at runtime. If you do not specify the proxy with `dx run --ssh`, you must exit the current SSH session and start a new one using `dx ssh --ssh-proxy <proxy_address>:<proxy_port>` to access the job via the proxy.
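
As an illustration, the proxy form of `dx ssh` can be wrapped so that the address and port are checked before the command runs. `ssh_via_proxy` is a hypothetical name, and the numeric port check is an assumption added for convenience rather than a documented requirement.

```shell
# Hypothetical wrapper: connect to a job through an SSH proxy.
# Usage: ssh_via_proxy job-xxxx <proxy_address> <proxy_port>
ssh_via_proxy() {
  if [ "$#" -ne 3 ]; then
    echo "usage: ssh_via_proxy job-xxxx <proxy_address> <proxy_port>" >&2
    return 2
  fi
  # Assumed sanity check: reject an empty or non-numeric port.
  case "$3" in
    ''|*[!0-9]*) echo "error: proxy port must be numeric" >&2; return 2 ;;
  esac
  dx ssh --ssh-proxy "$2:$3" "$1"
}
```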

### API Details

The following details describe the API functionality used by the high-level `dx` commands described above.

SSH connection authentication is supported using SSH keys only. To configure a user's account to allow SSH connections to jobs, the `sshPublicKey` field is passed to [`/user-xxxx/update`](https://documentation.dnanexus.com/api/users#api-method-user-xxxx-update). This field is returned in [`/user-xxxx/describe`](https://documentation.dnanexus.com/api/users#api-method-user-xxxx-describe) only when called by that user. Other users cannot see the user's SSH public key.

To configure a job to allow SSH connections, the `allowSSH` field is passed to the [`/app-xxxx/run`](https://documentation.dnanexus.com/api/running-analyses/apps#api-method-app-xxxx-yyyy-run), [`/applet-xxxx/run`](https://documentation.dnanexus.com/api/running-analyses/applets-and-entry-points#api-method-applet-xxxx-run), [`/workflow-xxxx/run`](https://documentation.dnanexus.com/api/running-analyses/workflows-and-analyses#api-method-workflow-xxxx-run), or [`/job-xxxx/update`](https://documentation.dnanexus.com/api/running-analyses/applets-and-entry-points#api-method-job-xxxx-update) method. A user can connect to a job via SSH only when *all* of the following conditions are satisfied:

* The job was launched by that user.
* The job's `allowSSH` field is set and includes a hostmask that matches the IP address from which the connection is being made.
* The SSH client uses a private key that matches the public key stored in their user account.
* The job was started by running an applet, an open-source app, or a non-open-source app of which the user is listed as a developer.

{% hint style="info" %}
SSH connections to non-open source apps are *not* supported unless the user is listed as a developer of the app.
{% endhint %}

Once connected, the terminal uses a [`byobu`](https://www.byobu.org/) or a [`tmux`](https://tmux.github.io/) terminal window manager, allowing the terminal to persist even if you get disconnected. Further information on how to use the terminal is presented in a banner when you log in.

If a job is terminated while you are logged in via SSH, your connection is terminated as well.

## Debug Hold

Jobs can be optionally configured to hold the execution environment for debugging when certain types of errors happen (debug hold). When jobs are held for debugging, the user can log in to their execution environment using `dx ssh` (if it has been configured via `dx ssh_config`).

When entering debug hold, the job transitions into the `debug_hold` state, and any extra information about the failure reason is set in its `debug.failureReason` and `debug.failureMessage` fields. Jobs in this state are held for up to 3 days (72 hours) and are terminated by the system afterwards. The user can also terminate a job in debug hold, either with `dx terminate` or by terminating the process tree in the job's execution environment. In all cases, the job can only transition into the `terminated` or `failed` state.
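
A small check of the job state before connecting might look like the following. It is a sketch only: the helper names are hypothetical, and it assumes that `jq` is installed and that the job's state is available as the `state` field of the `dx describe --json` output.

```shell
# Hypothetical helper: print a job's state (for example, debug_hold).
# Assumption: the state is the "state" field of the describe output.
job_state() {
  dx describe "$1" --json | jq -r .state
}

# Connect only when the job is currently held for debugging.
ssh_if_held() {
  if [ "$(job_state "$1")" = "debug_hold" ]; then
    dx ssh "$1"
  else
    echo "$1 is not in debug_hold" >&2
    return 1
  fi
}

# When you have finished debugging, release the worker:
# dx terminate job-xxxx
```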

When a job's `debug.debugOn` field contains one of the eligible failure reasons (`AppError`, `AppInternalError`, `AppInsufficientResourceError`, or `ExecutionError`, typically specified via `dx run --debug-on All`), the job can enter debug hold in either of two ways:

1. The job quits with a non-zero exit status. In this case, the job's original process tree is replaced with a sentinel process.
2. The job creates the file named `/.dx-hold`. This causes the execution environment to transition into debug hold without altering the job's process tree, so any running processes remain intact.
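
The second route can be sketched as a wrapper for use inside a job's own script. `run_or_hold` is a hypothetical name, and `DX_HOLD_FILE` is an assumed override that exists only so the function can be tried outside a worker; on a worker the default path `/.dx-hold` applies. The job still has to be launched with a suitable `--debug-on` value for the hold to take effect.

```shell
# Hypothetical sketch for use inside a job's script: run a step and, if it
# fails, create the hold marker so the environment can enter debug hold
# while running processes stay intact.
# DX_HOLD_FILE is an assumed override for testing; the default is /.dx-hold.
run_or_hold() {
  "$@" && return 0
  step_status=$?
  echo "step failed with status $step_status; requesting debug hold" >&2
  touch "${DX_HOLD_FILE:-/.dx-hold}"
  return "$step_status"
}

# Example use inside an app script (my_analysis_step is a placeholder):
# run_or_hold my_analysis_step --input data.txt
```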

