Connecting to Jobs

With dx v0.319.2 or later, dx ssh can connect to a job that was started without the --allow-ssh flag by adding your client IP address to the job's list of allowed IP addresses. SSH access enabled with dx run --allow-ssh or --ssh is also more secure: by default, only connections from your client IP address are allowed. To use these features, upgrade to dx v0.319.2 or later with pip3 install --upgrade dxpy.
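As a minimal sketch, the version requirement above can be checked before upgrading. The helper names version_ge and dx_version are hypothetical, and the sketch assumes that dx --version prints a line of the form "dx vX.Y.Z".

```shell
# Succeed (exit 0) if dotted version $1 is greater than or equal to $2.
# Fields are compared numerically, so 0.1000.0 sorts after 0.319.2.
version_ge() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        n = split(a, x, "."); m = split(b, y, ".")
        len = (n > m) ? n : m
        for (i = 1; i <= len; i++) {
            xi = (i <= n) ? x[i] + 0 : 0
            yi = (i <= m) ? y[i] + 0 : 0
            if (xi > yi) exit 0
            if (xi < yi) exit 1
        }
        exit 0
    }'
}

# Print the installed dx version number.
# Assumption: `dx --version` prints "dx vX.Y.Z".
dx_version() {
    dx --version | sed 's/^dx v//'
}

# Usage (run manually):
#   version_ge "$(dx_version)" 0.319.2 || pip3 install --upgrade dxpy
```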

Connecting to Jobs via SSH

Jobs running on the DNAnexus Platform can be optionally configured to allow SSH connections to the DNAnexus worker executing the job. This can be used to monitor the job underway, to troubleshoot a failed job, or to use the worker as a workstation in the cloud.

By default, jobs only have network access to the DNAnexus API, not the Internet. Outbound access can be configured using network access permissions. Inbound access is restricted to SSH connectivity only, and must be enabled explicitly by the user at run time.

Logging and Reproducibility Guarantees

The DNAnexus Platform is designed to support reproducible analyses. When you run an app with the same inputs, you can generally expect to get the same outputs each time. However, enabling features like network access (other than to the DNAnexus API) or connecting to a job via SSH can affect reproducibility. Once you connect to a job, the platform can no longer guarantee that results are exactly the same if you rerun the analysis.

Setting Up Your Environment for SSH Access

A one-time setup of your account is required before you can use SSH connections. Open a command shell and run dx ssh_config to perform this setup.

If you have not previously set up a public SSH key on the machine you're using, use option 0 ("Generate a new SSH key pair using ssh-keygen").
If you have previously set up a public SSH key on the machine you're using, select one of the SSH key pairs listed, or option 0 if you wish to generate a new key pair.
If you have previously set up a public SSH key on another computer, you need to copy the keys into your current environment. The keys are found (by default) in ~/.dnanexus_config as ssh_id and ssh_id.pub.

If you already have a key and generate a new one, the old key is overwritten, and computers that still hold the old key can no longer SSH into jobs.
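Copying an existing key pair from another computer can be sketched as follows. The function name install_dx_ssh_keys and its arguments are hypothetical. It assumes the pair has already been transferred to a local directory (for example with scp) and uses the default location ~/.dnanexus_config named above.

```shell
# Copy a DNAnexus SSH key pair (ssh_id and ssh_id.pub) from SRC_DIR into the
# dx configuration directory. DEST_DIR defaults to ~/.dnanexus_config.
install_dx_ssh_keys() {
    src_dir=$1
    dest_dir=${2:-"$HOME/.dnanexus_config"}

    # Refuse to continue unless both halves of the key pair are present.
    for f in ssh_id ssh_id.pub; do
        if [ ! -f "$src_dir/$f" ]; then
            echo "missing $src_dir/$f" >&2
            return 1
        fi
    done

    mkdir -p "$dest_dir" &&
    cp "$src_dir/ssh_id" "$src_dir/ssh_id.pub" "$dest_dir/" &&
    # OpenSSH refuses private keys that other users can read.
    chmod 600 "$dest_dir/ssh_id"
}

# Usage (run manually):
#   install_dx_ssh_keys /path/to/copied/keys
```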

Connecting to Jobs via SSH

Once you have set up your SSH key pair, you can connect to individual jobs you launch on DNAnexus.

SSH connections to jobs can be established by running dx ssh job-xxxx.

Alternatively, you can launch a job and SSH into it immediately by running dx run executable --ssh.
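The two approaches above can be combined in a small wrapper that keeps the job ID for later use. This is a sketch only: run_and_ssh is a hypothetical helper, and it assumes that dx run with --brief prints just the job ID and that -y skips the confirmation prompt.

```shell
# Launch an executable with SSH enabled, report the job ID, then connect.
# $1 is an app or applet name or ID; any further arguments go to dx run.
run_and_ssh() {
    executable=$1
    shift
    # Assumption: --brief prints only the job ID, -y skips the prompt.
    job_id=$(dx run "$executable" --allow-ssh --brief -y "$@") || return 1
    echo "Launched $job_id" >&2
    dx ssh "$job_id"
}

# Usage (run manually):
#   run_and_ssh applet-xxxx
```

When the job ID is not needed afterwards, dx run executable --ssh does the same in one step.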

SSH is also useful for debugging failing jobs. With dx run --debug-on, you can specify the error conditions under which a failed job is held for debugging. If the job encounters one of those errors, the worker is not terminated as usual but remains available for 72 hours, giving you time to SSH into the worker and debug the underlying issue. See the Debug Hold section below.

Finally, you can use SSH to run a Cloud Workstation. For more information about configuring a job to use a worker as a workstation in the cloud, see our tutorial on Cloud Workstations.

Connecting to Spark Cluster Workers via SSH

Running dx ssh job-xxxx connects you to the driver node of a Spark cluster job.

To connect to the worker nodes in a cluster job, first connect via SSH to the driver node, which ensures your client IP is in the allowSSH list of the cluster job. You can then connect to the first cluster worker node by running the following from your local machine (not the driver node).

# Connect directly from your local machine to the first Spark worker node in the cluster
# (i.e., clusterSlaves[0]). Replace "job-xxxx" with your actual job ID.
# The jq filter is quoted so the shell does not treat [0] as a glob pattern.
ssh -i ~/.dnanexus_config/ssh_id dnanexus@"$(dx describe --json --verbose job-xxxx \
   | jq -rc '.clusterSlaves[0].host')"
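The same lookup can be generalized to other worker nodes. The sketch below uses a hypothetical helper, spark_worker_host, and assumes that every entry in clusterSlaves has a host field like the first one. If the index is out of range, jq prints "null".

```shell
# Print the host of the Nth Spark worker node (0-based) of a cluster job.
# $1 is the job ID; $2 is the worker index and defaults to 0.
spark_worker_host() {
    dx describe --json --verbose "$1" \
        | jq -r --argjson i "${2:-0}" '.clusterSlaves[$i].host'
}

# Usage (run manually from your local machine):
#   ssh -i ~/.dnanexus_config/ssh_id dnanexus@"$(spark_worker_host job-xxxx 1)"
```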

Setting Up SSH on Windows

If you are using Windows, we highly recommend installing OpenSSH, an open source connectivity tool for remote login with the SSH protocol.

Once OpenSSH is installed, start the dx-toolkit CLI (Start menu | DNAnexus CLI (folder) | DNAnexus CLI (shortcut)). Log into the platform with dx login, or with dx login --token <token> if you have already created a token from your profile settings in the web interface. After logging in, run dx ssh_config and choose 0 to generate a new SSH key pair. This creates the private key file ssh_id and the public key file ssh_id.pub in the .dnanexus_config folder within your home directory.

You should then be able to follow the instructions above.

Using PuTTY

If you would like to use PuTTY as your SSH terminal, first generate your public/private key pair as described above. Then do the following:

1. Use PuTTYgen to import your private key:
   a. Click "Load" in PuTTYgen, change the file type from "PuTTY Private Key Files" to "All Files", and select the ssh_id file found in the .dnanexus_config folder inside your Windows user folder.
   b. Click "Save private key" and save the key as ssh_id.ppk.
2. Make sure the allowSSH field of your job includes your client IP address, as seen by the SSH daemon on the worker:
   a. If you are not using an SSH proxy, you can find your client IP by running dx api system whoami '{"fields":{"clientIp":true}}' | jq .clientIp
   b. If you are using an SSH proxy, the client IP address as seen by the SSH daemon on the worker is the external IP address of the SSH proxy.
   c. If the output of dx describe job-xxxx --json | jq .allowSSH does not contain your client IP address, add it to the job's allowSSH field with dx api job-xxxx update '{"allowSSH":["<clientIP>"]}'
3. To get the host name of the job you want to SSH into, run dx describe job-xxxx and look for the value of the host field.
4. In PuTTY:
   a. Enter the host name found above in the "Host Name" field.
   b. In Connection > Data, enter dnanexus in the "Auto-login username" field.
   c. In Connection > SSH > Auth, click "Browse" and select the ssh_id.ppk file you generated in step 1.
   d. In Window > Translation > Remote character set, change the setting from UTF-8 to ISO-8859-1: 1998 (Latin-1, West Europe).
   e. In Terminal, uncheck the first checkbox, "Auto wrap mode initially on".
   f. In Terminal > Keyboard > The Function keys and keypad, select the "Xterm R6" radio button.
   g. Save the session by entering a name of your choice in the Session > Saved Sessions field, then clicking the "Save" button on the right.
5. Click the "Open" button at the bottom of the screen to log in to your job.
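The client IP check in the steps above can be scripted with the same dx and jq commands. This is a rough sketch with hypothetical function names. It compares entries by exact string match, so it does not recognize an allowSSH entry that is a broader hostmask covering your address. The update call mirrors the command shown above, and whether it replaces or extends the existing list is not stated here.

```shell
# Print this machine's client IP address as reported by the DNAnexus API.
dx_client_ip() {
    dx api system whoami '{"fields":{"clientIp":true}}' | jq -r .clientIp
}

# Succeed if the job's allowSSH list contains the given entry verbatim.
# Assumption: exact string comparison only; hostmasks are not expanded.
job_allows_ip() {
    dx describe "$1" --json \
        | jq -e --arg ip "$2" '(.allowSSH // []) | any(.[]; . == $ip)' >/dev/null
}

# Add the current client IP to the job's allowSSH field if it is not listed.
allow_my_ip() {
    job=$1
    ip=$(dx_client_ip) || return 1
    if job_allows_ip "$job" "$ip"; then
        echo "$ip is already listed in allowSSH for $job"
    else
        # Build the JSON payload with jq so the value is quoted correctly.
        payload=$(jq -nc --arg ip "$ip" '{allowSSH: [$ip]}')
        dx api "$job" update "$payload"
    fi
}

# Usage (run manually):
#   allow_my_ip job-xxxx
```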

Setting Up SSH Access via Proxy

To use a proxy for the SSH connection, run dx ssh --ssh-proxy <proxy_address>:<proxy_port> or dx run --ssh --ssh-proxy <proxy_address>:<proxy_port>. For dx run with SSH access, you must specify both --ssh and --ssh-proxy at run time. If you ran dx run --ssh without specifying the proxy, exit the current SSH session and start a new one with dx ssh --ssh-proxy <proxy_address>:<proxy_port> to access the job via the proxy.
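As an illustration, a thin wrapper can check the <proxy_address>:<proxy_port> form before connecting. The function name dx_ssh_via_proxy and the validation rules are assumptions made for this sketch, not behavior of dx itself.

```shell
# Connect to a job through an SSH proxy after a basic format check.
# $1 is the job ID; $2 is the proxy in the form <proxy_address>:<proxy_port>.
dx_ssh_via_proxy() {
    job=$1
    proxy=$2
    # Take everything after the last colon as the port.
    port=${proxy##*:}
    case "$proxy" in
        ?*:?*) ;;
        *) echo "expected <proxy_address>:<proxy_port>, got '$proxy'" >&2; return 2 ;;
    esac
    case "$port" in
        *[!0-9]*) echo "proxy port must be numeric, got '$port'" >&2; return 2 ;;
    esac
    dx ssh --ssh-proxy "$proxy" "$job"
}

# Usage (run manually; the proxy address is a placeholder):
#   dx_ssh_via_proxy job-xxxx proxy.example.com:8080
```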

API Details

The following details describe the API functionality used by the high-level dx commands described above.

SSH connection authentication is supported using SSH keys only. To configure a user's account to allow SSH connections to jobs, the sshPublicKey field is passed to /user-xxxx/update. This field is returned in /user-xxxx/describe only when called by that user. Other users cannot see the user's SSH public key.
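For illustration, the /user-xxxx/update call described above can be issued through dx api. This is a sketch under several assumptions: set_dx_ssh_public_key is a hypothetical helper, dx whoami --id is assumed to print your user ID, and the public key is read from the default location used by dx ssh_config. In normal use, dx ssh_config is the supported way to perform this setup.

```shell
# Register a public key on the calling user's account via /user-xxxx/update.
# $1 is the public key file; it defaults to ~/.dnanexus_config/ssh_id.pub.
set_dx_ssh_public_key() {
    pub_key_file=${1:-"$HOME/.dnanexus_config/ssh_id.pub"}
    if [ ! -r "$pub_key_file" ]; then
        echo "cannot read $pub_key_file" >&2
        return 1
    fi
    # Assumption: `dx whoami --id` prints the user ID (user-xxxx).
    user_id=$(dx whoami --id) || return 1
    # Build the JSON payload with jq so the key text is escaped correctly.
    payload=$(jq -nc --arg key "$(cat "$pub_key_file")" '{sshPublicKey: $key}')
    dx api "$user_id" update "$payload"
}

# Usage (run manually):
#   set_dx_ssh_public_key
```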

To configure a job to allow SSH connections, the allowSSH field is passed to the /app-xxxx/run, /applet-xxxx/run, /workflow-xxxx/run, or /job-xxxx/update method. A user can connect to a job via SSH only when all of the following conditions are satisfied:

The job was launched by that user.
The job's allowSSH field is set and includes a hostmask that matches the IP address from which the connection is being made.
The SSH client uses a private key that matches the public key stored in their user account.
The job was started by running an applet, an open-source app, or a non-open-source app of which the user is listed as a developer.

SSH connections to non-open source apps are not supported unless the user is listed as a developer of the app.

Once connected, your session runs inside a byobu or tmux terminal window manager, so the terminal persists even if you are disconnected. Further information on how to use the terminal is presented in a banner when you log in.

If a job is terminated while you are logged in via SSH, your connection is terminated as well.

Debug Hold

Jobs can optionally be configured to hold the execution environment for debugging when certain types of errors occur (debug hold). When a job is held for debugging, you can log in to its execution environment using dx ssh, as long as SSH access has been configured via dx ssh_config.

When entering debug hold, the job transitions into the debug_hold state, and any extra information about the failure reason is set in its debug.failureReason and debug.failureMessage fields. Jobs in this state are held for up to 3 days (72 hours) and are terminated by the system afterwards. A job in debug hold can also be terminated by the user, either with dx terminate or by terminating the process tree in the job's execution environment. In all cases, the job can only transition into the terminated or failed state.

When a job's debug.debugOn field contains one of the eligible failure reasons (AppError, AppInternalError, or ExecutionError, typically specified via dx run --debug-on All), the job can enter debug hold in either of the following two ways:

The job can quit with a non-zero exit status. In this case the job's original process tree is replaced with a sentinel process.
The job can create the file named /.dx-hold. This induces the execution environment to transition into debug hold without altering the job's process tree, allowing any running processes to remain intact.
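The second route can be sketched from inside a job's own shell script. The helper hold_for_debug and the DX_HOLD_FILE variable are hypothetical. The variable exists only so that the path can be overridden when trying the function outside a worker, and it defaults to the /.dx-hold file named above.

```shell
# Path of the hold file. /.dx-hold is the name the platform looks for;
# DX_HOLD_FILE is a hypothetical override for trying this out locally.
DX_HOLD_FILE=${DX_HOLD_FILE:-/.dx-hold}

# Request a debug hold by creating the hold file, leaving running
# processes untouched. Arguments are logged as the reason.
hold_for_debug() {
    echo "Requesting debug hold: $*" >&2
    : > "$DX_HOLD_FILE"
}

# Usage inside an app's script (the job must have been launched with
# dx run --debug-on All for the hold to take effect):
#   some_step || hold_for_debug "some_step failed"
```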
