Connecting to Jobs
Note: As of dx v0.319.2, dx ssh can access a job started without the --allow-ssh flag by adding the client IP address to the list of IP addresses that are allowed to connect to the job. When using dx run --allow-ssh or dx run --ssh, SSH access is now more secure: by default, only connections from your client IP address are allowed. To use these features, upgrade to dx v0.319.2 or later with pip3 install --upgrade dxpy or by following the DNAnexus docs.
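As a quick way to check whether an installed client meets this minimum version, the sketch below compares a version string against 0.319.2. It assumes the string contains the version in the form shown in this note (for example "dx v0.319.2"); the function names are illustrative and are not part of dxpy.

```python
import re

MINIMUM_DX_VERSION = (0, 319, 2)  # version named in the note above


def parse_dx_version(text):
    """Extract (major, minor, patch) from text such as 'dx v0.319.2'.

    Assumption: the version appears as three dot-separated integers.
    """
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.groups())


def supports_client_ip_allow_list(version_text):
    """Return True when the version is 0.319.2 or later."""
    return parse_dx_version(version_text) >= MINIMUM_DX_VERSION
```

Comparing tuples of integers rather than raw strings avoids ordering mistakes such as treating "0.99.10" as newer than "0.319.2".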
Connecting to Jobs via SSH
Jobs running on the DNAnexus Platform can be optionally configured to allow SSH connections to the DNAnexus worker executing the job. This can be used to monitor the job underway, to troubleshoot a failed job, or to employ the worker as a workstation in the cloud.
By default, jobs are firewalled from the Internet, and only have network access to the DNAnexus API. Outbound access can be configured using network access permissions. Inbound access is restricted to SSH connectivity only, and must be enabled explicitly by the user at run time.
Logging and Reproducibility Guarantees
The DNAnexus Platform provides features that support high-level reproducibility of analyses: in general, an app given the same inputs produces the same outputs when run again. Features such as network access (other than to the DNAnexus API) are incompatible with this reproducibility support, so once you connect to a job, reproducibility can no longer be ensured.
Setting Up Your Environment for SSH Access
A one-time setup of your account is required before you can use SSH connections. Open a command shell and run dx ssh_config to perform this setup.
If you haven't set up a public SSH key on the machine you're using before, use option 0 ("Generate a new SSH key pair using ssh-keygen").
If you have previously set up a public SSH key on the machine you're using, select one of the SSH key pairs listed, or option 0 if you wish to generate a new key pair.
If you have previously set up a public SSH key on another computer, you'll have to copy the keys into your current environment. By default, the keys are stored in ~/.dnanexus_config as ssh_id and ssh_id.pub.
If you already have a key and you generate a new one, the old one will be overwritten. Note that you won't be able to SSH into jobs from any computer that has an older key.
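After copying keys from another computer, it can help to confirm that both files landed in the expected place. The following is a minimal sketch that only checks for the two file names mentioned above; the helper name is hypothetical, and it does not verify that the key matches the one stored in your account.

```python
from pathlib import Path


def missing_ssh_key_files(config_dir):
    """Return the names of the expected key files that are absent from config_dir."""
    expected = ("ssh_id", "ssh_id.pub")
    return [name for name in expected if not (Path(config_dir) / name).is_file()]
```

For the default location, call it as missing_ssh_key_files(Path.home() / ".dnanexus_config"); an empty list means both files are present.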
Connecting to Jobs via SSH
Once you have set up your SSH key pair, you can connect to individual jobs you launch on DNAnexus.
To establish an SSH connection to a job, run dx ssh job-xxxx.
Alternatively, you can launch a job and immediately SSH into it by running dx run executable --ssh.
SSH access is also useful for debugging failing jobs. The platform lets you specify the conditions under which a failed job should be held for debugging, using dx run --debug-on. If the job encounters an error, the worker will not be terminated as usual, but will remain open for 72 hours, giving you time to SSH into the worker and debug the underlying issue. See the Debug Hold section below.
Finally, you can use SSH access to launch a cloud workstation. For more information about configuring a job to employ a worker as a workstation in the cloud, please see our tutorial on cloud workstations.
Connecting to Spark Cluster Workers via SSH
Running dx ssh job-xxxx will connect you to the driver node of a Spark cluster job.
To connect to the worker nodes in a cluster job, first connect via SSH to the driver node, which ensures that your client IP is in the allowSSH list of the cluster job. You can then connect to the first cluster worker node by running the following from your local machine (not the driver node):
ssh -i ~/.dnanexus_config/ssh_id dnanexus@$(dx describe --json --verbose job-xxxx | jq -rc '.clusterSlaves[0].host')
where clusterSlaves[0] refers to the first worker node. The jq filter is quoted so that the shell does not interpret the square brackets.
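To reach a worker node other than the first, the same lookup can be done with a different index. The sketch below builds the ssh argument list from already-parsed describe output; the function name and the example hosts are hypothetical, and only the clusterSlaves and host fields shown in the command above are assumed to exist.

```python
import os


def cluster_worker_ssh_command(job_description, worker_index=0,
                               key_path="~/.dnanexus_config/ssh_id"):
    """Build the ssh argument list for one worker node of a cluster job.

    job_description is the parsed JSON printed by
    `dx describe --json --verbose job-xxxx`.
    """
    workers = job_description.get("clusterSlaves", [])
    if not 0 <= worker_index < len(workers):
        raise IndexError(f"job has {len(workers)} worker node(s); "
                         f"index {worker_index} is out of range")
    host = workers[worker_index]["host"]
    return ["ssh", "-i", os.path.expanduser(key_path), f"dnanexus@{host}"]


# Hypothetical describe output, trimmed to the fields used here.
example = {"clusterSlaves": [{"host": "worker-0.example.com"},
                             {"host": "worker-1.example.com"}]}
```

The returned list can be passed to subprocess.run from your local machine once you have connected to the driver node at least once.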
Setting Up SSH on Windows
If you are using Windows, we highly recommend installing OpenSSH, an open source connectivity tool for remote login with the SSH protocol.
Once OpenSSH is installed, start the DX CLI (Start menu | DNAnexus CLI (folder) | DNAnexus CLI (shortcut)), then run dx login, or dx login --token <token> if you previously created a token in your profile settings via the web interface, to log in to the platform. After logging in, run dx ssh_config and choose 0 to generate a new SSH key pair. This creates the private key file "ssh_id" and the public key file "ssh_id.pub" in the ".dnanexus_config" folder within your home folder.
You should then be able to follow the instructions above.
Using PuTTY
If you would like to use PuTTY as your SSH terminal, first generate your public/private key pair as described above. Then do the following:
1. Use PuTTYgen to import your private key: click "Load" in PuTTYgen, change the file type from "PuTTY Private Key Files" to "All Files", and select the ssh_id file found in the .dnanexus_config folder inside your Windows user folder. Then click "Save private key" and save the key as "ssh_id.ppk".
2. Make sure the allowSSH field of your job includes your client IP address, as seen by the SSH daemon on the worker. If you are not using an SSH proxy, you can find your client IP by running
   dx api system whoami '{"fields":{"clientIp":true}}' | jq .clientIp
   If you are using an SSH proxy, the client IP address as seen by the SSH daemon on the worker is the external IP address of the SSH proxy. If the output of the command
   dx describe job-xxxx --json | jq .allowSSH
   does not contain your client IP address, add it to the job's allowSSH field with
   dx api job-xxxx update '{"allowSSH":["<clientIP>"]}'
3. Get the URL of the job you would like to SSH into by running
   dx describe job-xxxx
   and noting the URL in the host field.
4. In PuTTY:
   a. Place the URL found above in the "Host Name" field.
   b. In Connection->Data, enter dnanexus in the "Auto-login username" field.
   c. In Connection->SSH->Auth, click "Browse" to select the "ssh_id.ppk" file you generated in Step 1.
   d. In Window->Translation->Remote character set, change the setting from UTF-8 to ISO-8859-1:1998 (Latin-1, West Europe).
   e. In Terminal, uncheck the first checkbox, "Auto wrap mode initially on".
   f. In Terminal->Keyboard->The Function keys and keypad, select the "Xterm R6" radio button.
   g. Save the session by entering a name of your choice in the Session->Saved Sessions field, then clicking the "Save" button on the right.
5. Click the Open button at the bottom of the screen to log in to your job.
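The allowSSH check in the steps above can be scripted. The following sketch decides whether the update call is needed and, if so, builds the same dx api arguments shown above. The function name is hypothetical, and it assumes entries are compared as exact strings, so it will not recognize a broader address range that already covers your IP.

```python
import json


def allow_ssh_update_command(job_id, allow_ssh, client_ip):
    """Return the dx api arguments needed to allow client_ip, or None.

    allow_ssh is the list printed by `dx describe job-xxxx --json | jq .allowSSH`.
    Assumption: entries are matched as exact strings only.
    """
    if client_ip in (allow_ssh or []):
        return None  # already allowed; nothing to do
    payload = json.dumps({"allowSSH": [client_ip]})
    return ["dx", "api", job_id, "update", payload]
```

A return value of None means the job already lists your address; otherwise the list can be passed to subprocess.run.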
Setting Up SSH Access via Proxy
To use a proxy for the SSH connection, run dx ssh --ssh-proxy <proxy_address>:<proxy_port> or dx run --ssh --ssh-proxy <proxy_address>:<proxy_port>. For dx run with SSH access, you must specify both --ssh and --ssh-proxy at run time. If you do not specify the proxy with dx run --ssh, you will have to exit the current SSH session and start a new one using dx ssh --ssh-proxy <proxy_address>:<proxy_port> to access the job via the proxy.
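The rule that dx run needs both flags is easy to get wrong in scripts. Here is a small sketch that assembles the two command lines described above; the helper names are hypothetical, and the only flags used are --ssh and --ssh-proxy as documented in this section.

```python
def ssh_proxy_args(proxy_address, proxy_port):
    """Return the --ssh-proxy flag with its <proxy_address>:<proxy_port> value."""
    port = int(proxy_port)
    if not 1 <= port <= 65535:
        raise ValueError(f"invalid port: {proxy_port!r}")
    return ["--ssh-proxy", f"{proxy_address}:{port}"]


def dx_run_via_proxy(executable, proxy_address, proxy_port):
    """Launch a job and connect through the proxy: needs --ssh and --ssh-proxy."""
    return ["dx", "run", executable, "--ssh"] + ssh_proxy_args(proxy_address, proxy_port)


def dx_ssh_via_proxy(job_id, proxy_address, proxy_port):
    """Connect to an existing job through the proxy."""
    return ["dx", "ssh", job_id] + ssh_proxy_args(proxy_address, proxy_port)
```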
API Details
The following details describe the API functionality used by the high-level dx commands described above.
SSH connection authentication is supported using SSH keys only. To configure a user's account to allow SSH connections to jobs, the sshPublicKey field is passed to /user-xxxx/update. This field is returned in /user-xxxx/describe only when called by that user; other users cannot see the user's SSH public key.
To configure a job to allow SSH connections, the allowSSH field is passed to the /app-xxxx/run, /applet-xxxx/run, /workflow-xxxx/run, or /job-xxxx/update method. A user can connect to a job via SSH only when all of the following conditions are satisfied:
The job was launched by that user.
The job's allowSSH field is set and includes a hostmask that matches the IP address from which the connection is being made.
The SSH client uses a private key that matches the public key stored in the user's account.
The job was started by running an applet, an open-source app, or a non-open-source app of which the user is listed as a developer.
Note that SSH connections to jobs running non-open-source apps are not supported unless the user is listed as a developer of the app in question.
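The conditions above combine as a simple conjunction, which the sketch below spells out. It is illustrative only: SshAttempt is a hypothetical summary object rather than an API type, and the hostmask check assumes each allowSSH entry is a single IP address or a CIDR block (entries in any other form are skipped).

```python
import ipaddress
from dataclasses import dataclass


def ip_matches_any(client_ip, hostmasks):
    """True if client_ip falls inside any entry of hostmasks.

    Assumption: entries are plain IP addresses or CIDR blocks.
    """
    address = ipaddress.ip_address(client_ip)
    for mask in hostmasks:
        try:
            if address in ipaddress.ip_network(mask, strict=False):
                return True
        except ValueError:
            continue  # not an address or CIDR block; ignore it
    return False


@dataclass
class SshAttempt:
    """Hypothetical summary of one connection attempt (not an API object)."""
    launched_by_user: bool
    allow_ssh: list
    client_ip: str
    key_matches_account: bool
    executable_is_applet: bool = False
    app_is_open_source: bool = False
    user_is_app_developer: bool = False


def can_connect(attempt):
    """Apply all four conditions; every one of them must hold."""
    executable_ok = (attempt.executable_is_applet
                     or attempt.app_is_open_source
                     or attempt.user_is_app_developer)
    return (attempt.launched_by_user
            and bool(attempt.allow_ssh)
            and ip_matches_any(attempt.client_ip, attempt.allow_ssh)
            and attempt.key_matches_account
            and executable_ok)
```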
Once connected, the terminal uses a byobu or tmux terminal window manager, allowing the session to persist even if you get disconnected. Further information on how to use the terminal is presented in a banner when you log in.
If a job is terminated while you are logged in via SSH, your connection will be terminated as well.
Debug Hold
Jobs can optionally be configured to hold the execution environment for debugging when certain types of errors happen (debug hold). When a job is held for debugging, the user can log in to its execution environment using dx ssh (as long as SSH access has been configured via dx ssh_config).
When a job enters debug hold, it transitions into the debug_hold state, and any extra information about the failure is recorded in its debug.failureReason and debug.failureMessage fields. Jobs in this state are held for up to 2 days and are then terminated by the system. The user can also terminate a job in debug hold, either through the API (e.g. via dx terminate) or by terminating the process tree in the job's execution environment. In all cases, the job can only transition into the terminated or failed state.
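A script that polls jobs can use these fields to detect a held job and report why it failed. The following is a minimal sketch over already-parsed describe output; it assumes the job state is reported under a top-level "state" key, and the function name and sample values are hypothetical.

```python
def debug_hold_failure(job_description):
    """Return (failureReason, failureMessage) if the job is in debug hold, else None.

    job_description is the parsed JSON describe output of a job.
    Assumption: the job state is stored under the "state" key.
    """
    if job_description.get("state") != "debug_hold":
        return None
    debug = job_description.get("debug", {})
    return debug.get("failureReason"), debug.get("failureMessage")


# Hypothetical describe output, trimmed to the fields used here.
held_job = {
    "state": "debug_hold",
    "debug": {"failureReason": "AppError", "failureMessage": "example message"},
}
```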
When a job's debug.debugOn field contains one of the eligible failure reasons (AppError, AppInternalError, or ExecutionError, typically specified via dx run --debug-on All), the job can enter debug hold in either of two ways:
The job quits with a non-zero exit status. In this case, the job's original process tree is replaced with a sentinel process.
The job creates a file named /.dx-hold. This causes the execution environment to transition into debug hold without altering the job's process tree, so any running processes remain intact.
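For the second route, code running inside the job only has to create the file. The sketch below shows one way to do that from Python; the function name is hypothetical, and the hold_file parameter exists only so that the function can be tried outside a worker.

```python
from pathlib import Path


def request_debug_hold(hold_file="/.dx-hold"):
    """Create the hold file so the execution environment enters debug hold.

    Intended to be called from code running inside the job. Per the text
    above, this only has an effect when the job's debug.debugOn field
    contains an eligible failure reason.
    """
    Path(hold_file).touch()
```

A job might call request_debug_hold() from an error handler when it wants its running processes left in place for inspection.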