Upload Agent
Introduction
The DNAnexus Upload Agent is a fast and convenient command-line client that can be used to upload files to DNAnexus. For uploading multiple or large files, Upload Agent is particularly recommended due to its ability to resume previously interrupted uploads.
Installing Upload Agent: Follow the instructions on the Upload Agent download page to download the Upload Agent executable.
For the rest of this document, ua represents the Upload Agent executable, but you should replace it with the path to where you have saved the Upload Agent executable on your local file system.
Basic Usage
Synopsis
./ua [options] [...]
Usage
The following examples assume that you have set your environment variables, specifically, the authentication token and current workspace (project).
You can always override these environment variables by using the --auth-token and --project command-line options.
To see the current environment variables being used by ua, run:
$ ./ua --env
API server protocol: https
API server host: api.dnanexus.com
API server port: ---
Auth token: <TOKEN>
Current Project: my_project (project-xxxx)
Running a Diagnostic Test
Running the Upload Agent with the --test flag runs a test to verify that ua is correctly configured. The output of a successful configuration looks similar to the output below. Upload Agent prints any errors as part of the output.
Uploading a Single File
You can upload a single file using the Upload Agent. The following example shows how to upload a local file named my_file.txt to the project called my_project.
By default, uncompressed files are automatically compressed during upload, so when you view the uploaded file it is named my_file.txt.gz.
Uploading Multiple Files to the Same Project
You can upload multiple files to the same project. In the following example, two local files, my_file_1.txt and my_file_2.txt, are uploaded to the project my_project. By default, uncompressed files are automatically compressed during upload.
File IDs are printed in the same order as the files on the command line. In the above example, the first and second lines correspond to the new file IDs generated by uploading my_file_1.txt and my_file_2.txt, respectively.
Uploading Directories
You can upload all the files in a given directory. By default, uncompressed files are automatically compressed during upload.
The destination of the files depends on the directory name given as input. If the name has a trailing /, Upload Agent doesn't create the directory; it copies the contents of the folder to the destination path on the platform.
Without a trailing /, a new remote directory is created and the files are uploaded to the new directory (dir_name).
You can upload at most 1000 files in a single operation.
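The trailing-slash rule and the file limit can be made explicit in a script. The sketch below is illustrative only. The UA variable and the helper names upload_dir_contents, upload_dir_as_folder, and count_files are hypothetical. It assumes that ua accepts the directory as its only argument, that the 1000-file limit applies to the files directly inside the directory, and that find supports -maxdepth (GNU and BSD find do).

```shell
UA="${UA:-./ua}"   # path to the Upload Agent executable (adjust as needed)

# Upload the files inside DIR into the destination path itself:
# the trailing / tells ua not to create a remote directory.
upload_dir_contents() {
    "$UA" "${1%/}/"
}

# Upload DIR as a new remote directory of the same name (no trailing /).
upload_dir_as_folder() {
    "$UA" "${1%/}"
}

# Count the regular files directly inside DIR, for comparison against the
# 1000-file limit before starting. Subdirectories are not descended into.
count_files() {
    find "${1%/}" -maxdepth 1 -type f | wc -l | tr -d ' '
}
```

For example, a script could run count_files dir_name first and call upload_dir_as_folder dir_name only if the result is 1000 or less.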
Uploading Directories Recursively
You can upload a directory recursively using the --recursive flag. The destination directory follows the same rules as above. With a trailing /, ua assumes that the destination directory exists. Without the trailing /, Upload Agent creates the directory if it doesn't already exist.
Uploading Data from stdin
You can upload data from stdin directly into a file by using the --read-from-stdin flag. With this flag, you can upload only a single file. This can be useful when you need to pipe output from a program and upload it as a file.
This command reads data interactively from the terminal until the stream is terminated with <CTRL>+D, which represents the end of the file (EOF).
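In a script, the same flag is more often fed from a pipe than from the terminal. The following is a minimal sketch, assuming --read-from-stdin needs no further arguments. UA and upload_stream are hypothetical names introduced for illustration.

```shell
UA="${UA:-./ua}"   # path to the Upload Agent executable (adjust as needed)

# Send whatever arrives on standard input to the platform as one file.
# The stream ends when the producing program closes its output (EOF).
upload_stream() {
    "$UA" --read-from-stdin
}
```

For example, sort results.txt | upload_stream would upload the sorted output without first writing it to local disk. The name results.txt is a placeholder.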
Redirecting Uploaded Files
Redirecting to a Folder
You can change the final path of the file in the project via the flags --folder and --name. The following command uploads my_file_1.txt into the folder called oldData and renames it to file_1. Due to the default automatic compression, the final filename on the Platform is file_1.gz.
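A small wrapper can keep the two flags together. This is a sketch under stated assumptions: the folder is written with a leading / (as /oldData), the options precede the file name, and the names UA and upload_as are hypothetical.

```shell
UA="${UA:-./ua}"   # path to the Upload Agent executable (adjust as needed)

# upload_as LOCAL_FILE REMOTE_FOLDER REMOTE_NAME
# Upload LOCAL_FILE into REMOTE_FOLDER under the name REMOTE_NAME.
upload_as() {
    "$UA" --folder "$2" --name "$3" "$1"
}
```

With this helper, upload_as my_file_1.txt /oldData file_1 corresponds to the case described above, where the stored file ends up as file_1.gz because of automatic compression.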
Automatic Compression
By default, Upload Agent automatically compresses uncompressed files before uploading them and appends .gz to the filename. This compression improves upload efficiency and reduces storage costs for text-based files, such as FASTA, FASTQ, and CSV files.
Disabling Compression
To upload files without compression, use the --do-not-compress flag. This preserves the original filename and content without any modification.
Preventing the Resumption of Previous Uploads
By default, Upload Agent attempts to resume every upload it can. If you want to upload the same file twice, override this behavior with the --do-not-resume flag.
If Upload Agent fails to upload a file, or uploads it only partially, resume the upload by running the same command again. When resuming an upload, Upload Agent generates a file signature from the following information:
size
modifiedTimestamp
toCompress (boolean indicating whether the file was originally uploaded with --do-not-compress)
chunkSize
the canonical path to the file
This information is summarized as a metadata field on the file object. When you upload a file using Upload Agent, it quickly calculates this file signature and searches your current project for any file with the same signature. If it finds such an object, and if the file upload is incomplete, it tries to resume the upload. If the file upload is complete, then the file signature is added as a property.
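To see why changing a file or its upload options prevents a resume, it can help to look at the inputs themselves. The sketch below only gathers the five listed inputs for a local file. It does not reproduce how Upload Agent combines or stores them, which is not described here. The function name signature_inputs is hypothetical, the timestamp unit (seconds) is an assumption, and the commands assume GNU stat and readlink.

```shell
# signature_inputs FILE TO_COMPRESS CHUNK_SIZE
# Print the five inputs listed above, one per line.
signature_inputs() {
    size=$(wc -c < "$1" | tr -d ' ')   # file size in bytes
    mtime=$(stat -c %Y "$1")           # GNU stat; BSD/macOS uses: stat -f %m
    path=$(readlink -f "$1")           # canonical path, symlinks resolved
    printf 'size=%s\n' "$size"
    printf 'modifiedTimestamp=%s\n' "$mtime"
    printf 'toCompress=%s\n' "$2"
    printf 'chunkSize=%s\n' "$3"
    printf 'path=%s\n' "$path"
}
```

If any of these values differs between two runs, for example because the file was edited, moved, or uploaded with a different --chunk-size, the signatures would not match and the earlier upload would not be resumed.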
Waiting for a File to Close
When scripting, use the --wait-on-close flag to make ua wait until the uploaded files are in the closed state before the script proceeds to the next command. You do not have to wait for a file to be closed before giving it as input to an app or applet, because the platform automatically waits for the file to be closed before starting the job. However, if you want to copy a file between projects, you must wait for it to be in the closed state.
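In a script, this typically means capturing the file ID and passing it to the next step only after ua returns. The following is a minimal sketch. UA, upload_closed, and upload_then are hypothetical names, and the follow-up command is whatever the script needs to run next.

```shell
UA="${UA:-./ua}"   # path to the Upload Agent executable (adjust as needed)

# Upload one file and return only once it is closed; prints the file ID.
upload_closed() {
    "$UA" --wait-on-close "$1"
}

# upload_then LOCAL_FILE NEXT_COMMAND [ARGS...]
# Stop if the upload failed; otherwise run NEXT_COMMAND with the ID of the
# closed file appended as its last argument.
upload_then() {
    local_file=$1; shift
    file_id=$(upload_closed "$local_file") || return 1
    [ "$file_id" != "Failed" ] || return 1
    "$@" "$file_id"
}
```

For example, upload_then my_file.txt echo "closed:" prints the ID only after the file has reached the closed state.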
Monitoring Upload Progress
You can turn on progress reporting (printed to stderr) with the --progress flag.
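Because progress goes to stderr while file IDs go to stdout, the two streams can be separated in a script. This sketch assumes exactly that split. UA and upload_logged are hypothetical names.

```shell
UA="${UA:-./ua}"   # path to the Upload Agent executable (adjust as needed)

# upload_logged LOG_FILE FILE...
# Progress messages (stderr) are written to LOG_FILE;
# file IDs (stdout) are passed through unchanged.
upload_logged() {
    log=$1; shift
    "$UA" --progress "$@" 2>"$log"
}
```

A call such as ids=$(upload_logged upload.log my_file.txt) then leaves only the file IDs in the variable, while upload.log (a placeholder name) holds the progress output.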
Uploading Files With Metadata
Details
Assigning File Details
Upload Agent can set details for a file using the --details flag. The details must be passed as a valid JSON string. For more information about JSON, see the Wikipedia page on JSON.
Assigning Details to Multiple Files
The following command sets the same details on all the files being uploaded.
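A minimal sketch of such a call follows. The JSON keys and values in the usage note are made up for illustration, and UA and upload_with_details are hypothetical names. Single quotes around the JSON keep the shell from interpreting the double quotes inside it.

```shell
UA="${UA:-./ua}"   # path to the Upload Agent executable (adjust as needed)

# upload_with_details JSON FILE...
# Attach the same JSON details to every file in this upload.
upload_with_details() {
    details=$1; shift
    "$UA" --details "$details" "$@"
}
```

For example: upload_with_details '{"batch": "run-01"}' my_file_1.txt my_file_2.txt.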
Assigning Different Details to Multiple Files
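One way to give each file its own details, assuming a single invocation applies one --details value to all of its files, is to call ua once per file. The sketch below takes this approach. UA and upload_each_with_details are hypothetical names, and the JSON in the usage note is illustrative.

```shell
UA="${UA:-./ua}"   # path to the Upload Agent executable (adjust as needed)

# upload_each_with_details FILE JSON [FILE JSON ...]
# Run one upload per FILE/JSON pair; stop at the first failing upload.
upload_each_with_details() {
    while [ "$#" -ge 2 ]; do
        "$UA" --details "$2" "$1" || return 1
        shift 2
    done
    [ "$#" -eq 0 ]   # non-zero status if a FILE was given without JSON
}
```

For example: upload_each_with_details my_file_1.txt '{"lane": 1}' my_file_2.txt '{"lane": 2}'.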
Properties
Upload Agent can assign properties to a file during upload using the --property flag.
Assigning a Property to a Single File
Assigning Multiple Properties to a Single File
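The sketch below covers both the single-property and multiple-property cases. It assumes that each property is written as KEY=VALUE and that --property may be repeated once per property. Neither detail is stated above, so check the help string before relying on it. UA and upload_with_properties are hypothetical names.

```shell
UA="${UA:-./ua}"   # path to the Upload Agent executable (adjust as needed)

# upload_with_properties FILE KEY=VALUE...
# Turn each KEY=VALUE argument into its own --property option.
upload_with_properties() {
    file=$1; shift
    n=$#
    while [ "$n" -gt 0 ]; do
        set -- "$@" --property "$1"   # append the flag and the first pair
        shift                         # then drop that pair from the front
        n=$((n - 1))
    done
    "$UA" "$@" "$file"
}
```

For example, upload_with_properties my_file.txt sample_id=S1 tissue=liver would run ua with two --property options. The keys and values are placeholders.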
Advanced Usage
Changing the Number of Threads
Use --compress-threads to set the number of compression threads and --upload-threads to set the number of outgoing HTTPS connections opened to upload file chunks. Use --read-threads to set the number of threads that read the input files.
For example, if you are uploading files from an eight-core machine, we recommend limiting usage to 75% of the machine's capacity as a safety measure and dividing it evenly among the three options. That leaves six threads, so reading the input data (--read-threads), compressing (--compress-threads), and uploading (--upload-threads) get two threads each. The command would look something like this:
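The arithmetic generalizes to other core counts. The sketch below applies the same 75% rule with integer division and never goes below one thread per stage. UA, threads_per_stage, and upload_balanced are hypothetical names, and the minimum of one thread is an added assumption.

```shell
UA="${UA:-./ua}"   # path to the Upload Agent executable (adjust as needed)

# threads_per_stage CORES
# 75% of the cores, split evenly across read, compress, and upload.
threads_per_stage() {
    t=$(( $1 * 75 / 100 / 3 ))
    [ "$t" -ge 1 ] || t=1        # always keep at least one thread per stage
    echo "$t"
}

# upload_balanced CORES FILE...
upload_balanced() {
    cores=$1; shift
    t=$(threads_per_stage "$cores")
    "$UA" --read-threads "$t" --compress-threads "$t" --upload-threads "$t" "$@"
}
```

For the eight-core case, upload_balanced 8 my_file.txt passes 2 to each of the three options. On Linux, "$(nproc)" can supply the core count.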
Using a Different Chunk Size
You can change the chunk size that is uploaded at a time in each thread using the flag --chunk-size. This parameter depends on the memory available on the machine. We recommend that you keep the default value. However, if your network connection is particularly slow, use a smaller chunk size.
The following command splits up large-file.txt into chunks of size 200MB (209,715,200 bytes) each to be uploaded. By default, the chunk size is ~95MB (100,000,000 bytes). Upload Agent has a maximum limit of 10,000 chunks.
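The byte counts and the effect of the 10,000-chunk limit can be checked with shell arithmetic. In the sketch below, the helper names are hypothetical. Passing the chunk size to --chunk-size as a plain number of bytes is an assumption, so consult the help string for the accepted formats.

```shell
UA="${UA:-./ua}"   # path to the Upload Agent executable (adjust as needed)

# Convert a size in MB (1 MB = 1,048,576 bytes, as used above) to bytes.
mb_to_bytes() { echo $(( $1 * 1024 * 1024 )); }

# Largest file, in bytes, that fits in 10,000 chunks of CHUNK_BYTES each.
max_file_bytes() { echo $(( $1 * 10000 )); }

# Smallest chunk size, in bytes, that keeps FILE_BYTES within 10,000 chunks.
min_chunk_bytes() { echo $(( ($1 + 9999) / 10000 )); }

# upload_chunked CHUNK_MB FILE...
upload_chunked() {
    mb=$1; shift
    "$UA" --chunk-size "$(mb_to_bytes "$mb")" "$@"
}
```

Here mb_to_bytes 200 gives 209715200, matching the figure above, and max_file_bytes 100000000 gives 1000000000000. With the default chunk size, a file larger than about 1 TB would therefore need a larger --chunk-size.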
Setting Files as Hidden
By default, Upload Agent sets all files as visible. You can override this behavior with the --visibility flag.
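A minimal sketch follows. It assumes that the value hidden is what --visibility accepts for hidden files, which is not stated above. UA and upload_hidden are hypothetical names.

```shell
UA="${UA:-./ua}"   # path to the Upload Agent executable (adjust as needed)

# Upload files so that they are hidden rather than visible.
upload_hidden() {
    "$UA" --visibility hidden "$@"
}
```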
Help String
Specification
Output
On successful completion, the file IDs of the newly created remote files are printed to standard output, one per line. If a particular file upload was unsuccessful, the string "Failed" is printed instead of the file ID. The lines are printed in the same order as the files specified on the command line.
Errors
In case an error occurs, Upload Agent does not exit immediately. Instead, all other files are still uploaded and the program exits with a non-zero status code, printing "Failed" instead of the file ID of the failed uploads.
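Because output lines follow the order of the input files, a script can pair each file with its result. The sketch below relies only on the behavior described above: one line per file on standard output, "Failed" for unsuccessful uploads, and a non-zero exit status on error. UA and upload_report are hypothetical names, and the helper expects file names only, with no options.

```shell
UA="${UA:-./ua}"   # path to the Upload Agent executable (adjust as needed)

# upload_report FILE...
# Print "FILE<TAB>RESULT" per input file, where RESULT is the file ID or
# the string Failed. Returns the exit status of ua.
upload_report() {
    ids=$("$UA" "$@") && status=0 || status=$?
    printf '%s\n' "$ids" | {
        for f in "$@"; do
            # Read the next output line; note it if ua printed too few.
            IFS= read -r id || id="(no output)"
            printf '%s\t%s\n' "$f" "$id"
        done
    }
    return "$status"
}
```

A caller can then filter the report for lines ending in Failed and retry only those files.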
Non-Zero Error Code
The program exits with a non-zero error code if any of the following errors occur:
A valid authentication token was not provided.
A connection to the API server could not be made.
A file to be uploaded does not exist or is not accessible.
The same file is uploaded to a project more than once and --do-not-resume is not set.
An unknown command-line option or an illegal value for an option is provided.
The project is not specified, the specified project does not exist, or the authentication token provided does not allow CONTRIBUTE access to the specified project.
The project specifier cannot be unambiguously resolved, for example, if two or more projects match the given project name.
A folder or file object could not be created.
A file could not be closed. This occurs when the /file-xxxx/close API call fails.
An error occurs while compressing a chunk, for example, the machine ran out of memory.
File Not Fully Uploaded
A file may not be fully uploaded if any of the following errors occur:
The same local file has been uploaded to a project more than once (either partially or fully) and --do-not-resume is not set. Upload Agent may then be unable to determine which remote file to resume, and the upload may not complete.
A chunk fails to upload after the specified number of retry attempts.
A file could not be closed because one of the chunks was compressed below the 5MB limit. In this case, try uploading the failed file again with either the --do-not-compress option or a larger --chunk-size.