Introduction to Building Apps
Learn to build a custom applet and run it on the DNAnexus Platform. Optionally, convert your applet to an app so it can be run by other users, in their own projects.
Last updated
Copyright 2024 DNAnexus
Applets and apps are types of executables that can be run on the DNAnexus Platform. They differ in several ways, notably in the context in which each can be used:
Applets are data objects, which live inside Platform projects.
Apps do not live inside projects, and can be published to allow other users to run them in projects of their choosing.
Applets and apps are created in the same way, up until the final build step. At this step, the developer specifies whether the executable in question should be an applet or an app. An applet can also be converted to an app later, by following these instructions.
For more on the difference between applets and apps, see this detailed comparison of their respective features.
In this tutorial, you’ll learn to create an applet based on an existing executable: FastQTrimmer, one of the FASTX-Toolkit collection of command-line tools for processing short-read FASTA and FASTQ files. You’ll then use the applet to run FastQTrimmer on a FASTQ file, creating a trimmed reads file that you can then use for further analysis.
Figure 1 shows how you could run FastQTrimmer on your local machine to process a sequence file in a project on the Platform. You would need to 1) use dx download to download the source file to your local machine, 2) process it using the fastq_quality_trimmer executable (i.e. FastQTrimmer), then 3) use dx upload to upload the new trimmed reads file to 4) a project on the Platform.
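The manual workflow described above can be sketched as a small shell function. This is only an illustration: it assumes dx-toolkit and fastq_quality_trimmer are installed locally, that you are logged in with a project selected, and that your FASTX-Toolkit build accepts the -Q 33, -i, and -o flags. The function name trim_locally and the "trimmed-" output prefix are hypothetical.

```shell
#!/bin/bash
# Sketch of the manual (local) workflow. Names below are illustrative only.
trim_locally() {
    remote_name="$1"                      # name of the FASTQ file in your project
    trimmed_name="trimmed-$remote_name"   # hypothetical name for the output file

    # 1) Download the source file from the project to the local machine.
    dx download "$remote_name" -o "$remote_name" || return 1

    # 2) Trim low-quality bases. -t 20 is the threshold used in this tutorial;
    #    -Q 33 (Phred+33 qualities), -i and -o are assumptions about your build.
    fastq_quality_trimmer -t 20 -Q 33 -i "$remote_name" -o "$trimmed_name" || return 1

    # 3) and 4) Upload the trimmed reads file back to the project.
    dx upload "$trimmed_name"
}

# Example call (uncomment to run against your own project):
# trim_locally small-celegans-sample.fastq
```

Every step here moves data between the Platform and your machine, which is the overhead the applet removes.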
By turning FastQTrimmer into an applet, you make this process simpler and quicker. You don’t have to download or upload anything yourself, and FastQTrimmer runs on the Platform’s compute resources instead of on your local machine.
As shown in Figure 2, you’ll use two DNAnexus dx utilities in the course of creating your applet: 1) dx-app-wizard creates a skeleton directory for the applet, while 2) dx build adds the applet to the Platform as 3) a data object in your project.
Before beginning this tutorial, download and install dx-toolkit. If you haven’t already done so, you may also want to run through the Command Line Quickstart. You should also make sure that you’re logged into the DNAnexus Platform, ideally using an API token, so that you aren’t logged out before you’ve finished building your applet.
Begin by downloading both:
The fastq_quality_trimmer executable (i.e. FastQTrimmer).
A sample FASTQ file, small-celegans-sample.fastq, containing the first 25,000 reads from a C. elegans sample (SRR070372).
Next, you need to create a local directory and a source code template for your applet. While you can do this manually, the App Wizard enables you to do so via a guided workflow, in a few easy steps. Following is a walkthrough of this workflow, along with detail on how to respond to prompts from the Wizard:
Launch the App Wizard from the CLI by entering the command dx-app-wizard
Enter “mytrimmer” as the name of your applet.
Optionally, enter a title - this is the name of your applet, as displayed in the product UI.
Optionally, enter a summary - a short description of what your applet does.
Enter a version number for your applet, or press <Enter> to accept the default value of “0.0.1”.
Enter a name - such as "input_file" - for your applet’s 1st input parameter.
Optionally, enter a human-readable label for the 1st input parameter.
Select “file” from the list of input parameter class types displayed.
Enter “n” to indicate that this parameter is not optional.
Rather than entering details on a 2nd input parameter, press <Enter> to finish entering input parameter details.
Enter a name - such as "output_file" - for the output parameter your applet will produce.
Optionally, enter a human-readable label for the output file.
Select “file” from the list of output parameter class types displayed.
At the prompt, rather than entering details on a 2nd output parameter, press <Enter> to finish entering output parameter details.
Set a timeout policy value. This is the maximum amount of time your applet is allowed to run before timing out. Press <Enter> if you want to accept the default value of 48 hours.
Set “bash” as the programming language for your applet.
For each of the remaining questions about template options, access permissions, and instance types, press <Enter> to accept the defaults.
The App Wizard will finish by creating a local directory called mytrimmer.
Here’s how this will all look, from the CLI:
The DNAnexus Platform runs applets on a Linux VM with a stock Ubuntu 20.04 environment. When run, your applet will in turn run an executable - the fastq_quality_trimmer file you downloaded in Step 1. This executable is not available on the VM by default. To make it available, enter the following commands, which create a resources/usr/bin/ directory inside your applet directory, then copy the fastq_quality_trimmer file into that directory from your local machine. The contents of the resources directory are unpacked into the root of the VM’s filesystem when the applet runs, so the executable will be found in /usr/bin/ on the VM:
$ mkdir -p mytrimmer/resources/usr/bin/
$ cp /path/to/fastq_quality_trimmer mytrimmer/resources/usr/bin/
Note that in the second command, you’ll need to provide the path to the fastq_quality_trimmer file on your local machine, substituting this for /path/to/.
Once the fastq_quality_trimmer file is in the directory mytrimmer/resources/usr/bin/, it can be accessed by dx build, which you’ll use to build your applet, as detailed in Step 5 below. dx build will then package the executable, along with any other files stored in the mytrimmer/resources directory, as part of your applet.
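The staging step can also be wrapped in a small helper that additionally makes sure the copied file carries the executable bit. This is a sketch under assumptions: the function name stage_executable is hypothetical, and it assumes file permissions are preserved when the resources directory is packaged.

```shell
#!/bin/bash
# Copy a local executable into an applet's resources/usr/bin/ directory
# and mark it executable. Function name is illustrative only.
stage_executable() {
    exe_path="$1"   # local path to the executable, e.g. /path/to/fastq_quality_trimmer
    app_dir="$2"    # applet directory created by the App Wizard, e.g. mytrimmer
    dest="$app_dir/resources/usr/bin"

    mkdir -p "$dest" || return 1
    cp "$exe_path" "$dest/" || return 1
    # Assumption: the mode set here is kept when the applet is built.
    chmod +x "$dest/$(basename "$exe_path")"
}

# Example call (uncomment and adjust the path):
# stage_executable /path/to/fastq_quality_trimmer mytrimmer
```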
In the main mytrimmer directory, you’ll see a file named dxapp.json. Open dxapp.json in a text editor. You’ll see that the runSpec block contains specs for both the interpreter to be used, and the name of the program to be run:
"interpreter": "bash",
"file": "src/mytrimmer.sh"
Close the file. Navigate to the src directory and open the mytrimmer.sh file in a text editor. You’ll see that in the main() block, some of the code has been filled in for you.
Edit the code in the main() block to incorporate the line that will run your executable. See the code line beginning with fastq_quality_trimmer -t 20 in the code block below. Note that some of the boilerplate comments have been omitted for brevity’s sake.
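As a rough guide, an edited main() block might look like the following sketch. It is not necessarily identical to what the App Wizard generates for you. It assumes the parameter names input_file and output_file suggested earlier, and that your fastq_quality_trimmer build accepts the -Q 33, -i, and -o flags. The script only defines main(); the Platform calls it when the job starts.

```shell
#!/bin/bash
# Sketch of src/mytrimmer.sh. Parameter names assume you entered
# "input_file" and "output_file" in the App Wizard.
main() {
    echo "Value of input_file: '$input_file'"

    # Download the input from the Platform to the VM's local disk.
    dx download "$input_file" -o input_file

    # Run the bundled executable. -t 20 is the quality threshold used in this
    # tutorial; -Q 33 (Phred+33 qualities), -i and -o are assumptions.
    fastq_quality_trimmer -t 20 -Q 33 -i input_file -o output_file

    # Upload the result and capture the ID of the new file object.
    output_file=$(dx upload output_file --brief)

    # Register the uploaded file as the job's "output_file" output.
    dx-jobutil-add-output output_file "$output_file" --class=file
}
```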
Next you’ll build the applet using dx build.
Select the project in which you want to use the applet:
Enter the command dx select
Enter the number corresponding to the project in which you want to use the applet
Now make sure you’re in the directory inside of which you created the mytrimmer directory. From that directory, enter the command:
$ dx build mytrimmer
Note that you can run dx build from within the mytrimmer directory if you prefer. If you do so, omit the directory name from the command:
$ dx build
Once dx build completes, you’ll see a confirmation message displaying the unique ID assigned by the Platform to your new applet. It will look like this:
{"id": "applet-G7GFz9805XQPKQj14ZqX9Vq3"}
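If you script your builds, you may want to capture the applet ID from this confirmation message. The following is a minimal sketch that extracts the value of "id" with sed. The helper name applet_id_from_build_output is hypothetical, and it assumes the message has the single-line form shown above.

```shell
#!/bin/bash
# Extract the applet ID from a one-line confirmation message such as
# {"id": "applet-..."}. Helper name is illustrative only.
applet_id_from_build_output() {
    printf '%s\n' "$1" | sed -n 's/.*"id": *"\([^"]*\)".*/\1/p'
}

# Using the sample message from this tutorial:
applet_id_from_build_output '{"id": "applet-G7GFz9805XQPKQj14ZqX9Vq3"}'
# prints: applet-G7GFz9805XQPKQj14ZqX9Vq3
```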
Your applet will now appear as a data object in your project. To see it, enter the command:
$ dx ls
To get more info on your applet, enter the command:
$ dx describe mytrimmer
You'll see a description that looks like the following, with the fastq_quality_trimmer executable shown using its Platform ID, in the bundledDepends section:
Before you run your applet using the sample input file you downloaded in Step 1, you must upload that file to the Platform.
Navigate to the local directory to which you downloaded the small-celegans-sample.fastq file. Upload it to the Platform using the command:
$ dx upload small-celegans-sample.fastq
The file will appear in your project, as you’ll see by entering the command:
$ dx ls
You are now ready to launch the analysis in the cloud, using 4) the dx run command. When you launch the analysis, the Platform will bring up 5) a new Linux VM to run your code.
Now launch the applet by entering the command:
$ dx run mytrimmer -iinput_file=small-celegans-sample.fastq
You’ll see a prompt asking you to confirm that you want to run the job with the input you designated. Enter “Y”.
You’ll see a confirmation that includes a Job ID, and a prompt asking if you want to watch, or monitor, your job’s progress:
Calling applet-G7GFz9805XQPKQj14ZqX9Vq3 with output destination project-G7FbxV805XQ0k10vKbG474p9:/
Job ID: job-G7GG1f005XQ350gFB9VY0Kb
Watch launched job now? [Y/n]
Enter “Y” if you’d like to monitor your job. You’ll see a log file giving detail on every step of the job's progress.
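If you prefer to skip both prompts, for example when launching from a script, dx run accepts -y to confirm automatically and --watch to follow the job log. The following sketch wraps that call. The function name run_trimmer is hypothetical, and the applet and parameter names assume the values used in this tutorial.

```shell
#!/bin/bash
# Launch the applet without interactive prompts. Function name is
# illustrative; "mytrimmer" and "input_file" are the names used above.
run_trimmer() {
    fastq_name="$1"   # name of the uploaded FASTQ file in your project
    # -y answers the confirmation prompt; --watch streams the job log.
    dx run mytrimmer -iinput_file="$fastq_name" -y --watch
}

# Example call (uncomment to run against your own project):
# run_trimmer small-celegans-sample.fastq
```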
When the job has finished, enter the command dx ls to view the files in your project. This list will now include the output file generated by your applet.
Enter the command dx get to retrieve the output file:
$ dx get output_file
To see the first ten lines of the output file, enter the command:
$ head output_file
This excerpt of the file should look something like this, and thus should show that your applet worked correctly:
@SRR070372.1 FV5358E02GLGSF length=78
TTTTTTTTTTTTTTTTTTTTTTTTTTTNTTTNTTTNTTTNTTTATTTATTTATTTATTATTATATATATATATATA
+SRR070372.1 FV5358E02GLGSF length=78
...000//////999999<<<=<<666!602!777!922!688:669A9=<=122569AAA?>@BBBBAA?=<966
@SRR070372.2 FV5358E02FQJUJ length=177
TTTCTTGTAATTTGTTGGAATACGAGAACATCGTCAATAATATATCGTATGAATTGAACCACACGGCACATATTTGAACTTGTTCGTGAAATTTAGCGAACCTGGCAGGACTCGAACCTCCAATCTTCGGATCCGAAGTCCGACGCCCCCGCGTCGGATGCGTTGTTACCACTGCTT
+SRR070372.2 FV5358E02FQJUJ length=177
222@99912088>C<?7779@<GIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIC;6666IIIIIIIIIIII;;;HHIIE>944=>=;22499;CIIIIIIIIIIIIHHHIIIIIIIIIIIIIIIH?;;;?IIEEEEEEEEIIII77777I7EEIIEEHHHHHIIIIIIIIIIIIII
@SRR070372.3 FV5358E02GYL4S length=70
TTGGTATCATTGATATTCATTCTGGAGAACGATGGAACATACAAGAATTGTGTTAAGACCTGCATAA
You can also run a program like seqmagick to verify that the sequences have been trimmed.
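If you don't have such a tool at hand, a rough check is possible with awk alone. The sketch below summarizes read lengths in a FASTQ file, so you can compare the original and trimmed files. It assumes standard four-line FASTQ records (no wrapped sequences), and the function name fastq_length_summary is hypothetical.

```shell
#!/bin/bash
# Print the number of reads and their minimum, maximum, and mean length.
# Assumes each record occupies exactly four lines. Name is illustrative.
fastq_length_summary() {
    awk 'NR % 4 == 2 {
             n++
             len = length($0)
             if (n == 1 || len < min) min = len
             if (len > max) max = len
             total += len
         }
         END {
             if (n == 0) { print "reads=0"; exit }
             printf "reads=%d min=%d max=%d mean=%.1f\n", n, min, max, total / n
         }' "$1"
}

# Example calls (uncomment once both files are on your local machine):
# fastq_length_summary small-celegans-sample.fastq
# fastq_length_summary output_file
```

If trimming had an effect, the trimmed file should report a lower mean length than the original.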
Figure 4 gives an overview of how your applet is run. Once the Platform has instantiated a Linux VM, it runs your applet, executing the shell script commands you provided. The script runs just as it would on your local computer, 6) downloading the reads to the hard drive of the virtual machine, 7) running FASTX-Toolkit, then 8) uploading the resulting file to 9) your project.
As noted above, you can convert your applet to an app, to enable others to use it in their own projects. Follow these directions to convert it to an app.
If you wish to change the inputs or outputs of your applet, or request additional execution resources - adding network access or more CPU or memory, for example - edit the file mytrimmer/dxapp.json and re-run dx build. See the Advanced Applet Tutorial for a detailed overview of the dxapp.json file, and how to edit it.
When running dx-app-wizard, you selected the "basic" execution template. This means that your applet will run on a single machine. You can use the wizard’s --template option to set more advanced execution options:
basic: Your applet or app will run on a single machine.
parallelized: Your applet or app will subdivide a large chunk of work into multiple pieces that can be processed in parallel and independently of each other, followed by a final stage that will merge and process the results as necessary.
scatter-process-gather: Similar to parallelized but with the addition of a "scatter" entry point. This allows you to break out the execution for splitting up the input, or you can call a separate applet or app to perform the splitting.
Try the other available templates to see simple examples of how to parallelize your execution over multiple machines in the cloud, by using additional entry points. You can also use other programming languages, leveraging DNAnexus client libraries. While the dx client provides a wide range of advanced functionality, client libraries can provide a richer experience for programmatically accessing and modifying data on the Platform, in the programming language of your choice.
See the Advanced Applet Tutorial to get a better understanding of the app directory structure and how to manually modify app inputs, outputs, and metadata.
See the Job Lifecycle page for detail on the progression of a job's states and the reasons a job may fail.