Copyright 2024 DNAnexus
dxCompiler is a tool for compiling pipelines written in the Workflow Description Language (WDL) and the Common Workflow Language (CWL) to equivalent workflows on the DNAnexus platform.
Below we introduce a few use cases for dxCompiler. The tool can be downloaded from the dxCompiler GitHub repository, which also contains more detailed documentation, including DNAnexus extensions and integration with private Docker repositories.
For information on how to set up dxCompiler, see the instructions in the dxCompiler GitHub repository.
dxCompiler uses wdlTools, a parser that adheres strictly to the WDL specifications. Most of the problematic automatic type conversions that some other WDL runtime engines allow are rejected by dxCompiler. Please use the command-line tools in wdlTools (e.g. check and lint) to validate your WDL files before trying to compile them with dxCompiler.
The bam_chrom_counter workflow is written in WDL. Task slice_bam splits a bam file into an array of sub-files. Task count_bam counts the number of alignments in a bam file. The workflow takes an input bam file, calls slice_bam to split it by chromosome, and calls count_bam in parallel on each chromosome. The results comprise a bam index file and an array with the number of reads per chromosome.
From the command line, we can compile the workflow to the DNAnexus platform using the dxCompiler jar file.
This compiles the source WDL file to several platform objects in the specified DNAnexus project project-xxxx under folder /my/workflows/:
A workflow bam_chrom_counter
Two applets that can be called independently: slice_bam and count_bam
A few auxiliary applets that process workflow inputs and outputs and launch the scatter
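The compile step described above can be sketched as a small Python helper that assembles the command line. This is a minimal illustration, not the official invocation: the jar file name dxCompiler.jar and the helper build_compile_command are assumptions, and the -project and -folder options should be checked against dxCompiler's own help output for your version.

```python
import shlex


def build_compile_command(source, project, folder, jar="dxCompiler.jar"):
    """Return the argv list for compiling a workflow source file.

    The jar path is an assumption; point it at the release you downloaded.
    """
    return ["java", "-jar", jar, "compile", source,
            "-project", project, "-folder", folder]


# Values taken from the example in the text.
cmd = build_compile_command("bam_chrom_counter.wdl", "project-xxxx", "/my/workflows/")

# Print the command for inspection; pass `cmd` to subprocess.run() to execute it.
print(shlex.join(cmd))
```

Building the command as a list rather than a single string avoids shell-quoting problems if a project or folder name contains spaces.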
You can review the structure of the compiled DNAnexus workflow using dxCompiler's describe subcommand. The output shows the generated DNAnexus workflows and applets as a tree that describes the caller/callee relationships.
The snapshot below shows what you will see from the UI when the workflow execution is completed:
dxCompiler requires the source CWL file to be "packed" as a cwl.json file, which contains a single compound workflow with all the dependent processes included. Additionally, you may need to upgrade the version of your workflow to CWL v1.2.
We'll use the bam_chrom_counter CWL workflow, similar to the WDL example above, to illustrate upgrading, packing and running a CWL workflow. This workflow is written in CWL v1.0, and the top-level Workflow in bam_chrom_counter.cwl calls the two CommandLineTools in slice_bam.cwl and count_bam.cwl.
Before compilation, follow the steps below to preprocess these CWL files:
Install cwl-upgrader and upgrade the CWL files to v1.2 (needed in this case because the CWL files are in CWL v1.0):
Once the workflow is upgraded and packed as described above, we can compile it to a DNAnexus workflow and run it.
WDL and CWL
All task and workflow names must be unique across the entire import tree
For example, if A.wdl imports B.wdl and A.wdl defines workflow foo, then B.wdl cannot have a workflow or task named foo
Subworkflows built from higher-level workflows are not intended to be used on their own
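The name-uniqueness rule above can be checked before compiling. The sketch below is a hypothetical helper, not part of dxCompiler: it does not parse WDL or CWL, and it assumes you have already collected the task and workflow names defined in each file of the import tree.

```python
from collections import defaultdict


def find_name_collisions(definitions):
    """Return names defined more than once across the import tree.

    `definitions` maps a file name to the task/workflow names it defines.
    The result maps each colliding name to the files that define it.
    """
    owners = defaultdict(list)
    for filename, names in definitions.items():
        for name in names:
            owners[name].append(filename)
    return {name: files for name, files in owners.items() if len(files) > 1}


# Mirrors the example in the text: A.wdl imports B.wdl and both define `foo`.
# The other names (`align`, `count`) are made up for illustration.
import_tree = {
    "A.wdl": ["foo", "align"],
    "B.wdl": ["foo", "count"],
}
collisions = find_name_collisions(import_tree)
print(collisions)
```

Any non-empty result means at least one task or workflow must be renamed before the import tree can be compiled.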
WDL only
Workflows with forward references (i.e. a variable referenced before it is declared) are not yet supported
The call ... after syntax introduced in WDL 1.1 is not yet supported
CWL only
Calling native DNAnexus apps/applets in a CWL workflow using dxni is not supported
SoftwareRequirement and InplaceUpdateRequirement are not yet supported
Publishing a dxCompiler-generated workflow as a global workflow is not supported
Applet and job reuse is not supported
dxCompiler's capabilities are outlined in its help string below.
The generated workflow can be executed from the web UI (see the DNAnexus documentation for instructions) or via the DNAnexus command-line client. For example, to run the workflow with the input bam file project-BQbJpBj0bvygyQxgQ1800Jkk:file-FpQKQk00FgkGV3Vb3jJ8xqGV, use the following command:
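A run of this kind can be sketched in Python as follows. The dx run command and its -i<name>=<value> input syntax belong to the DNAnexus command-line client; the input field name stage-common.bam is an assumption about how the compiled workflow exposes its bam input, so confirm the real name with dx run bam_chrom_counter -h. The helper build_dx_run_command is hypothetical.

```python
import shlex


def build_dx_run_command(workflow, inputs):
    """Return the argv list for `dx run` with one -i<name>=<value> per input."""
    argv = ["dx", "run", workflow]
    for name, value in inputs.items():
        argv.append(f"-i{name}={value}")
    return argv


# The input field name below is an assumption; the file ID comes from the text.
run_cmd = build_dx_run_command(
    "bam_chrom_counter",
    {"stage-common.bam": "project-BQbJpBj0bvygyQxgQ1800Jkk:file-FpQKQk00FgkGV3Vb3jJ8xqGV"},
)

# Print the command for inspection; pass `run_cmd` to subprocess.run() to launch it.
print(shlex.join(run_cmd))
```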
Alternatively, you can convert a Cromwell-style JSON input file into the DNAnexus format when compiling the workflow. You can then pass the resulting DNAnexus input file to dx run using the -f option, as described in detail in the dxCompiler documentation.
After launching the workflow analysis, you can monitor it from the CLI or from the UI, as described in the DNAnexus documentation. The CLI session shows the executed workflow:
De-localize all local paths referenced in the CWL: if the CWL specifies a local path, e.g. a schema or a default value for a file-type input (like the default path "path/to/my/input_bam" for input bam in the example workflow), you need to upload this file to a DNAnexus project and then replace the local path in the CWL with its full DNAnexus URI, e.g. dx://project-XXX:file-YYY.
Install the sbpack package and run the cwlpack command on the top-level workflow file to build a single packed file containing the top-level workflow and all the steps it depends on:
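The de-localization step described above can be sketched in Python. This is a hypothetical helper, not a dxCompiler feature: it assumes the CWL document has already been loaded into nested dicts and lists (for example with a YAML or JSON parser), and that you have uploaded each local file and recorded its DNAnexus URI yourself. The workflow fragment below is a made-up minimal example.

```python
def delocalize(node, uploaded):
    """Return a copy of `node` with known local paths replaced by dx:// URIs.

    `uploaded` maps a local path string to the URI of the uploaded file.
    Only string values are rewritten; dictionary keys are left untouched.
    """
    if isinstance(node, dict):
        return {key: delocalize(value, uploaded) for key, value in node.items()}
    if isinstance(node, list):
        return [delocalize(item, uploaded) for item in node]
    if isinstance(node, str) and node in uploaded:
        return uploaded[node]
    return node


# Illustrative fragment: a File input with a local default path.
workflow = {
    "class": "Workflow",
    "inputs": {
        "bam": {
            "type": "File",
            "default": {"class": "File", "path": "path/to/my/input_bam"},
        }
    },
}
# Placeholder URI from the text; substitute the real project and file IDs.
uploaded = {"path/to/my/input_bam": "dx://project-XXX:file-YYY"}

delocalized = delocalize(workflow, uploaded)
print(delocalized["inputs"]["bam"]["default"]["path"])
```

The function returns a new structure and leaves the input unchanged, so the original local version of the workflow remains available for local testing.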
dxCompiler compiles tools/workflows written according to the CWL v1.2 standard. You can use cwltool --validate to validate the packed CWL file you want to compile.
Calls with missing arguments have limited support
The alternative workflow output syntax that has been deprecated since WDL draft2 is not supported
A detailed description of the advanced dxCompiler features can be found in the public dxCompiler GitHub repository.