Pysam
This applet performs a SAMtools count on an input BAM file using Pysam, a Python wrapper for SAMtools.
How is Pysam provided?
Pysam is installed with the pip3 package manager, as declared in the runSpec.execDepends property of dxapp.json:
{
  "runSpec": {
    ...
    "execDepends": [
      {
        "name": "pysam",
        "package_manager": "pip3",
        "version": "0.15.4"
      }
    ]
    ...
  }
}
The execDepends value is a JSON array of dependencies to resolve before the applet source code is run. In this applet, pip3 is specified as the package manager and pysam version 0.15.4 as the dependency to resolve.
Downloading Input
The fields mappings_sorted_bam and mappings_sorted_bai are passed to the main function as parameters for the job. Each parameter is a dictionary with a single key-value pair, {"$dnanexus_link": "<file>-<xxxx>"}. File objects from the platform are handled through DXFile handles. If no index file is supplied, the applet creates a *.bai index.
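The download step can be sketched roughly as follows. This is a minimal illustration, not the applet's actual source: the helper names link_to_file_id and download_inputs are hypothetical, and saving the index as "<bam name>.bai" is an assumption made so that Pysam can locate it next to the BAM file. dxpy and pysam are imported inside the function so the link-parsing helper can be tried without either package installed.

```python
def link_to_file_id(link):
    """Return the file ID stored in a {"$dnanexus_link": ...} dictionary."""
    target = link["$dnanexus_link"]
    # A link can also take the extended form
    # {"$dnanexus_link": {"project": "...", "id": "..."}}.
    if isinstance(target, dict):
        return target["id"]
    return target


def download_inputs(mappings_sorted_bam, mappings_sorted_bai=None):
    """Download the BAM (and index, if given); return the local BAM name."""
    # Imported lazily: both packages are only available where they are installed.
    import dxpy
    import pysam

    # DXFile accepts a DNAnexus link and exposes the file's platform name.
    bam_name = dxpy.DXFile(mappings_sorted_bam).name
    dxpy.download_dxfile(link_to_file_id(mappings_sorted_bam), bam_name)

    if mappings_sorted_bai is not None:
        # Assumption: store the index beside the BAM under "<bam name>.bai".
        dxpy.download_dxfile(link_to_file_id(mappings_sorted_bai),
                             bam_name + ".bai")
    else:
        # No index supplied: build one, as `samtools index` would.
        pysam.index(bam_name)
    return bam_name
```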
Working with Pysam
Pysam provides key methods that mimic SAMtools commands. The Pysam object representation of a BAM file is pysam.AlignmentFile. This example applet counts reads only on canonical chromosomes.
The helper function get_chr establishes the list of canonical chromosomes.
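A minimal sketch of how such a filter could look, assuming human-style reference names (1 to 22, X, Y, M or MT, with an optional chr prefix). The function name canonical_chromosomes and the regular expression are illustrative assumptions, not the applet's own get_chr implementation.

```python
import re

# Assumed naming scheme: autosomes 1-22 plus X, Y and the mitochondrial
# contig, with or without a leading "chr".
CANONICAL_PATTERN = re.compile(r"^(chr)?([1-9]|1[0-9]|2[0-2]|X|Y|M|MT)$")


def canonical_chromosomes(reference_names):
    """Keep only the reference names that match the canonical pattern."""
    return [name for name in reference_names if CANONICAL_PATTERN.match(name)]


# With Pysam, the reference names can be read from an open file:
#     bam = pysam.AlignmentFile("sample.bam", "rb")
#     chromosomes = canonical_chromosomes(bam.references)
```

Unplaced or alternate contigs such as chrUn_gl000220 fall outside the pattern and are dropped.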
Once a list of canonical chromosomes is established, you can iterate over it and call pysam.AlignmentFile.count, the Pysam equivalent of samtools view -c, for each chromosome.
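The per-chromosome loop might look like the sketch below. The function name count_reads_per_chromosome is hypothetical. It accepts any object with a count(contig=...) method, which pysam.AlignmentFile provides for indexed BAM files.

```python
def count_reads_per_chromosome(bam, chromosomes):
    """Return a {chromosome: read count} mapping for the given chromosomes."""
    counts = {}
    for chrom in chromosomes:
        # On a pysam.AlignmentFile this counts the reads mapped to `chrom`.
        counts[chrom] = bam.count(contig=chrom)
    return counts


# Intended use with Pysam (requires an index next to the BAM file):
#     with pysam.AlignmentFile("sample.bam", "rb") as bam:
#         counts = count_reads_per_chromosome(bam, ["chr1", "chr2"])
```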
Uploading Outputs
The summarized counts are returned as the job output. The dx-toolkit Python SDK function dxpy.upload_local_file uploads and generates a DXFile corresponding to the tabulated result file.
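One way to produce and upload the tabulated file is sketched here. The tab-separated layout and the helper names write_counts and upload_counts are assumptions for illustration. dxpy.upload_local_file is the SDK call named above and is imported lazily so the file-writing part can be run on its own.

```python
import csv


def write_counts(counts, path):
    """Write one "chromosome<TAB>count" row per entry and return the path."""
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        for chrom, n_reads in counts.items():
            writer.writerow([chrom, n_reads])
    return path


def upload_counts(path):
    """Upload the local result file; returns a DXFile handle."""
    import dxpy  # only available where the dx-toolkit SDK is installed
    return dxpy.upload_local_file(path)
```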
Python job outputs must be a dictionary of key-value pairs, where each key is a job output name defined in the dxapp.json file and each value is the output value for the corresponding output class. For files, the output type is a DXLink. The dxpy.dxlink function generates the appropriate DXLink value.
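The shape of such an output dictionary can be illustrated as follows. The output name counts_txt is a placeholder and must match whatever name the applet's dxapp.json declares. The sketch builds the link by hand to show its structure. In an applet, dxpy.dxlink would normally produce the same mapping from a DXFile handle or file ID.

```python
def build_job_output(file_id, output_name="counts_txt"):
    """Return a job output dictionary holding a DXLink to the given file ID."""
    # Same mapping that dxpy.dxlink(file_id) produces for a plain file ID.
    return {output_name: {"$dnanexus_link": file_id}}


# Inside the applet, with `counts_file` being the uploaded DXFile handle:
#     return {"counts_txt": dxpy.dxlink(counts_file)}
```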