# Python Apps

This tutorial covers:

* Writing, deploying, running, and monitoring apps in Python
* Using the DNAnexus Platform APIs to represent and store your data

This tutorial assumes that you have already installed the [DNAnexus SDK](/downloads.md) and worked through [Intro to Building Apps](/developer/apps/intro-to-building-apps.md). Refer to that tutorial as necessary.

## Before You Begin

To initialize the SDK environment, open a terminal and navigate to the directory where you extracted the SDK (for example, `/home/Bart/Downloads/dx-toolkit`). Then run:

```shell
source environment
```

This places the DNAnexus client scripts in your executable `PATH`, and the DNAnexus Python libraries in the Python library path.
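To confirm that both paths were set up, you can check them from Python. The following is a minimal sketch using only the standard library. The helper name `check_sdk_environment` is hypothetical, and it only reports whether an executable named `dx` is on `PATH` and whether a module named `dxpy` can be found.

```python
import importlib.util
import shutil


def check_sdk_environment(executable="dx", module="dxpy"):
    """Report whether the client script and the Python library can be found."""
    return {
        # shutil.which returns None when the executable is not on PATH.
        "executable_on_path": shutil.which(executable) is not None,
        # find_spec returns None when a top-level module cannot be located.
        "module_importable": importlib.util.find_spec(module) is not None,
    }


if __name__ == "__main__":
    print(check_sdk_environment())
```

If either value is `False`, re-run `source environment` in the same shell session before continuing.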

Next, run the following command to log in to the Platform and select a project to work in.

```shell
dx login
```

Source code for the example apps used in this tutorial can be found in the `doc/examples/dx-apps` directory of the SDK. You can also browse the example programs [on GitHub](https://github.com/dnanexus/dx-toolkit/tree/master/doc/examples/dx-apps).

## Revisiting the Quality Trimmer

Start by rebuilding the quality trimmer app from [Intro to Building Apps](/developer/apps/intro-to-building-apps.md) as a more idiomatic Python app.

Run the DNAnexus App Wizard (`dx-app-wizard`) from the command line and answer its prompts as follows to generate a Python app:

```shell
$ dx-app-wizard
⋮
App Name: python_trimmer_example
⋮ (<ENTER> to accept defaults)

Input Specification

You will now be prompted for each input parameter to your app.
Each parameter should have a unique name that uses only the underscore (`_`) and alphanumeric characters, and does not start with a number.

1st input name (<ENTER> to finish): input_name
Label (optional human-readable name) []: Input file
Your input parameter must be of one of the following classes:
applet         array:file     array:record   file           int
array:applet   array:float    array:string   float          record
array:boolean  array:int      boolean        hash           string

Choose a class (<TAB> twice for choices): file
This is an optional parameter [y/n]: n

2nd input name (<ENTER> to finish): <ENTER>

Output Specification

You will now be prompted for each output parameter of your app.  Each parameter should have a unique
name that uses only the underscore (`_`) and alphanumeric characters, and does not start with a
number.

1st output name (<ENTER> to finish): output_file
Label (optional human-readable name) []: Output file
Choose a class (<TAB> twice for choices): file

2nd output name (<ENTER> to finish): <ENTER>

Template Options

You can write your app in any programming language. Templates are available for the
following supported languages: Python and bash
Programming language [Python]: <ENTER>
⋮
Execution pattern [basic]: <ENTER>
⋮
```

### Generated Files

Open up the generated metadata file, `dxapp.json`. The **run specification** specifies what code your app is to run and how it should be invoked. In this case the `runSpec.file` field refers to a file `src/python_trimmer_example.py`. The specified file is executed whenever you run your app.
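To see how the metadata ties together, the sketch below parses an abridged, illustrative `dxapp.json` and pulls out the entry-point file along with the declared input and output names. The JSON content is a shortened stand-in (the file the wizard generates contains more fields), and `summarize_app_metadata` is a hypothetical helper, not part of the SDK.

```python
import json

# Abridged, illustrative dxapp.json content; the generated file has more fields.
EXAMPLE_DXAPP_JSON = """
{
  "name": "python_trimmer_example",
  "inputSpec": [{"name": "input_name", "class": "file"}],
  "outputSpec": [{"name": "output_file", "class": "file"}],
  "runSpec": {"file": "src/python_trimmer_example.py"}
}
"""


def summarize_app_metadata(text):
    """Return the entry-point file and the declared input and output names."""
    metadata = json.loads(text)
    return {
        "file": metadata["runSpec"]["file"],
        "inputs": [spec["name"] for spec in metadata.get("inputSpec", [])],
        "outputs": [spec["name"] for spec in metadata.get("outputSpec", [])],
    }


if __name__ == "__main__":
    print(summarize_app_metadata(EXAMPLE_DXAPP_JSON))
```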

The script `src/python_trimmer_example.py` was generated automatically by `dx-app-wizard`. It includes a skeleton that downloads your input files from the Platform to the local filesystem and uploads the output files after your analysis has run.

Under the line that says "Fill in your application code here", add the following line to do your analysis:

```python
subprocess.check_call("fastq_quality_trimmer -t 20 -Q 33 -i input_name -o output_file", shell=True)
```

Also import the `subprocess` module by adding `import subprocess` below the other imports at the top of the file.
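The string form with `shell=True` works here because the file names are fixed. If you later build the command from variable paths, passing an argument list avoids shell quoting problems. The following is a minimal sketch of that alternative: `build_trimmer_command` and `run_trimmer` are hypothetical helpers, and the flags are the same ones used in the line above.

```python
import subprocess


def build_trimmer_command(input_path, output_path, t_value=20, q_value=33):
    """Return the fastq_quality_trimmer invocation as an argument list."""
    return [
        "fastq_quality_trimmer",
        "-t", str(t_value),
        "-Q", str(q_value),
        "-i", input_path,
        "-o", output_path,
    ]


def run_trimmer(input_path, output_path):
    # check_call raises CalledProcessError if the tool exits with a nonzero status.
    subprocess.check_call(build_trimmer_command(input_path, output_path))
```

Because each argument is a separate list element, a path that contains spaces is passed to the tool unchanged.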

Your `python_trimmer_example.py` file should now look like the following:

```python
#!/usr/bin/env python3
# python_trimmer_example 1.0.0
#
# Some comments have been abbreviated here; create an app using dx-app-wizard
# or look in dx-toolkit/doc/examples/dx-apps/python_trimmer_example to read
# the comments in full.

import os
import dxpy
import subprocess

@dxpy.entry_point('main')
def main(input_name):

    # Create DXDataObject handlers for the input object(s).

    input_name = dxpy.DXFile(input_name)

    # Download the file to the local filesystem.

    dxpy.download_dxfile(input_name.get_id(), "input_name")

    # Fill in your application code here.

    subprocess.check_call("fastq_quality_trimmer -t 20 -Q 33 -i input_name -o output_file", shell=True)

    # Upload the output file (presumed to now exist at the path output_file)
    # back to the platform.

    output_file = dxpy.upload_local_file("output_file")

    # Returns a reference to the file object you created.
    # See https://documentation.dnanexus.com/developer/api/

    output = {}
    output["output_file"] = dxpy.dxlink(output_file)

    return output

dxpy.run()
```

The app inputs are passed as keyword arguments to the `main` entry point function, which is executed when you run the app. The function should return a `dict` (a hash, in Platform terms) that maps the names of your app's output parameters to their values.

* Inputs that are DNAnexus [data objects](/developer/api/introduction-to-data-object-classes.md) are represented as [`dicts`](https://docs.python.org/library/stdtypes.html#mapping-types-dict) containing [DNAnexus links](/developer/api/running-analyses/job-input-and-output.md#data-object-links). These can be passed as inputs to a handler class to construct a handler object (such as `dxpy.DXFile(input_name)` above, or with [`dxpy.get_handler()`](http://autodoc.dnanexus.com/bindings/python/current/dxpy_functions.html?highlight=get_handler#dxpy.bindings.dxdataobject_functions.get_handler)), or reduced to the string containing the object ID: `input_name['$dnanexus_link']`.
* Inputs of basic types (int, float, string, boolean, or hash) are given directly as the corresponding Python data types.
* Outputs that are data objects should be given as DNAnexus links, which can be constructed from handler objects or ID strings using [`dxpy.dxlink()`](http://autodoc.dnanexus.com/bindings/python/current/dxpy_functions.html?highlight=get_handler#dxpy.bindings.dxdataobject_functions.dxlink). Outputs of basic types should be given using their Python data types.
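The link format described in the list above can be illustrated without the SDK. The sketch below uses plain Python stand-ins, not `dxpy`: `make_link` and `link_to_id` are hypothetical helpers, the IDs are placeholders, and the project-qualified form is an assumption based on the data object links documentation referenced above. In a real app, prefer `dxpy.dxlink()` and the handler classes.

```python
def make_link(object_id, project_id=None):
    """Build a dict in the DNAnexus link shape (illustrative stand-in for dxpy.dxlink)."""
    if project_id is None:
        return {"$dnanexus_link": object_id}
    # Assumed project-qualified form: the link value is itself a dict.
    return {"$dnanexus_link": {"project": project_id, "id": object_id}}


def link_to_id(link):
    """Return the object ID string from either link form."""
    target = link["$dnanexus_link"]
    return target["id"] if isinstance(target, dict) else target


if __name__ == "__main__":
    link = make_link("file-xxxx")
    print(link, link_to_id(link))
```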

To complete your app, download the [FASTX-Toolkit](http://hannonlab.cshl.edu/fastx_toolkit/fastx_toolkit_0.0.13_binaries_Linux_2.6_amd64.tar.bz2), extract it, and put the `fastq_quality_trimmer` executable into the `resources/usr/bin` subdirectory of your app directory. Then download the DNAnexus-provided [sample reads file `small-celegans-sample.fastq`](https://dl.dnanex.us/F/D/Bp43z7pb2JX8jpB035j4424Vp4Y6qpQ6610ZXg5F/small-celegans-sample.fastq) and upload it to a project if you have not already:

```shell
dx upload small-celegans-sample.fastq
```

### Building and Running an App on the Platform

Next, upload your app to the DNAnexus Platform. In the app's directory, run:

```shell
dx build -a .
```

When you rebuild your app after the first upload, also pass the `--overwrite` (or `-f`) flag to request the removal of old versions of your app.

Run the app on the Platform. This instantiates a new **job**. When the job has been enqueued, `dx run` prints a **job ID** you can use to track progress.

```shell
$ dx run python_trimmer_example -iinput_name=small-celegans-sample.fastq
            # Inspect the input parameters and press ENTER to confirm...
⋮
Calling applet-xxxx with output destination project-yyyy:/

Job ID: job-zzzz
```

During or after the execution of your job, you can check its status with `dx describe JOB_ID`. Once the job has finished successfully, this command also shows its outputs.

```shell
$ dx describe job-zzzz
Result 1:
ID              job-zzzz
⋮
State           running
⋮
Input           input_name = project-xxxx:file-yyyy
Output          -
```
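If you want to check a job from a script rather than by reading the terminal output, one option is to ask `dx describe` for JSON and parse it. The following is a sketch under stated assumptions: it assumes the JSON description carries `state` and `output` fields, and the helper names `parse_job_description` and `describe_job` are hypothetical.

```python
import json
import subprocess


def parse_job_description(json_text):
    """Extract the state and outputs from `dx describe --json` style output."""
    description = json.loads(json_text)
    # Assumed field names: "state" and "output".
    return description.get("state"), description.get("output")


def describe_job(job_id):
    # Requires the dx client on PATH and a logged-in session.
    completed = subprocess.run(
        ["dx", "describe", job_id, "--json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_job_description(completed.stdout)
```

Keeping the parsing separate from the `dx` call makes it possible to test the parsing logic without a Platform connection.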

Congratulations! You've run your first app on the DNAnexus Platform.

Many common bioinformatics pipelines can be expressed as a series of steps that each follow the pattern illustrated above. This pattern is the easiest way to turn a preexisting analysis into a DNAnexus app or applet:

* Download inputs from the Platform using the API bindings and save them to local files in your execution container.
* Shell out to a subprocess to run whatever analysis you like, producing local files as output.
* Upload outputs from the local files you've produced back into the Platform, again using the API bindings.
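The three steps above can be sketched as one small function. This is an illustrative outline, not SDK code: `run_step` is a hypothetical helper, and the `download` and `upload` callables are placeholders for whatever wraps the API bindings in your app (for example, `dxpy.download_dxfile` on the way in, and `dxpy.upload_local_file` with `dxpy.dxlink` on the way out).

```python
import subprocess


def run_step(download, command, upload):
    """Run one download / subprocess / upload step and return the app output."""
    # 1. Fetch inputs from the Platform into local files.
    download()
    # 2. Run the analysis as a subprocess; raises CalledProcessError on failure.
    subprocess.check_call(command)
    # 3. Send the local output files back and return the output dict.
    return upload()
```

Passing the download and upload logic as callables keeps the analysis command easy to test locally before you build the app.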


