
# Parallel xargs by Chr

[View full source code on GitHub](https://github.com/dnanexus/dnanexus-example-applets/tree/master/Tutorials/bash/samtools_count_para_chr_xargs_sh)

## How is the SAMtools dependency provided?

The SAMtools compiled binary is placed directly in the `<applet dir>/resources` directory. Any files found in the `resources/` directory are uploaded so that they are present in the root directory of the worker. In this case:

```
├── Applet dir
│   ├── src
│   ├── dxapp.json
│   ├── resources
│       ├── usr
│           ├── bin
│               ├── < samtools binary >
```

When this applet is run on a worker, the `resources/` folder is placed in the worker's root directory `/`:

```
/
├── usr
│   ├── bin
│       ├── < samtools binary >
├── home
│   ├── dnanexus
```

`/usr/bin` is part of the `$PATH` variable, so in the script, you can reference the `samtools` command directly, as in `samtools view -c ...`.
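To confirm that a bundled binary resolves through `$PATH` before relying on it, a small guard at the top of the script can help. The following is a minimal sketch; `check_tool` is a hypothetical helper name, not part of the applet source.

```shell
# check_tool is a hypothetical helper, not part of the applet source.
# It prints where a command resolves on $PATH, or returns 1 if it is missing.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1 found at $(command -v "$1")"
  else
    echo "$1 is not on PATH" >&2
    return 1
  fi
}

# Report the result without aborting when samtools is absent (for example, on a laptop).
check_tool samtools || echo "samtools missing; check resources/usr/bin in the applet directory" >&2
```

On a worker where the binary was staged under `resources/usr/bin`, the first branch should report `/usr/bin/samtools`.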

## Parallel Run

### Splice BAM

First, download the BAM file and slice it by canonical chromosome, writing the name of each resulting `*.bam` file to `bamfiles.txt`.

To split a BAM by region, you need a `*.bai` index. You can either create an app(let) that takes the `*.bai` as an input or generate the `*.bai` in the applet. In this tutorial, the `*.bai` is generated in the applet, and the BAM is sorted first if indexing fails.

```shell
dx download "${mappings_bam}"

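# Try to index the BAM as downloaded; indexing fails if it is not coordinate-sorted.
indexsuccess=true
bam_filename="${mappings_bam_name}"
samtools index "${mappings_bam_name}" || indexsuccess=false
if [[ $indexsuccess == false ]]; then
  # Sort into a new file instead of overwriting the input, then index the sorted copy.
  bam_filename="sorted_${mappings_bam_name}"
  samtools sort -o "${bam_filename}" "${mappings_bam_name}"
  samtools index "${bam_filename}"
fi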

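# Collect canonical chromosome names (for example chr1, chrX, or 1, X, MT) from the @SQ header lines.
chromosomes=$( \
  samtools view -H "${bam_filename}" \
  | grep "^@SQ" \
  | awk -F '\t' '{print $2}' \
  | awk -F ':' '{if ($2 ~ /^chr[0-9XYM]+$|^[0-9XYM]/) {print $2}}')

# Write one BAM per chromosome and record each file name in bamfiles.txt.
for chr in $chromosomes; do
  samtools view -b -o "bam_${chr}.bam" "${bam_filename}" "${chr}"
  echo "bam_${chr}.bam"
done > bamfiles.txt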
```
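To see what the header filter keeps and drops without needing a BAM file, it can be run against a few hand-written header lines. The sketch below wraps the same `grep`/`awk` pipeline in a hypothetical function named `filter_canonical`; the header lines and their lengths are made up for illustration.

```shell
# filter_canonical is a hypothetical wrapper around the same grep/awk pipeline.
# It reads SAM header text on stdin and prints the canonical sequence names.
filter_canonical() {
  grep "^@SQ" \
    | awk -F '\t' '{print $2}' \
    | awk -F ':' '{if ($2 ~ /^chr[0-9XYM]+$|^[0-9XYM]/) {print $2}}'
}

# Made-up header lines standing in for `samtools view -H` output.
printf '%b\n' \
  '@HD\tVN:1.6\tSO:coordinate' \
  '@SQ\tSN:chr1\tLN:1000' \
  '@SQ\tSN:chrX\tLN:900' \
  '@SQ\tSN:chr1_KI270706v1_random\tLN:50' \
  '@SQ\tSN:chrUn_KI270302v1\tLN:20' \
  | filter_canonical
```

With this sample header, only `chr1` and `chrX` are printed; the `_random` and `chrUn` contigs do not match either pattern. The filter reads the sequence name from the second tab-separated field, so it assumes `SN` is the first tag on each `@SQ` line.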

### Xargs SAMtools view

In [Splice BAM](#splice-bam), the name of each sliced BAM file was written to `bamfiles.txt`, one per line. Next, run `samtools view -c` on each slice, using `bamfiles.txt` as the input to `xargs`.

In the following example, `$view_options` represents any optional additional flags you want to pass to `samtools view`.

```shell
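counts_txt_name="${mappings_bam_prefix}_count.txt"

# Count the reads in each slice, then sum the per-file counts with awk.
# $view_options is left unquoted on purpose so that multiple flags split into separate arguments.
sum_reads=$( \
  <bamfiles.txt xargs -I {} samtools view -c $view_options '{}' \
  | awk '{s+=$1} END {print s}')
echo "Total Count: ${sum_reads}" > "${counts_txt_name}"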
```
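By default, `xargs` runs one command at a time; its `-P` flag sets how many commands may run concurrently. The following is a minimal sketch of that pattern, assuming GNU `xargs` and `nproc` are available (as on a typical Linux worker). Here `wc -l` stands in for `samtools view -c`, and the temporary files stand in for the per-chromosome BAMs, so the example runs without SAMtools.

```shell
# Create three small stand-in files; in the applet these would be the per-chromosome BAMs.
workdir=$(mktemp -d)
printf 'a\nb\n' > "${workdir}/part1.txt"
printf 'c\n' > "${workdir}/part2.txt"
printf 'd\ne\nf\n' > "${workdir}/part3.txt"
ls "${workdir}"/part*.txt > "${workdir}/files.txt"

# `wc -l` stands in for `samtools view -c`: it prints one number per file.
# -P lets up to $(nproc) commands run at once; awk sums the per-file counts.
total=$( \
  <"${workdir}/files.txt" xargs -I {} -P "$(nproc)" sh -c 'wc -l < "$1"' _ {} \
  | awk '{s+=$1} END {print s}')
echo "Total Count: ${total}"
```

Because each command prints a single short line, the order in which counts arrive does not affect the sum. Adapting this to the `samtools view -c` call above would mean adding `-P "$(nproc)"` to its `xargs` invocation; treat that as an assumption to verify against the applet source.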

### Upload results

The results file is uploaded using the standard bash process:

1. Upload a file to the job execution's container.
2. Provide the DNAnexus link as a job output using the script `dx-jobutil-add-output <output name>`.

   ```shell
   counts_txt_id=$(dx upload "${counts_txt_name}" --brief)
   dx-jobutil-add-output counts_txt "${counts_txt_id}" --class=file
   ```

