# Analyzing Germline Variants

{% hint style="info" %}
An Apollo license is required to use Cohort Browser on the DNAnexus Platform. Org approval may also be required. [Contact DNAnexus Sales](mailto:sales@dnanexus.com) for more information.
{% endhint %}

Explore and analyze datasets with germline data by opening them in the Cohort Browser and switching to the Germline Variants tab. You can create cohorts based on germline variants, visualize variant patterns, and examine detailed variant information.

## Filtering by Germline Variants

You can [define your cohort](/user/cohort-browser/defining-cohorts.md#defining-cohort-criteria) to include only samples with specific germline variants.

To apply a germline filter to your cohort:

1. For the cohort you want to edit, click **Add Filter**.
2. In **Add Filter to Cohort** > **Assays** > **Genomic Sequencing**, select a genomic filter.
3. In **Edit Filter: Variant (Germline)**, specify your filtering criteria:
   * For datasets with multiple germline variant assays, select the specific assay to filter by.
   * On the **Genes / Effects** tab, select variants of specific types and [variant consequences](https://feb2023.archive.ensembl.org/info/genome/variation/prediction/predicted_data.html) within the specified genes and/or genomic ranges. You can specify up to 5 genes or genomic ranges.
   * On the **Variant IDs** tab, specify a list of variant IDs, with a maximum of 100 variants.
   * To enter multiple genes, genomic ranges, or variants, separate them with commas or place each on a new line.
4. Click **Apply Filter**.

![Adding a germline filter](/files/fi77n7ynFbNv1uVJLd8E)
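
If you prepare filter input outside the Cohort Browser before pasting it in, it can help to check it against the limits described in the steps above. The following is a minimal sketch, not Platform code: `split_entries` and `check_limit` are hypothetical helper names, and the gene symbols are placeholder input. Only the separators (commas or new lines) and the limits (5 genes or ranges, 100 variant IDs) come from this page.

```python
import re

MAX_GENES_OR_RANGES = 5  # limit stated for the Genes / Effects tab
MAX_VARIANT_IDS = 100    # limit stated for the Variant IDs tab


def split_entries(raw: str) -> list[str]:
    """Split text on commas or new lines, trimming whitespace and dropping empty items."""
    return [item.strip() for item in re.split(r"[,\n]", raw) if item.strip()]


def check_limit(entries: list[str], limit: int, label: str) -> list[str]:
    """Return the entries unchanged, or raise if there are more than `limit`."""
    if len(entries) > limit:
        raise ValueError(f"{label}: {len(entries)} entries exceed the limit of {limit}")
    return entries


# Placeholder input mixing both accepted separators
genes = check_limit(split_entries("BRCA1, BRCA2\nTP53"), MAX_GENES_OR_RANGES, "genes/ranges")
print(genes)  # ['BRCA1', 'BRCA2', 'TP53']
```

The same two helpers can be reused for a variant ID list by passing `MAX_VARIANT_IDS` as the limit.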

{% hint style="info" %}
After you apply or edit filters, the participant count updates immediately. However, visualization tiles do not automatically refresh. Click **Refresh Visualizations** at the top of the dashboard to update all tiles. Click **Refresh** on individual tiles to update specific charts.
{% endhint %}

## Exploring Variant Patterns in Your Cohort

The **Germline Variants** tab includes a lollipop plot displaying allele frequencies for variants in a specified genomic region. This visualization helps you identify patterns in germline variants across your cohort and understand the distribution of allele frequencies.

![Genomic Variant Browser and Details](/files/lMiJUovvTPf1PPFCUFHz)

{% hint style="info" %}
If your dataset contains multiple germline variant assays, such as WES and WGS assays, you can choose the assay to visualize at the top of the dashboard. The Cohort Browser displays data from only one assay at a time. When you switch between assays, your charts and their display settings are preserved.
{% endhint %}

### Examining Variant Annotations

The allele table, located below the lollipop plot, shows the same variants in a tabular format with comprehensive annotation information. Use it to examine specific variant characteristics and to compare allele frequencies in your selected cohort, in the entire dataset, and in annotation databases such as gnomAD.

The annotation information includes:

* **Type:** Whether the variant is an SNP, deletion, insertion, or mixed.
* **Consequences:** The impact of the variant according to [SnpEff](https://pcingola.github.io/SnpEff/). For variants with multiple gene annotations, this column displays the most severe consequence per gene.
* **Population Allele Frequency:** Allele frequency calculated across the entire dataset from which the cohort is created.
* **Cohort Allele Frequency:** Allele frequency calculated across the current cohort selection.
* **GnomAD Allele Frequency:** Allele frequency of the specified allele from the public dataset [gnomAD](https://gnomad.broadinstitute.org/).
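
To illustrate how the population and cohort frequencies can differ for the same allele, here is a minimal sketch that computes both from a toy set of diploid genotypes. It is not the Platform's calculation: the sample IDs, the genotype data, the `allele_frequency` helper, and the choice to exclude no-calls from the denominator are all assumptions made for illustration.

```python
from typing import Optional

# One diploid call per sample; None marks a no-call allele (assumed representation)
Genotype = tuple[Optional[str], Optional[str]]


def allele_frequency(genotypes: list[Genotype], allele: str) -> float:
    """Fraction of called alleles equal to `allele`. No-calls are excluded (an assumption)."""
    called = [a for gt in genotypes for a in gt if a is not None]
    if not called:
        return 0.0
    return called.count(allele) / len(called)


# Hypothetical data: sample ID -> genotype at a single locus
dataset = {
    "s1": ("A", "A"),
    "s2": ("A", "G"),
    "s3": ("G", "G"),
    "s4": ("A", "A"),
}
cohort_ids = ["s2", "s3"]  # samples remaining after cohort filters are applied

population_af = allele_frequency(list(dataset.values()), "G")
cohort_af = allele_frequency([dataset[s] for s in cohort_ids], "G")
print(population_af, cohort_af)  # 0.375 0.75
```

In this toy example, the cohort frequency of `G` is double the population frequency because the cohort filters kept only samples carrying that allele.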

If canonical transcript information is available, the following three columns with additional annotation information appear in the table:

* **Consequences (Canonical Transcript):** Canonical effects for each associated gene, according to SnpEff.
* **HGVS DNA (Canonical Transcript):** HGVS (DNA) standard terminology for each gene associated with this variant.
* **HGVS Protein (Canonical Transcript):** HGVS (Protein) standard terminology for each gene associated with this variant.

### Exporting Variant Metadata

You can export the selected variants in the table as a list of variant IDs or a CSV file.

* To copy a comma-separated list of variant IDs to your clipboard, select the set of IDs you want to copy, and click **Copy**.
* To export variants as a CSV file, select the set of IDs you need, and click **Download (.csv file)**.

{% hint style="success" %}
For large datasets, you can use the [SQL Runner app](/user/spark/example-applications/spark-sql-runner.md) to download data more efficiently.
{% endhint %}
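
Once downloaded, the CSV file can be processed with standard tooling. The sketch below uses Python's built-in `csv` module to list variants whose cohort frequency is at least twice the gnomAD frequency. The header names are assumed to match the allele table column labels, and the rows, the `enriched_variants` helper, and the 2x threshold are invented for illustration. Check the header row of your own export before relying on these names.

```python
import csv
import io

# Stand-in for a downloaded file; header names and values are assumptions.
EXAMPLE_CSV = """Location,Type,Cohort Allele Frequency,GnomAD Allele Frequency
locus_1,SNP,0.12,0.01
locus_2,Deletion,0.03,0.04
locus_3,SNP,0.20,
"""


def enriched_variants(csv_text: str, min_ratio: float = 2.0) -> list[str]:
    """Return locations whose cohort frequency is at least `min_ratio` times the gnomAD frequency."""
    hits = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        gnomad = row["GnomAD Allele Frequency"]
        if not gnomad:  # skip rows without a gnomAD value
            continue
        gnomad_af = float(gnomad)
        cohort_af = float(row["Cohort Allele Frequency"])
        if gnomad_af > 0 and cohort_af >= min_ratio * gnomad_af:
            hits.append(row["Location"])
    return hits


print(enriched_variants(EXAMPLE_CSV))  # ['locus_1']
```

To read a real export, replace `io.StringIO(csv_text)` with a file handle opened using `open(path, newline="")`.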

## Accessing Detailed Variant Information

In the **Location** column of the allele table, click a location to open its locus details. The locus details provide in-depth annotations and population genetics data for the selected genomic position.

{% hint style="info" %}
When genomic information is ingested and made available in the Cohort Browser, variants are annotated using [NCBI dbSNP](https://www.ncbi.nlm.nih.gov/snp/) and [gnomAD](https://gnomad.broadinstitute.org/). The specific versions of each are provided during the ingestion process, which creates a set of tables optimized for cohort creation through the Cohort Browser.
{% endhint %}

![Viewing specific locus details](/files/FB5B9BDL49kLG6YikSO4)

The locus details page displays three main sections of information pre-calculated during dataset ingestion: **Location Info**, **Genotypes**, and **Alleles**. Together, these sections move from a summary of the locus, through genotype frequencies, to detailed annotations for each allele.

### Location Info

The **Location Info** section provides a quick overview of the genomic locus in your dataset, including the chromosome and starting position, the frequency of both the reference allele and no-calls, and the total number of alleles available.

### Genotypes

The **Genotypes** section shows a detailed breakdown of genotypes in the dataset at the specific location. Since allele order is not preserved, genotypes like C/A and A/C are counted in the same category, which is why only half of the comparison table is populated. These genotype frequencies represent the entire dataset at this location, not only your selected cohort.
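
The effect of ignoring allele order can be sketched in a few lines. This example is illustrative only: the `/`-separated genotype strings, the sample calls, and the `genotype_counts` helper are assumptions, and which half of the grid is filled is an arbitrary choice here.

```python
from collections import Counter


def genotype_counts(calls: list[str]) -> Counter:
    """Count genotypes with alleles sorted, so 'C/A' and 'A/C' share one key."""
    counts = Counter()
    for call in calls:
        alleles = sorted(call.split("/"))
        counts["/".join(alleles)] += 1
    return counts


calls = ["A/C", "C/A", "A/A", "C/C", "A/C"]  # hypothetical genotype calls at one locus
counts = genotype_counts(calls)
print(counts["A/C"], counts["A/A"], counts["C/C"])  # 3 1 1

# Only cells where the row allele sorts at or before the column allele hold a count,
# so half of the grid stays empty ("-").
alleles = ["A", "C"]
grid = []
for i, row in enumerate(alleles):
    cells = [str(counts[f"{row}/{col}"]) if j >= i else "-" for j, col in enumerate(alleles)]
    grid.append(f"{row} {' '.join(cells)}")
print("\n".join(grid))
```

Because `C/A` is folded into `A/C`, the cell for row `C`, column `A` never receives a count of its own.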

### Alleles

The **Alleles** section displays detailed information for each allele, collected from dbSNP and gnomAD during data ingestion. When available, the rsID or AffyID appears with a direct link to the corresponding [NCBI dbSNP](https://www.ncbi.nlm.nih.gov/snp/) page. The section provides allele type, affected samples (dataset), and gnomAD frequency for quick reference, with additional details sorted by transcript ID in the **Genes / Transcripts** table. For canonical transcripts, a blue indicator appears next to the transcript ID, identifying the primary transcript annotations.

## Integrating with Advanced Analysis Tools

For more sophisticated genomic analysis beyond the Cohort Browser's visualization capabilities, you can connect your variant data with other DNAnexus tools. Export variant lists for detailed analysis in [JupyterLab](/user/jupyter-notebooks.md), use [Spark clusters](/user/spark.md) for large-scale genomic computations, or connect to [SQL Runner](/user/spark/example-applications/spark-sql-runner.md) for complex queries across your dataset.

