Analyzing Germline Variants
Analyze germline genomic variants, including filtering, visualization, and detailed variant annotation in the Cohort Browser.
Explore and analyze datasets with germline data by opening them in the Cohort Browser and switching to the Germline Variants tab. You can create cohorts based on germline variants, visualize variant patterns, and examine detailed variant information.
Filtering by Germline Variants
You can define your cohort to include only samples with specific germline variants.
To apply a germline filter to your cohort:
For the cohort you want to edit, click Add Filter.
In Add Filter to Cohort > Assays > Genomic Sequencing, select a genomic filter.
In Edit Filter: Variant (Germline), specify your filtering criteria:
For datasets with multiple germline variant assays, select the specific assay to filter by.
On the Genes / Effects tab, select variants of specific types and variant consequences within the specified genes and/or genomic ranges. You can specify up to 5 genes or genomic ranges in a comma-separated list.
On the Variant IDs tab, specify a list of variant IDs, with a maximum of 100 variants.
To enter multiple genes, genomic ranges, or variants, separate them with commas or place each on a new line.
Click Apply Filter.
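If you prepare gene or variant lists outside the Cohort Browser, it can help to check them against the limits above before pasting them into the filter dialog. The following is a minimal Python sketch, not part of the platform: the function names `split_entries` and `check_limit` are hypothetical, and the only facts taken from this page are the separators (commas or new lines) and the limits (5 genes or ranges, 100 variant IDs).

```python
import re

MAX_GENES_OR_RANGES = 5   # limit stated for the Genes / Effects tab
MAX_VARIANT_IDS = 100     # limit stated for the Variant IDs tab


def split_entries(raw: str) -> list[str]:
    """Split on commas or newlines, trim whitespace, drop empty entries."""
    return [part.strip() for part in re.split(r"[,\n]", raw) if part.strip()]


def check_limit(entries: list[str], limit: int, label: str) -> list[str]:
    """Return the entries unchanged, or raise if the list exceeds the limit."""
    if len(entries) > limit:
        raise ValueError(f"{label}: {len(entries)} entries exceeds the limit of {limit}")
    return entries


# Hypothetical helper usage: a mixed comma/newline list of gene symbols.
genes = check_limit(split_entries("BRCA1, BRCA2\nTP53"), MAX_GENES_OR_RANGES, "genes/ranges")
print(genes)
```

The same two helpers can be reused for variant IDs by passing `MAX_VARIANT_IDS` as the limit.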

Exploring Variant Patterns in Your Cohort
The Germline Variants tab includes a lollipop plot displaying allele frequencies for variants in a specified genomic region. This visualization helps you identify patterns in germline variants across your cohort and understand the distribution of allele frequencies.

Examining Variant Annotations
The allele table, located below the lollipop plot, shows the same variants in a tabular format with comprehensive annotation information. It allows you to examine specific variant characteristics and compare allele frequencies across your selected cohort, the entire dataset, and annotation databases such as gnomAD.
The annotation information includes:
Type: Whether the variant is a SNP, deletion, insertion, or mixed.
Consequences: The impact of the variant according to SnpEff. For variants with multiple gene annotations, this column displays the most severe consequence per gene.
Population Allele Frequency: Allele frequency calculated across the entire dataset from which the cohort is created.
Cohort Allele Frequency: Allele frequency calculated across the current cohort selection.
GnomAD Allele Frequency: Allele frequency of the specified allele in the public gnomAD dataset.
If canonical transcript information is available, the following three columns with additional annotation information appear in the table:
Consequences (Canonical Transcript): Canonical effects for each associated gene, according to SnpEff.
HGVS DNA (Canonical Transcript): HGVS (DNA) standard terminology for this variant in each associated gene.
HGVS Protein (Canonical Transcript): HGVS (Protein) standard terminology for this variant in each associated gene.
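To illustrate how the Population and Cohort Allele Frequency columns can differ for the same variant, here is a minimal Python sketch. It assumes diploid genotypes, counts an allele's share of all called alleles, and excludes no-calls from the denominator; the platform's exact calculation is not described on this page, so treat these choices and the sample data as assumptions.

```python
from typing import Optional

Genotype = Optional[tuple[str, str]]  # None represents a no-call (assumed convention)


def allele_frequency(genotypes: dict[str, Genotype], allele: str, samples=None) -> float:
    """Fraction of called alleles equal to `allele` among the chosen samples."""
    selected = genotypes if samples is None else {s: genotypes[s] for s in samples}
    called = [a for gt in selected.values() if gt is not None for a in gt]
    return called.count(allele) / len(called) if called else 0.0


# Hypothetical genotypes at one locus (reference C, alternate A).
dataset = {
    "s1": ("C", "C"),
    "s2": ("C", "A"),
    "s3": ("A", "A"),
    "s4": None,
}
cohort = ["s2", "s3"]

population_af = allele_frequency(dataset, "A")      # 3 of 6 called alleles
cohort_af = allele_frequency(dataset, "A", cohort)  # 3 of 4 called alleles
print(population_af, cohort_af)
```

In this toy example the cohort frequency (0.75) is higher than the dataset-wide frequency (0.5) because the cohort was chosen from samples that carry the alternate allele.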
Exporting Variant Metadata
You can export the selected variants in the table as a list of variant IDs or a CSV file.
To copy a comma-separated list of variant IDs to your clipboard, select the set of IDs you want to copy, and click Copy.
To export variants as a CSV file, select the set of IDs you need, and click Download (.csv file).
For large datasets, you can use the SQL Runner app to download data more efficiently.
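Once a CSV file is downloaded, it can be processed with standard tools. The sketch below uses Python's `csv` module to list variants that are much more frequent in the cohort than in gnomAD. The column headers, the variant ID format, and the `enriched_variants` helper are all hypothetical; check the header row of your own export and adjust the names accordingly.

```python
import csv
import io

# Stand-in for a downloaded file; real exports may use different headers and IDs.
sample_csv = io.StringIO(
    "Variant ID,Type,Cohort Allele Frequency,GnomAD Allele Frequency\n"
    "1_1000_C_A,SNP,0.12,0.001\n"
    "1_2000_G_T,SNP,0.30,0.25\n"
    "1_3000_AT_A,Deletion,0.05,\n"
)


def enriched_variants(handle, min_ratio=10.0):
    """Return IDs whose cohort frequency is at least min_ratio times the gnomAD frequency."""
    hits = []
    for row in csv.DictReader(handle):
        gnomad = row["GnomAD Allele Frequency"]
        if not gnomad:  # skip variants with no gnomAD frequency
            continue
        if float(row["Cohort Allele Frequency"]) >= min_ratio * float(gnomad):
            hits.append(row["Variant ID"])
    return hits


print(enriched_variants(sample_csv))
```

To run this against a real export, replace `sample_csv` with a file opened via `open(path, newline="")`.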
Accessing Detailed Variant Information
In Allele table > Location column, you can click a specific location to open the locus details page. This page provides in-depth annotations and population genetics data for the selected genomic position.

The locus details page displays three main sections of pre-calculated information from dataset ingestion: Location Summary, Genotype Distribution, and Allele Annotations. These sections provide a comprehensive view starting with a locus summary, including genotype frequencies, followed by detailed annotations for each allele.
Location Info provides a quick overview of the genomic locus in your dataset, including the chromosome and starting position, the frequency of both the reference allele and no-calls, and the total number of alleles available.
Genotypes shows a detailed breakdown of genotypes in the dataset at the specific location. Since allele order is not preserved, genotypes like C/A and A/C are counted in the same category, which is why only half of the comparison table is populated. These genotype frequencies represent the entire dataset at this location, not just your selected cohort.
Alleles displays detailed information for each allele, collected from dbSNP and gnomAD during data ingestion. When available, the rsID or AffyID appears with a direct link to the corresponding NCBI dbSNP page. The section provides allele type, affected samples (dataset), and gnomAD frequency for quick reference, with additional details sorted by transcript ID. For canonical transcripts, a blue indicator appears next to the transcript ID, identifying the primary transcript annotations.
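The unordered genotype counting described for the Genotypes section can be sketched in a few lines of Python. This is an illustration with made-up genotype calls, not the platform's implementation: sorting the two alleles of each call gives C/A and A/C the same key, which is why only one half of a genotype comparison table needs to be filled in.

```python
from collections import Counter

# Hypothetical genotype calls at one locus; allele order varies between samples.
calls = [("C", "A"), ("A", "C"), ("C", "C"), ("A", "A"), ("A", "C")]

# Sorting each pair maps C/A and A/C to the same key.
counts = Counter(tuple(sorted(gt)) for gt in calls)

for (first, second), n in sorted(counts.items()):
    print(f"{first}/{second}: {n} ({n / len(calls):.0%})")
```

Here the three heterozygous calls are reported together as A/C, alongside one A/A and one C/C.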
Integrating with Advanced Analysis Tools
For more sophisticated genomic analysis beyond the Cohort Browser's visualization capabilities, you can connect your variant data with other DNAnexus tools. Export variant lists for detailed analysis in JupyterLab, leverage Spark clusters for large-scale genomic computations, or connect to SQL Runner for complex queries across your dataset.