Analyzing Gene Expression Data

Analyze gene expression data, including expression-based filtering, visualization, and molecular profiling in the Cohort Browser.

An Apollo license is required to use the Cohort Browser on the DNAnexus Platform. Org approval may also be required. Contact DNAnexus Sales for more information.

Explore and analyze datasets with gene expression assays by opening them in the Cohort Browser and switching to the Gene Expression tab. You can create cohorts based on expression levels, visualize expression patterns, and examine detailed gene information.

Gene expression datasets are created using the Molecular Expression Assay Loader.

Customizing the Gene Expression Dashboard

You can customize your Gene Expression dashboard to focus on the most relevant analyses for your research:

  • Create new Expression Distribution or Feature Correlation charts.

  • Remove charts you no longer need.

  • Resize and reposition charts to optimize your workspace.

  • Save your dashboard customizations along with your cohort.

Visualizing gene expression data in Cohort Browser

The Gene Expression dashboard supports up to 15 charts, allowing you to create comprehensive expression analysis workspaces.

For datasets with multiple gene expression assays, you can choose the specific assay to visualize at the top of the dashboard. The Cohort Browser displays data from only one assay at a time. Switching between assays preserves your charts and their display settings.

Filtering by Gene Expression

You can define your cohort by gene expression to include only patients with specific expression characteristics.

To apply a gene expression filter to your cohort:

  1. For the cohort you want to edit, click Add Filter.

  2. In Add Filter to Cohort > Assays > Gene Expression, select a genomic filter.

  3. In Edit Filter: Gene Expression, specify the criteria:

    • For datasets with multiple gene expression assays, select the specific assay to filter by.

    • In Expression Level, specify inclusive minimum and maximum values. For an individual to be included, all their expression values across all samples for the feature must fall within the range.

    • In Gene / Transcript, enter a gene symbol, such as BRCA1, or a feature ID, such as ENSG00000012048 or ENST00000309586. Search is case-insensitive.

  4. Click Apply Filter.

Adding a gene expression filter

You can specify up to 10 gene expression filters for each cohort. All filters use an AND relationship.
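The inclusion rule described above can be sketched in a few lines of Python. This is only an illustration of the stated logic (inclusive range, every sample must match, filters combined with AND), not the Cohort Browser's implementation. The `ExpressionFilter` class, the per-sample dictionaries, and the expression values are hypothetical.

```python
from dataclasses import dataclass

MAX_FILTERS = 10  # limit stated in the text above


@dataclass(frozen=True)
class ExpressionFilter:
    feature_id: str  # e.g. "ENSG00000012048"
    minimum: float   # inclusive lower bound
    maximum: float   # inclusive upper bound


def passes(expression_filter, samples):
    """True if every sample value for the feature lies within the range.

    `samples` holds one dict per sample of a single individual, mapping
    feature ID to expression value (a hypothetical layout).
    """
    fid = expression_filter.feature_id
    values = [sample[fid] for sample in samples if fid in sample]
    # Assumption: an individual with no values for the feature is excluded.
    if not values:
        return False
    return all(
        expression_filter.minimum <= v <= expression_filter.maximum
        for v in values
    )


def in_cohort(filters, samples):
    """Combine filters with AND: the individual must pass every one."""
    if len(filters) > MAX_FILTERS:
        raise ValueError(f"at most {MAX_FILTERS} gene expression filters per cohort")
    return all(passes(f, samples) for f in filters)


brca1 = ExpressionFilter("ENSG00000012048", 2.0, 8.0)
patient_a = [{"ENSG00000012048": 3.1}, {"ENSG00000012048": 8.0}]
patient_b = [{"ENSG00000012048": 3.1}, {"ENSG00000012048": 9.4}]

print(in_cohort([brca1], patient_a))  # True: both samples are within [2.0, 8.0]
print(in_cohort([brca1], patient_b))  # False: one sample exceeds the maximum
```

Note that `patient_b` is excluded even though one of its two samples is in range, because a single out-of-range sample is enough to fail the filter.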

Visualizing Expression Distribution

The Expression Level charts help you visualize gene expression patterns for individual transcript or gene features. You can examine how expression values are distributed across your cohort, identify outliers, and compare patterns between different patient groups.

The chart displays data for one gene or transcript at a time. You can enter a transcript or gene feature ID directly, such as an ID starting with [ENST](https://useast.ensembl.org/Help/View?id=151) or [ENSG](https://useast.ensembl.org/info/genome/genebuild/index.html), or search by gene symbol to see the available options.
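If you prepare feature lists outside the Cohort Browser, it can help to tell the three kinds of input apart before pasting them in. The sketch below assumes the human Ensembl stable ID layout (a prefix followed by 11 digits, with an optional version suffix). Whether the Cohort Browser accepts versioned IDs is not stated here, and `classify_query` is a hypothetical helper name.

```python
import re

# Human Ensembl stable IDs: prefix, 11 digits, optional ".version" suffix.
# Matching is case-insensitive, mirroring the case-insensitive search.
_GENE_ID = re.compile(r"^ENSG\d{11}(\.\d+)?$", re.IGNORECASE)
_TRANSCRIPT_ID = re.compile(r"^ENST\d{11}(\.\d+)?$", re.IGNORECASE)


def classify_query(text):
    """Label a search string as a gene ID, a transcript ID, or a gene symbol."""
    query = text.strip()
    if _GENE_ID.match(query):
        return "gene_id"
    if _TRANSCRIPT_ID.match(query):
        return "transcript_id"
    # Anything else is treated as a gene symbol such as TP53 or BRCA1.
    return "gene_symbol"


for query in ["ENSG00000012048", "enst00000309586", "TP53"]:
    print(query, "->", classify_query(query))
```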

Visualizing TP53 gene expression

You can view the data as either a histogram showing frequency distribution or a box plot displaying quartiles and outliers. To switch between these views or adjust display statistics, click ⛭ Chart Settings.
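To make the two views concrete, the following sketch computes what each one summarizes from the same values. It is not the Cohort Browser's code. The bin width, the inclusive quartile method, and the 1.5 × IQR outlier rule are assumptions chosen for illustration, and the expression values are invented.

```python
import math
import statistics
from collections import Counter


def histogram(values, bin_width):
    """Count values per bin; bin k covers [k * bin_width, (k + 1) * bin_width)."""
    counts = Counter(math.floor(v / bin_width) for v in values)
    return {k * bin_width: counts[k] for k in sorted(counts)}


def box_plot_summary(values, whisker=1.5):
    """Quartiles plus outliers beyond `whisker` * IQR from the box (assumed rule)."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - whisker * iqr, q3 + whisker * iqr
    outliers = [v for v in values if v < low or v > high]
    return {"q1": q1, "median": median, "q3": q3, "outliers": outliers}


expression = [1.0, 2.0, 2.5, 3.0, 3.5, 4.0, 12.0]  # invented values

print(histogram(expression, bin_width=2.0))
# {0.0: 1, 2.0: 4, 4.0: 1, 12.0: 1}
print(box_plot_summary(expression))
# {'q1': 2.25, 'median': 3.0, 'q3': 3.75, 'outliers': [12.0]}
```

The histogram shows where values pile up, while the box plot summary makes the single high value stand out as an outlier.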

When comparing cohorts, the chart shows data from each cohort on the same axes for direct comparison.

You can also customize your charts by selecting different transcript or gene features, resizing and rearranging them on your dashboard, or adjusting display settings to focus on the most relevant analyses for your research.

Exploring Feature Correlations

The Feature Correlation charts help you understand how the expression levels of two genes or transcripts relate to each other. You can use these charts to identify genes or transcripts that are co-expressed, explore potential pathway interactions, and compare correlation patterns between different cohorts.

The chart displays a scatter plot where each point represents a sample, with the X and Y axes showing expression values for your two selected features. A best fit line shows the overall relationship trend, and you can swap which gene appears on which axis to view the data from different perspectives.
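The text does not say how the best fit line is computed. A common choice, assumed here, is ordinary least squares. The sketch below fits a line to invented expression values and shows that swapping the axes produces a different line, which is why viewing the data both ways can be informative.

```python
def best_fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Invented per-sample expression values for two features.
feature_a = [1.0, 2.0, 3.0, 4.0]
feature_b = [3.0, 5.0, 7.0, 9.0]

print(best_fit_line(feature_a, feature_b))  # (2.0, 1.0): b = 2.0 * a + 1.0
print(best_fit_line(feature_b, feature_a))  # (0.5, -0.5): axes swapped
```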

Exploring feature correlations between ERBB2 and TP53

The correlation analysis includes statistical measures to help you determine if the relationship you're seeing is meaningful. The Pearson correlation coefficient shows both the strength and direction of the linear relationship (ranging from -1 to +1), while the p-value indicates how likely a correlation at least this strong would be if the two features were unrelated. A small p-value suggests the correlation is statistically significant.
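As a worked example, the Pearson coefficient can be computed by hand from paired expression values. The p-value calculation used by the Cohort Browser is not described here. A standard approach, assumed in this sketch, converts r into a t statistic with n - 2 degrees of freedom and reads the p-value from Student's t distribution (that last lookup is omitted because it needs a statistics library). The sample values are invented.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for testing r against zero, with n - 2 degrees of freedom."""
    if abs(r) >= 1.0:
        return math.copysign(math.inf, r)  # perfect correlation
    return r * math.sqrt((n - 2) / (1.0 - r * r))


# Invented per-sample expression values for two features.
feature_x = [1.0, 2.0, 3.0, 4.0, 5.0]
feature_y = [2.0, 1.0, 4.0, 3.0, 5.0]

r = pearson_r(feature_x, feature_y)
t = t_statistic(r, len(feature_x))
print(f"r = {r:.2f}, t = {t:.3f}")  # r = 0.80, t = 2.309
```

With only five samples, even r = 0.80 gives a modest t statistic, which illustrates why the coefficient and the p-value should be read together.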

You can toggle these statistics on or off as needed. The chart updates automatically when you change your feature selections or switch between viewing single cohorts versus comparing multiple cohorts. This quantitative analysis helps you assess whether observed correlations are both statistically sound and biologically relevant to your research.

Examining Detailed Gene Expression Information

The Expression Per Feature table provides gene metadata and expression statistics for all features in your dataset. Use the search bar to find specific genes by symbol or explore genes within genomic ranges.

The table displays one row per feature ID with the following columns:

  • Feature ID: The unique transcript or gene identifier, such as an ENST ID for a transcript or an ENSG ID for a gene

  • Gene Symbol: The official gene name or symbol associated with the feature ID, such as TP53

  • Location: The genomic coordinates in "chromosome:start-end" format

  • Strand: The DNA strand orientation (+ or -)

  • Expression (Mean): The average expression value for this feature across the current cohort

  • Expression (SD): The standard deviation of expression values

  • Expression (Median): The median expression value
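If you export or copy values from the Location column, the "chromosome:start-end" format is straightforward to parse. The sketch below is a hypothetical helper and not part of any DNAnexus tool. It keeps the chromosome label exactly as written, since the naming style (for example, with or without a "chr" prefix) is not specified here, and the coordinates are placeholders.

```python
from typing import NamedTuple


class Location(NamedTuple):
    chromosome: str
    start: int
    end: int


def parse_location(text):
    """Parse a "chromosome:start-end" string into a Location."""
    chromosome, colon, span = text.strip().rpartition(":")
    start, dash, end = span.partition("-")
    if not (chromosome and colon and dash):
        raise ValueError(f"expected chromosome:start-end, got {text!r}")
    location = Location(chromosome, int(start), int(end))
    if location.start > location.end:
        raise ValueError(f"start is greater than end in {text!r}")
    return location


def overlaps(location, chromosome, start, end):
    """True if the location shares at least one base with the given range."""
    return (
        location.chromosome == chromosome
        and location.start <= end
        and start <= location.end
    )


loc = parse_location("17:1000-2000")  # placeholder coordinates
print(loc)
print(overlaps(loc, "17", 1500, 3000))  # True
print(overlaps(loc, "17", 2001, 3000))  # False
```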

When comparing cohorts, the table shows separate expression statistics for each cohort, allowing direct comparison of expression patterns.
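The three statistics columns can be reproduced for a single feature with Python's standard library, which is useful when checking exported values. This sketch assumes the sample standard deviation. Whether the table reports the sample or population form is not stated here, and the cohort names and values are invented.

```python
import statistics


def summarize(values):
    """Mean, standard deviation, and median for one feature in one cohort."""
    return {
        "mean": statistics.mean(values),
        "sd": statistics.stdev(values),  # sample SD (assumption)
        "median": statistics.median(values),
    }


# Invented expression values for one feature in two cohorts.
cohorts = {
    "Cohort A": [2.0, 4.0, 6.0],
    "Cohort B": [1.0, 1.0, 4.0],
}

for name, values in cohorts.items():
    stats = summarize(values)
    print(
        f"{name}: mean={stats['mean']:.2f} "
        f"sd={stats['sd']:.2f} median={stats['median']:.2f}"
    )
```

In this example Cohort B's mean (2.00) sits well above its median (1.00), the kind of skew that is easy to miss when looking at the mean alone.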

Examining TP53 expression per feature

Each feature includes links to external annotation resources:

  • Ensembl transcript pages: Detailed transcript information and annotations

  • Ensembl gene pages: Comprehensive gene summaries and functional data

These links provide quick access to additional context about genes and transcripts of interest.
