Cohort Browser

NOTE: Not all features are included in all packages. Please contact sales@dnanexus.com for more information.

Overview

DNAnexus Apollo builds on the technological foundation of the core DNAnexus platform to offer scientists and bioinformaticians an environment to store and query large sets of genomic, phenotypic, multi-omic, and other structured data. Researchers can bring their data to the platform and leverage DNAnexus apps to ingest the data into queryable databases.

These databases can then be explored using the Cohort Browser. Scientists can filter the set of samples by any field within the dataset and save the filtered samples as cohort objects. These cohort objects can be shared with other scientists and can also be used as inputs to analysis apps for tasks such as calculating allele frequencies or performing a GWAS.

Bioinformaticians who wish to perform ad hoc statistical analysis are able to spin up Jupyter Lab environments backed by Spark clusters to directly query their data and create dataframes within a Python or R environment for further analysis.

Launching the Cohort Browser

First, find a database object in your project by adding a class filter and selecting the 'Database' option. Next, select a database and click Explore Data to launch the Cohort Browser.

Create Cohorts

Your dataset opens in a default view, which may already present some database fields as charts.

Finding and adding fields of interest

To add a field or chart, click Add Tile. The Add Tile dialog shows a hierarchical view of all the fields present within the dataset. Browse the list or search for an item to find fields of interest. The search matches keywords in both field names and field values.

Selecting a field will display its metadata and charting options. Different field types may contain different types of metadata, different field options and different charting options.

Add Tile

Use the Add as Tile button to add fields of interest to your dashboard. Add as many tiles as you like and then close the window.

Customize the Dashboard

The dashboard section displays your tiles. With tiles you can:

  • rearrange them by dragging a tile to a new position on the screen

  • resize them by dragging from the lower right corner

  • remove them by clicking the "x"

  • display their metadata by opening the "i" section

  • review their filter by clicking the filter icon

Example dashboard of tiles

Filter the Data

Click on the charts to create filters.

  • Bar chart: click on a bar to include it in your filter criteria. For multi-select fields, you can toggle "match any" to "match all" to constrain your filter to samples that have every selected value.

  • Histogram: drag-select to choose a range of values for your filter.

  • List boxes: click items to include them in your filter. Click a parent folder to include all of its child values. Use the search box to find specific items in long lists. Clear the search box to return to the full list view.
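
The filter behaviors above can be sketched in a few lines of Python. This is only an illustration of the logic, not how the Cohort Browser is implemented: the sample records, the field names (`diagnoses`, `age`), and the helper functions are hypothetical, and treating a histogram range as inclusive at both ends is an assumption.

```python
from typing import Iterable, Set

# Hypothetical toy records; field names and values are illustrative only.
samples = [
    {"id": "S1", "diagnoses": {"asthma", "diabetes"}, "age": 34},
    {"id": "S2", "diagnoses": {"asthma"}, "age": 58},
    {"id": "S3", "diagnoses": {"diabetes", "hypertension"}, "age": 47},
]

def match_any(values: Set[str], selected: Iterable[str]) -> bool:
    """True if the sample has at least one of the selected values."""
    return bool(values & set(selected))

def match_all(values: Set[str], selected: Iterable[str]) -> bool:
    """True if the sample has every one of the selected values."""
    return set(selected) <= values

def in_range(value: float, low: float, high: float) -> bool:
    """True if value lies in the range [low, high] (inclusive by assumption)."""
    return low <= value <= high

selected = ["asthma", "diabetes"]
any_ids = [s["id"] for s in samples if match_any(s["diagnoses"], selected)]
all_ids = [s["id"] for s in samples if match_all(s["diagnoses"], selected)]
range_ids = [s["id"] for s in samples if in_range(s["age"], 40, 60)]

print(any_ids)    # ['S1', 'S2', 'S3']
print(all_ids)    # ['S1']
print(range_ids)  # ['S2', 'S3']
```

Switching from "match any" to "match all" can only shrink the result, which is why it is described as constraining the filter.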

Examples of filters on charts

After you create a filter, the cohort count will update. To refresh all the data on the dashboard, click refresh dashboard.

Filters can be removed either through the filter menu on the chart or by removing the filter pill itself.

Filter Pills

Genomic Filters

Click Edit Genomic Filter to view options for filtering the cohort based on variant status:

  • Filter by gene and variant effect: specify a gene of interest, then select the transcribed variant effects to retain in the filter. For example, this type of filter is useful for keeping or excluding likely loss-of-function variants in a particular gene.

  • Filter by variant ID (RSID or allele coordinate): specify the variants of interest directly to retain individuals who carry any of these variants in the cohort. For example, this filter can focus on a known target, or on a list of the top hits from a previous GWAS.
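
As a rough illustration of the variant ID filter, the sketch below keeps every individual who carries at least one of the listed variants. The `carriers` mapping, the participant IDs, and the rsIDs are made-up placeholders, and `filter_by_variants` is a hypothetical helper, not a DNAnexus function.

```python
from typing import Dict, Iterable, List, Set

# Hypothetical mapping from individual ID to the variant IDs they carry.
# The rsIDs are placeholders, not real findings.
carriers: Dict[str, Set[str]] = {
    "P1": {"rs0000001", "rs0000002"},
    "P2": {"rs0000003"},
    "P3": set(),
}

def filter_by_variants(genotypes: Dict[str, Set[str]],
                       variants_of_interest: Iterable[str]) -> List[str]:
    """Return the sorted IDs of individuals carrying ANY listed variant."""
    wanted = set(variants_of_interest)
    return sorted(person for person, variants in genotypes.items()
                  if variants & wanted)

cohort = filter_by_variants(carriers, ["rs0000002", "rs0000003"])
print(cohort)  # ['P1', 'P2']
```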

Genomic Filter Dialog

Saving Cohorts

You can save the current cohort as a record object in the current project by clicking the save button. The cohort object saves the precise set of filters that were used to generate the cohort as well as the layout of your dashboard chart tiles.

Most cohorts can be created in the UI. If a complex combination of ANDs and ORs is desired, Complex Cohorts can be created from the command line.
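
To make "a combination of ANDs and ORs" concrete, here is a minimal Python sketch of nested boolean filter logic. It does not show the DNAnexus command-line syntax or the stored cohort format; the field names, the records, and the helpers `field_equals`, `all_of`, and `any_of` are all hypothetical.

```python
from typing import Any, Callable, Dict, List

Record = Dict[str, Any]
Predicate = Callable[[Record], bool]

def field_equals(field: str, value: Any) -> Predicate:
    """Build a predicate that tests one field for equality."""
    return lambda record: record.get(field) == value

def all_of(*predicates: Predicate) -> Predicate:
    """AND: every predicate must hold."""
    return lambda record: all(p(record) for p in predicates)

def any_of(*predicates: Predicate) -> Predicate:
    """OR: at least one predicate must hold."""
    return lambda record: any(p(record) for p in predicates)

# (sex == "F" AND smoker == True) OR (age_group == "60+")
complex_filter = any_of(
    all_of(field_equals("sex", "F"), field_equals("smoker", True)),
    field_equals("age_group", "60+"),
)

# Hypothetical records used only to exercise the filter.
records: List[Record] = [
    {"id": "A", "sex": "F", "smoker": True, "age_group": "40-59"},
    {"id": "B", "sex": "M", "smoker": True, "age_group": "60+"},
    {"id": "C", "sex": "F", "smoker": False, "age_group": "40-59"},
    {"id": "D", "sex": "M", "smoker": False, "age_group": "20-39"},
]

matching_ids = [r["id"] for r in records if complex_filter(r)]
print(matching_ids)  # ['A', 'B']
```

Because `all_of` and `any_of` return predicates themselves, they can be nested to any depth, which is the kind of structure that is awkward to express by clicking charts alone.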

Dashboard Views

You can create different dashboard views to see different groupings of phenotypic fields. Once you have a set of tiles that you'd like to keep, use the Dashboard Options menu to save the dashboard view. Each view is saved as a file. When you want to open a view, find the file in the Browse Views dialog.

Download Data

To download data from tables, select the items you'd like to download (or click the select-all box to include the entire table) and then click the download button.

Note: Downloading data through the visualization UI is not suitable for large datasets. Use the SQL Runner app to download data more efficiently.

Analyzing Datasets

DNAnexus Apollo currently provides a number of apps that can be used to analyze your cohorts, including apps that calculate allele frequencies and perform GWAS analyses.
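
For readers unfamiliar with the term, the sketch below shows the standard definition of allele frequency at a single diploid site: the number of alternate alleles divided by the total number of called alleles. It is not the implementation used by the DNAnexus app. The genotype encoding (0, 1, or 2 alternate alleles, `None` for a missing call) and the function name are assumptions made for this example.

```python
from typing import List, Optional

def allele_frequency(alt_counts: List[Optional[int]]) -> float:
    """Alternate allele frequency at one diploid site.

    Each entry is the number of alternate alleles a sample carries
    (0, 1, or 2); None marks a missing genotype call and is skipped.
    """
    called = [g for g in alt_counts if g is not None]
    if not called:
        raise ValueError("no called genotypes at this site")
    # Each called diploid sample contributes two alleles to the denominator.
    return sum(called) / (2 * len(called))

# Five called samples carrying 0 + 1 + 2 + 1 + 0 = 4 alternate alleles
# out of 10 alleles in total.
af = allele_frequency([0, 1, 2, 1, None, 0])
print(af)  # 0.4
```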

You can start an analysis from any accessible project or directly from the list of all available apps. The visualization UI also provides a shortcut for running an analysis with a saved cohort as an input: select Start Analysis. The button is disabled until the cohort has been saved.

The analysis selection dialog lists only the available apps that accept a cohort as an input. Select the analysis you would like to run.

The cohort is set as an input automatically. Set all other required inputs on the Analysis Inputs tab, then select an instance type and an output folder on the App Settings tab. You can also change the name and project of your execution.

App Runner Example

You can monitor a running analysis in the Monitor tab of the project from which you launched the job. Once the job finishes, you can view all of its information, including inputs, outputs, and log.

Monitor page for a GWAS job

If a job produces a result database, the job output will contain a link to the visualization UI for further exploration of data from the result database.

Example visualization of app results

Viewing Genomic Content

A dashboard may contain a section that displays the variants in the cohort. Some dashboards include a chart of allele frequency. Use the search box in the chart to zoom in on a specific range. You can search by entering genomic coordinates or a gene name. The search also updates the table contents. The table of variants can be filtered and sorted to find variants of interest. To see variant details (transcripts, annotations, etc.), click the link in the Location column.
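
The coordinate search can be pictured as parsing a region string and keeping the variants that fall inside it. The sketch below assumes a `chrom:start-end` format with an optional `chr` prefix and inclusive bounds; the exact syntax accepted by the search box is not specified here, and the `Region` type, `parse_region`, and the sample variants are hypothetical.

```python
import re
from typing import List, NamedTuple, Tuple

class Region(NamedTuple):
    chrom: str
    start: int
    end: int

# Assumed format: "chr1:1,000-2,000" or "1:1000-2000".
_REGION = re.compile(r"^(?:chr)?(\w+):([\d,]+)-([\d,]+)$")

def parse_region(text: str) -> Region:
    """Parse a 'chrom:start-end' string into a Region."""
    match = _REGION.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognized region: {text!r}")
    chrom = match.group(1)
    start = int(match.group(2).replace(",", ""))
    end = int(match.group(3).replace(",", ""))
    if start > end:
        raise ValueError("region start must not exceed region end")
    return Region(chrom, start, end)

def variants_in_region(variants: List[Tuple[str, int]],
                       region: Region) -> List[Tuple[str, int]]:
    """Keep (chrom, position) pairs inside the region, bounds inclusive."""
    return [(c, p) for c, p in variants
            if c == region.chrom and region.start <= p <= region.end]

# Hypothetical variant positions used only to exercise the filter.
variants = [("1", 950), ("1", 1500), ("1", 2000), ("2", 1500)]
region = parse_region("chr1:1,000-2,000")
hits = variants_in_region(variants, region)
print(region)  # Region(chrom='1', start=1000, end=2000)
print(hits)    # [('1', 1500), ('1', 2000)]
```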