Cohort Browser

Overview

DNAnexus Apollo builds on the technological foundation of the core DNAnexus platform to offer scientists and bioinformaticians an environment to store and query large sets of genomic, phenotypic, multi-omic, and other structured data. Researchers can bring their data to the platform and leverage DNAnexus apps to ingest the data into queryable databases.

These databases can then be explored using the Cohort Browser. Scientists can filter the dataset by any data field and save the filtered samples as cohorts. Cohorts can be shared with other scientists and can also be used as inputs to analysis apps for tasks such as calculating allele frequencies or performing a GWAS.

Bioinformaticians who wish to perform ad hoc statistical analysis are able to spin up JupyterLab environments backed by Spark clusters to directly query their data and create dataframes within a Python or R environment for further analysis.

Not all features mentioned on this page are included in all packages. For more information on feature packages, please contact sales@dnanexus.com.

Access Datasets

Datasets must be prepared and ingested before they can be used in the Cohort Browser. See the Ingesting Data page for more information on how to ingest datasets on the DNAnexus platform.

From the project where a dataset is located, go to the Manage tab and select the dataset of interest. Click the "Explore Data" action to open the dataset in the Cohort Browser.

Select dataset of interest and explore data

You can also access datasets via the Datasets page, which is located under the Projects menu. The Datasets page displays all datasets you have access to, and enables you to browse and find a specific dataset without navigating through projects.

You can use the optional information panel to view further details about a selected dataset, such as its creator and sponsorship.

View detailed information on a selected dataset

Explore Data

The Dashboard area in the Cohort Browser provides insight into the data by visualizing various data fields.

Dashboard visualizations

Add Data Field of Interest

  1. To add a field as a chart, click the Add Tile button. The Add Tile dialog shows a hierarchical view of all the data fields available in the dataset.

  2. Browse the list, or search for an item by its title to narrow down the list.

  3. Select a data field from the list. In the Data Field Details panel, you can see metadata for the selected data field, a visualization preview, and options to customize the chart type.

  4. Confirm selection via the "Add as Tile" action.

Adding a tile to dashboard

The chart types available for a data field depend on its data field type. See the Chart Types page for more information on how each chart type can be built.

Create Multi-Variable Charts

Once you have selected a primary data field in the Add Tile dialog, you can add a secondary data field by clicking the "+" icon that appears next to a data field item.

For certain chart types, such as Stacked Row Chart and Scatter Plot, you can reorder the primary and secondary data fields by dragging a data field in the Data Field Details section.

Adding grouped box plot by combining two data fields
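
To make the idea of combining two data fields more concrete, here is a minimal Python sketch of the kind of summary that sits behind a grouped box plot: a numeric field is grouped by a categorical field, and quartiles are computed per group. The field names (sex, bmi), the sample records, and the helper grouped_box_stats are hypothetical, and this is not a description of how the Cohort Browser computes its charts.

```python
from collections import defaultdict
from statistics import quantiles

def grouped_box_stats(records, group_field, value_field):
    """Return {group: (min, q1, median, q3, max)} for a numeric field."""
    groups = defaultdict(list)
    for rec in records:
        group = rec.get(group_field)
        value = rec.get(value_field)
        if group is None or value is None:
            continue  # skip records missing either field
        groups[group].append(value)

    stats = {}
    for group, values in groups.items():
        if len(values) < 2:
            continue  # quantiles() needs at least two data points
        q1, median, q3 = quantiles(values, n=4, method="inclusive")
        stats[group] = (min(values), q1, median, q3, max(values))
    return stats

# Made-up records: "sex" is the grouping field, "bmi" the numeric field.
records = [
    {"sex": "F", "bmi": 20}, {"sex": "F", "bmi": 22}, {"sex": "F", "bmi": 24},
    {"sex": "F", "bmi": 26}, {"sex": "F", "bmi": 28},
    {"sex": "M", "bmi": 21}, {"sex": "M", "bmi": 25}, {"sex": "M", "bmi": 29},
    {"sex": "M", "bmi": None},  # missing value, ignored
]

stats = grouped_box_stats(records, "sex", "bmi")
for group, summary in sorted(stats.items()):
    print(group, summary)
```

Each tuple corresponds to one box in the grouped chart: whisker ends, box edges, and the median line.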

Cohort Filtering

When you start exploring a dataset, an empty cohort is created automatically in the Cohort Browser. You can narrow down your cohort by adding cohort filters. Cohorts you create can be saved and exported for later use.

Add Cohort Filter

  1. In the cohort you wish to edit, click the "Add Filter" button.

  2. In the "Add Filter" dialog, select the data field you want to filter by, then confirm by clicking "Add as Filter".

  3. Select operators and enter the values to filter by. Click "Apply Filter" to confirm.

  4. Added filters are displayed in the corresponding cohort panel. You can edit a filter at any time by clicking it, which opens the Edit Filter dialog.

Adding Cohort Filter

Once filters are added or edited, the updated cohort size appears under the name of the affected cohort. The Cohort Browser also refreshes automatically to fetch updated results based on the latest cohort selection.

Cohort size updates after filters are edited
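
As a rough illustration of what a cohort filter does, the sketch below models each filter as a (field, operator, value) triple and recomputes the cohort size after filters are applied. The record layout, the field names, the operator names, the choice to combine filters with a logical AND, and the treatment of a filter-free cohort as the whole dataset are all assumptions made for this example; none of them is taken from the Cohort Browser implementation.

```python
import operator

# Hypothetical operator names mapped to Python comparison functions.
OPERATORS = {
    "is": operator.eq,
    "is_not": operator.ne,
    "greater_than": operator.gt,
    "less_than": operator.lt,
    "is_any_of": lambda value, options: value in options,
}

def apply_filters(records, filters):
    """Keep records that satisfy every (field, op, value) filter (assumed AND)."""
    def matches(record):
        for field, op_name, target in filters:
            value = record.get(field)
            # Records without a value for the field never match (assumption).
            if value is None or not OPERATORS[op_name](value, target):
                return False
        return True
    return [r for r in records if matches(r)]

# Made-up participant records; participant_id stands in for the main entity ID.
dataset = [
    {"participant_id": "P1", "age": 34, "sex": "F"},
    {"participant_id": "P2", "age": 61, "sex": "M"},
    {"participant_id": "P3", "age": 58, "sex": "F"},
    {"participant_id": "P4", "age": 72, "sex": "F"},
]

unfiltered = apply_filters(dataset, [])  # no filters: every record is kept
print(len(unfiltered))                   # 4

filtered = apply_filters(dataset, [("sex", "is", "F"), ("age", "greater_than", 50)])
print(len(filtered))                     # 2
```

Editing a filter corresponds to changing one triple and recomputing, which is why the displayed cohort size changes after each edit.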

Add Genomic Filter

  1. In the cohort you wish to edit, click the "Add Filter" button.

  2. In the "Add Filter" dialog, switch to the "Geno" tab.

  3. In the "Edit Genomic Filter" dialog, define the filter using one of the following criteria:

    1. Filter by genes and variant effects: Filter your dataset by variants of certain types and consequences within specified genes and / or genomic ranges. A maximum of 5 genes / ranges can be entered.

    2. Filter by a list of variant IDs. A maximum of 100 variants can be entered.

  4. Confirm your edits by clicking the "Apply Geno Filter" button.

Note: Like other cohort filters, a genomic filter is applied to the main entity of your dataset (in most cases, patients or participants).
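
The input limits listed above (at most 5 genes or ranges, at most 100 variant IDs) can be checked before a filter is submitted. The following sketch is a hypothetical pre-check: the function name, the argument shapes, the placeholder IDs, and the error messages are assumptions, and only the two numeric limits come from this page. Treating the two criteria as mutually exclusive is an interpretation of "one of the following criteria".

```python
MAX_GENES_OR_RANGES = 5   # limit stated for the gene / range criterion
MAX_VARIANT_IDS = 100     # limit stated for the variant ID criterion

def validate_genomic_filter(genes_or_ranges=None, variant_ids=None):
    """Return a list of problems; an empty list means the input looks acceptable."""
    genes_or_ranges = genes_or_ranges or []
    variant_ids = variant_ids or []
    problems = []

    # Interpretation: a filter uses one criterion or the other, not both.
    if genes_or_ranges and variant_ids:
        problems.append("use either genes/ranges or variant IDs, not both")
    if not genes_or_ranges and not variant_ids:
        problems.append("no genes, ranges, or variant IDs given")
    if len(genes_or_ranges) > MAX_GENES_OR_RANGES:
        problems.append(
            f"{len(genes_or_ranges)} genes/ranges exceeds the limit of {MAX_GENES_OR_RANGES}"
        )
    if len(variant_ids) > MAX_VARIANT_IDS:
        problems.append(
            f"{len(variant_ids)} variant IDs exceeds the limit of {MAX_VARIANT_IDS}"
        )
    return problems

# Placeholder inputs; real variant IDs will look different.
ok = validate_genomic_filter(genes_or_ranges=["BRCA1", "BRCA2"])
too_many = validate_genomic_filter(variant_ids=[f"variant_{i}" for i in range(101)])
print(ok)
print(too_many)
```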

Cohort Table

The Cohort Table shows the records in your current cohort selection. You can add or remove data fields as columns via the column customization menu, which is located in the top-right corner of the table.

Customize table columns

Click a table column header to access additional functions, including sorting and searching within that column.

Sorting and searching in a table column

Export Cohort Table

You can export table information either as a list of record IDs or as a CSV file. Export options are available in the top-right corner of the table once you have selected one or more table rows.

Note: The cohort table can display a maximum of 30,000 records. If your cohort is larger than this, the table may not show the full data.
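
Because of the 30,000-record display limit, it can help to check a cohort's size before relying on the table for a complete export. The sketch below is a hypothetical illustration: it flags cohorts larger than the limit and serializes selected rows to CSV text with Python's standard csv module. The row layout and the function names are assumptions and are not part of the platform.

```python
import csv
import io

TABLE_DISPLAY_LIMIT = 30_000  # maximum number of records the cohort table can display

def table_may_be_truncated(cohort_size):
    """True when the cohort is larger than the table can display."""
    return cohort_size > TABLE_DISPLAY_LIMIT

def rows_to_csv(rows, columns):
    """Serialize selected rows (dicts) to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=columns, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()

# Made-up selection of two table rows.
selected_rows = [
    {"participant_id": "P1", "age": 34, "sex": "F"},
    {"participant_id": "P3", "age": 58, "sex": "F"},
]

truncated = table_may_be_truncated(45_000)
csv_text = rows_to_csv(selected_rows, ["participant_id", "age"])
print(truncated)
print(csv_text)
```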

Variant Browser

The Variant Browser shows the variants that are present in the current cohort selection. This section includes a lollipop chart displaying allele frequencies for variants in a specified genomic region.

You can change the genomic region via the search bar in the top-right corner of the variant section. Changing the region updates both the lollipop chart and the table.

Variant Browser and Table

The table below the lollipop chart lists the same variants in tabular format, along with further annotation information including:

  • Type: Whether the variant is a SNP, deletion, insertion, or mixed.

  • Consequences: The impact of the variant according to SnpEff. For variants with multiple gene annotations, this column displays the most severe consequence per gene.

  • Population Allele Frequency: Allele frequency calculated across the entire dataset from which the cohort was created.

  • Cohort Allele Frequency: Allele frequency calculated across the current cohort selection.

  • GnomAD Allele Frequency: Allele frequency of the specified allele from the public gnomAD dataset.

To view further annotation information, go to the detail page of a given variant by clicking the link in the Location column.
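
The difference between the Population Allele Frequency and Cohort Allele Frequency columns can be illustrated with a small calculation. The sketch below assumes diploid genotypes encoded as the number of alternate alleles per sample (0, 1, or 2) and excludes missing calls from both the numerator and the denominator. The sample IDs, the genotypes, and the handling of missing data are assumptions for illustration; the platform's own calculation may differ.

```python
def allele_frequency(genotypes, sample_ids):
    """Alternate allele count divided by the number of called alleles."""
    alt_alleles = 0
    called_alleles = 0
    for sample_id in sample_ids:
        alt_count = genotypes.get(sample_id)
        if alt_count is None:
            continue  # missing call: excluded from numerator and denominator
        alt_alleles += alt_count
        called_alleles += 2  # diploid assumption
    return alt_alleles / called_alleles if called_alleles else None

# Made-up genotypes for one variant: alternate allele count per sample.
genotypes = {"S1": 0, "S2": 1, "S3": 2, "S4": 0, "S5": None, "S6": 1}

all_samples = list(genotypes)          # entire dataset
cohort_samples = ["S2", "S3", "S6"]    # current cohort selection

population_af = allele_frequency(genotypes, all_samples)
cohort_af = allele_frequency(genotypes, cohort_samples)
print(f"population AF: {population_af:.3f}")  # population AF: 0.400
print(f"cohort AF:     {cohort_af:.3f}")      # cohort AF:     0.667
```

The same variant can therefore show different values in the two columns whenever the cohort is a non-representative subset of the dataset.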

Note: Downloading genomic data via the visualization UI is not suitable for large datasets. You can use the SQL Runner app to download data more efficiently.

Export Variant Table

You can export selected variants in the table as a list of variant IDs or as a CSV file. Export options appear in the top-right corner of the table once you have selected one or more items.

Save Cohort

You can save your cohort selection to a project as a Class: Record object by clicking the Save icon in the top-right corner of the cohort panel.

Cohorts will be saved with the filters applied, along with the latest set of visualizations and dashboard layout information. Similar to Dataset objects, Cohort objects can be found under the Manage tab in your selected projects, and can be re-opened via the Explore Data option.

Save cohort action

Cohort object in project folders

Export Cohort

You can export a list of the main entity IDs in your current cohort selection as a CSV file. This action is located next to the Save Cohort action, in the top-right corner of the cohort panel.

Dashboard Views

Dashboard Views contain layout and configuration information that can be reused during cohort browsing. You can save or load a dashboard view via the "Views" option menu located in the top-right corner of the header area.

Dashboard views option menu

Dashboard views are saved as Type: DashboardView objects. Once saved, they also appear in the selected project folder.

Cohort Compare

You can compare two cohorts by adding both cohorts into the Cohort Browser. In cohort compare mode, all visualizations are converted to show data from both cohorts.

The "Compare Cohort" action can be found in the header area next to the cohort title. You can create a new cohort, duplicate the current cohort, or load a previously saved cohort.

In compare mode, you can continue to edit both cohorts and visualize the results dynamically.

Note: Compare mode is supported only for cohorts created from the same dataset. Certain cohort browser sections and chart types are not supported in compare mode.
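
To illustrate what showing data from both cohorts means for a single categorical field, here is a small hypothetical sketch that counts each value in two cohorts side by side and refuses cohorts that come from different datasets. The cohort structure (a dict with dataset and records keys), the field name, and the function name are assumptions for this example only.

```python
from collections import Counter

def compare_field(cohort_a, cohort_b, field):
    """Return {value: (count_in_a, count_in_b)} for one categorical field."""
    # Mirrors the note above: both cohorts must come from the same dataset.
    if cohort_a["dataset"] != cohort_b["dataset"]:
        raise ValueError("cohorts must be created from the same dataset")
    counts_a = Counter(r[field] for r in cohort_a["records"] if field in r)
    counts_b = Counter(r[field] for r in cohort_b["records"] if field in r)
    values = sorted(set(counts_a) | set(counts_b))
    # Counter returns 0 for values that are absent from one cohort.
    return {value: (counts_a[value], counts_b[value]) for value in values}

# Made-up cohorts built from the same hypothetical dataset.
cohort_a = {
    "dataset": "dataset-1",
    "records": [{"smoker": "yes"}, {"smoker": "no"}, {"smoker": "no"}],
}
cohort_b = {
    "dataset": "dataset-1",
    "records": [{"smoker": "yes"}, {"smoker": "yes"}],
}

comparison = compare_field(cohort_a, cohort_b, "smoker")
print(comparison)  # {'no': (2, 0), 'yes': (1, 2)}
```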