Creating a Mixed Phenotypic Assay Dataset

Learn to create an Apollo Dataset that incorporates both phenotypic and assay data.

On the DNAnexus Platform, an Apollo license is required to use the features described on this page. Org approval may also be required. Contact DNAnexus Sales for more information.

Introduction

The following example walks through creating a mixed phenotypic and assay Apollo Dataset. You have an existing longitudinal clinical Dataset, "Initial Trial", and recently received two new sets of data from a lab experiment, "Experiment A". The first set consists of two data files: one with information about each patient’s visits (New data 1) and one with the lab tests from each visit (New data 2). The second set is a gene expression matrix (New data 3), molecular data that measures mRNA expression levels from isolated tissues corresponding to each visit.

As the data administrator, you want to add the two new sets of data to the existing Dataset and update your sample cohort, "Demo", to reference the newly expanded Dataset.

Guide

Step 1: Ingest the Lab Measurements

Because the new lab measurements are complex, containing visit information and a series of measurements for each visit, use the Data Model Loader app to ingest them. Ensure that your “Visit” Entity contains a patient_id, and create a Dataset named "RNAseq study lab measurements". This Dataset now contains two Entities.
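Before running the loader, it can help to confirm that the file backing the “Visit” Entity actually has a patient_id column. The following is a minimal sketch of such a pre-flight check, not part of any DNAnexus tooling. The helper name `missing_columns`, the sample data, and every column name other than patient_id are assumptions made for illustration, and the sketch assumes the file is a CSV with a header row.

```python
import csv
import io


def missing_columns(csv_text, required):
    """Return the required column names that are absent from the CSV header row."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])  # an empty file yields an empty header
    present = {name.strip() for name in header}
    return [name for name in required if name not in present]


# Hypothetical "Visit" data; only patient_id comes from the guide,
# the other column names are illustrative assumptions.
visit_csv = "visit_id,patient_id,visit_date\nV001,P001,2023-01-15\n"

problems = missing_columns(visit_csv, ["patient_id"])
if problems:
    print("Missing required columns:", problems)
else:
    print("Visit file contains all required columns")
```

For a file on disk, read its contents with `open(path, newline="").read()` and pass the text to the same helper.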

Step 2: Ingest the Gene Expression Matrix

Using the Molecular Expression Assay Loader app, ingest the gene expression matrix and create a Dataset named "RNAseq gem". This Dataset contains Molecular Expression assay data, and a single Entity that contains sample ID information.

Step 3: Add the Lab Measurements to the Initial Dataset

Using the Clinical Dataset Merger app, link the "RNAseq study lab measurements" Dataset to the target "Initial Trial" Dataset to create an output Dataset named "Initial trial with RNAseq (pheno only)". To do this, set the "RNAseq study lab measurements" Dataset as the Source Dataset input parameter and the "Initial Trial" Dataset as the Target Dataset input parameter.

This new Dataset now has the original two Entities and the two additional Entities merged in.

Step 4: Add the Gene Expression Matrix to the Expanded Dataset

Using the Assay Dataset Merger app, set the "RNAseq gem" Dataset as the Source Dataset input parameter and the "Initial trial with RNAseq (pheno only)" Dataset as the Target Dataset input parameter to create an output Dataset named "Initial trial with RNAseq". This new Dataset now has the new measurements as Entities and the gene expression matrix linked as an assay.

Step 5: Rebase the Cohort on the New Dataset

Before sharing the Dataset, you still need to update the "Demo" cohort so that it opens with the new Dataset. Using the Rebase Cohorts And Dashboards app, rebase the "Demo" cohort onto the "Initial trial with RNAseq" Dataset by passing it through the Cohorts input parameter with the suffix "RNAseq", which generates a "Demo RNAseq" cohort. The new Dataset, ingested databases, and cohort are now ready to be shared and used for analysis.
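The five steps form a chain in which each merge or rebase consumes the output of an earlier step. The sketch below records that chain as plain data and checks that every input exists before the step that needs it. It does not call the DNAnexus Platform. The `Step` class, the `PLAN` list, and the `check_plan` function are hypothetical names introduced for illustration, the app names are used only as labels, and the raw input files for the two loader apps are omitted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    app: str       # label taken from the guide, not an executable app ID
    inputs: tuple  # Datasets or cohorts that must already exist
    output: str    # Dataset or cohort this step produces


PLAN = [
    Step("Data Model Loader", (), "RNAseq study lab measurements"),
    Step("Molecular Expression Assay Loader", (), "RNAseq gem"),
    Step("Clinical Dataset Merger",
         ("RNAseq study lab measurements", "Initial Trial"),
         "Initial trial with RNAseq (pheno only)"),
    Step("Assay Dataset Merger",
         ("RNAseq gem", "Initial trial with RNAseq (pheno only)"),
         "Initial trial with RNAseq"),
    Step("Rebase Cohorts And Dashboards",
         ("Demo", "Initial trial with RNAseq"),
         "Demo RNAseq"),
]


def check_plan(plan, existing):
    """Walk the plan in order and fail if a step needs something not yet available."""
    available = set(existing)
    for step in plan:
        missing = [name for name in step.inputs if name not in available]
        if missing:
            raise ValueError(f"{step.app}: missing inputs {missing}")
        available.add(step.output)
    return available


# "Initial Trial" and "Demo" exist before the guide begins.
final = check_plan(PLAN, existing={"Initial Trial", "Demo"})
print(sorted(final))
```

Running the check against a reordered plan, for example with the Assay Dataset Merger before the Clinical Dataset Merger, raises a `ValueError` naming the missing intermediate Dataset, which mirrors why the steps are performed in this order.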
