Creating a Mixed Phenotypic Assay Dataset
Learn to create an Apollo Dataset incorporating both phenotypic and assay data.
The following example walks through creating a mixed phenotypic and assay Apollo Dataset. You have an existing longitudinal clinical Dataset, "Initial Trial", and recently received two new sets of data from a lab experiment, "Experiment A". The first set consists of two data files: one with information about each patient’s visits (New data 1) and one with the lab tests from each visit (New data 2). The second set is a gene expression matrix (New data 3) containing the molecular data: mRNA expression levels measured in tissue isolated at the corresponding visit.
As the data administrator, you want to add the two new sets of data to the existing Dataset and update your sample cohort, "Demo", to reference the newly expanded Dataset.
Because the new lab measurements are complex, with information about visits and a series of measurements for each visit, use the Data Model Loader app to ingest the data and create a Dataset named "RNAseq study lab measurements". Make sure that your "Visit" Entity contains a patient_id. This Dataset now contains two Entities.
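Before ingestion, it can help to confirm that the visit file actually carries a patient_id and that every lab measurement points at a known visit. The following is a minimal sketch of such a check in plain Python. It does not call any Apollo or Data Model Loader API. The function name `check_visit_links` and the `visit_id` column name are hypothetical, chosen for illustration only; substitute the key columns your files use.

```python
import csv
import io


def check_visit_links(visit_csv, lab_csv, visit_key="visit_id", patient_key="patient_id"):
    """Return a list of problems found in the two CSV texts (empty if none).

    visit_key and patient_key are assumed column names, not Apollo requirements.
    """
    problems = []

    # The visit file must have both the visit key and the patient key.
    visit_reader = csv.DictReader(io.StringIO(visit_csv))
    visit_fields = visit_reader.fieldnames or []
    for col in (visit_key, patient_key):
        if col not in visit_fields:
            problems.append(f"visit file is missing column '{col}'")
    if problems:
        return problems

    # Collect visit IDs and flag visits with an empty patient reference.
    visit_ids = set()
    for row in visit_reader:
        if not row[patient_key]:
            problems.append(f"visit {row[visit_key]} has no {patient_key}")
        visit_ids.add(row[visit_key])

    # Every lab measurement row must reference a visit that exists.
    lab_reader = csv.DictReader(io.StringIO(lab_csv))
    if visit_key not in (lab_reader.fieldnames or []):
        return problems + [f"lab file is missing column '{visit_key}'"]
    for row in lab_reader:
        if row[visit_key] not in visit_ids:
            problems.append(f"lab row references unknown visit {row[visit_key]}")
    return problems
```

An empty result means the two files are consistent with each other under these assumptions; it says nothing about whether the Data Model Loader app will accept them.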
Using the Molecular Expression Assay Loader app, ingest the gene expression matrix and create a Dataset named "RNAseq gem". This Dataset contains Molecular Expression assay data, and a single Entity that contains sample ID information.
Using the Clinical Dataset Merger app, now link the "RNAseq study lab measurements" Dataset to the target "Initial Trial" Dataset to create an output Dataset named "Initial trial with RNAseq (pheno only)". Do this by adding the "RNAseq study lab measurements" Dataset as the Source Dataset input parameter and the "Initial Trial" Dataset as the Target Dataset input parameter.
This new Dataset now has the original two Entities and the two additional Entities merged in.
In the Assay Dataset Merger app, set the "RNAseq gem" Dataset as the Source Dataset input parameter and the "Initial trial with RNAseq (pheno only)" Dataset as the Target Dataset input parameter to create an output Dataset named "Initial trial with RNAseq". This new Dataset now has the new measurements as Entities and the gene expression matrix linked as an assay.
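The order of the ingestion and merge steps matters, because each merger consumes a Dataset produced by an earlier step. The sketch below records the four steps as plain data and checks that every input exists before the step that needs it. It is a planning aid only and does not launch any app. The `Step` class and `validate_plan` function are hypothetical names; the app and Dataset names are the ones used in this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    app: str        # app name as used in this walkthrough
    inputs: tuple   # files or Datasets the step consumes
    output: str     # Dataset the step creates


PLAN = [
    Step("Data Model Loader", ("New data 1", "New data 2"),
         "RNAseq study lab measurements"),
    Step("Molecular Expression Assay Loader", ("New data 3",),
         "RNAseq gem"),
    Step("Clinical Dataset Merger",
         ("RNAseq study lab measurements", "Initial Trial"),
         "Initial trial with RNAseq (pheno only)"),
    Step("Assay Dataset Merger",
         ("RNAseq gem", "Initial trial with RNAseq (pheno only)"),
         "Initial trial with RNAseq"),
]


def validate_plan(plan, existing):
    """Walk the plan in order; raise ValueError if a step's input is not yet available."""
    available = set(existing)
    for step in plan:
        missing = [name for name in step.inputs if name not in available]
        if missing:
            raise ValueError(f"{step.app} needs {missing} before it can run")
        available.add(step.output)
    return available
```

Calling `validate_plan(PLAN, {"Initial Trial", "New data 1", "New data 2", "New data 3"})` returns the set of everything available at the end, including "Initial trial with RNAseq". Running the steps in a different order, for example the Assay Dataset Merger first, raises an error naming the missing input.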
Before sharing the Dataset, you still need to update the "Demo" cohort so that it opens with the new Dataset. Using the Rebase Cohorts And Dashboards app, rebase the "Demo" cohort onto the "Initial trial with RNAseq" Dataset by passing the cohort in the Cohorts input parameter, and set the suffix to "RNAseq" to generate a "Demo RNAseq" cohort. The new Dataset, ingested databases, and cohort are now ready to be shared and used for analysis.