Apollo Datasets
Learn about Apollo Datasets, how they're constructed, and how to use them.
An Apollo Dataset is a DNAnexus record of type dataset. It contains both data and metadata, and describes how logical data structures (such as phenotypes and genotypes) relate to the physical layout of the underlying databases.
You can use a Dataset to combine different data types—such as phenotypic, clinical, genomic, or transcriptomic data—across multiple databases into a single, linked, and documented object. Each Dataset record also stores provenance information about how it was created.
This structure makes it easier to:
Reuse data in the same or different experiments
Model and share multi-omic datasets
Scale to large projects (such as TCGA or UKB)
Maintain data linkages and annotations for reproducibility
Build tools using a predictable, well-documented framework
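To make the idea of relating logical data structures to a physical layout more concrete, here is a minimal Python sketch. It is not the Apollo Dataset schema or any DNAnexus API: the class names (ToyDataset, PhysicalColumn), the "entity.field" naming scheme, and the database, table, and column names are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class PhysicalColumn:
    """Hypothetical pointer to where a value physically lives."""
    database: str
    table: str
    column: str


@dataclass
class ToyDataset:
    """Toy stand-in for a dataset-like object (NOT the real Apollo schema)."""
    name: str
    # Logical "entity.field" name -> physical location in some database
    mappings: Dict[str, PhysicalColumn] = field(default_factory=dict)
    # Free-form notes on how the object was created
    provenance: Dict[str, str] = field(default_factory=dict)

    def add_field(self, entity: str, field_name: str, location: PhysicalColumn) -> None:
        """Register a logical field and the physical column behind it."""
        self.mappings[f"{entity}.{field_name}"] = location

    def resolve(self, logical_name: str) -> PhysicalColumn:
        """Look up the physical column for a logical field name."""
        return self.mappings[logical_name]

    def databases(self) -> List[str]:
        """Return the sorted, de-duplicated databases this object spans."""
        return sorted({loc.database for loc in self.mappings.values()})


# Illustrative usage with made-up names: one object linking two databases.
toy = ToyDataset(
    name="example_cohort",
    provenance={"created_by": "hypothetical ingestion job"},
)
toy.add_field("participant", "age", PhysicalColumn("pheno_db", "participant", "age"))
toy.add_field("genotype", "alt_allele", PhysicalColumn("geno_db", "genotype", "alt"))
```

In this sketch, a tool only needs the logical name (for example, "participant.age") and asks the object where the data lives, which is the kind of indirection that lets different data types in separate databases be treated as a single linked object.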
Example uses of a Dataset include: