Table Exporter
Learn to use the Table Exporter app to extract data from an Apollo Dataset, cohort, or dashboard into a delimited file for analysis or download.
An Apollo license is required to use the Table Exporter app. You may also need org approval. Contact DNAnexus Sales for more information.

Overview

Overview of all file inputs for the Table Exporter app.

Using the Table Exporter App

Launching the App

To launch the app, enter this command via the command line:
dx run table-exporter

Inputs

The Table Exporter app requires one input:
  • Dataset, Cohort, or Dashboard - the dataset, cohort, or dashboard that you want to extract from. This input must be version 3.0.
Optional inputs are:
  • Output File Name - a custom name for the file generated. The Output File Format will determine the file extension.
  • Output File Format - CSV is the default and generates a comma (",") separated file. TSV is also available and generates a tab ("\t") separated file.
  • Coding Option - "Replace" is the default and replaces field values with their coding value. If set to "Raw", the raw value is exported. If set to "Exclude", all coded values are excluded.
  • Entity - the name of the entity you would like to extract if you do not want the cohort table from the input Dataset, Cohort or Dashboard.
  • Field Titles - the field titles to export as a comma "," separated string. If this field is blank and the entity is specified, all fields on the entity are exported. The entity input must be specified if fields are provided.
  • See the app documentation for more granular configuration options.
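
The rules in the list above can be checked before launching a job. The following is a minimal sketch in Python, not part of the app itself: the function name `check_options` and its parameter names are hypothetical and chosen only for illustration. It encodes only what the list states: the two output formats and their delimiters, the three Coding Option values, and the requirement that an Entity be given whenever Field Titles are given.

```python
# Hypothetical helper; names are illustrative and are not the app's real input names.
DELIMITERS = {"CSV": ",", "TSV": "\t"}          # Output File Format -> delimiter
CODING_OPTIONS = {"Replace", "Raw", "Exclude"}  # allowed Coding Option values


def check_options(output_format="CSV", coding_option="Replace",
                  entity=None, field_titles=None):
    """Validate an option combination and return the delimiter it implies."""
    if output_format not in DELIMITERS:
        raise ValueError(f"Unknown output format: {output_format!r}")
    if coding_option not in CODING_OPTIONS:
        raise ValueError(f"Unknown coding option: {coding_option!r}")
    # Field Titles are only meaningful together with an Entity.
    if field_titles and not entity:
        raise ValueError("Entity must be specified when Field Titles are provided")
    return DELIMITERS[output_format]
```

With no arguments the sketch returns the defaults described above (CSV, "Replace"), so `check_options()` yields a comma.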

Process

  1. If an Entity is specified, the Entity and Field Titles are used to generate the exported file.
  2. If an Entity is not specified, then:
    1. If the input is a Dashboard or a Cohort, the columns specified in the cohort table are used to generate the exported file, 1 file per entity added to the dashboard.
    2. If the input is a Dataset and it has a default dashboard, the columns specified in the cohort table of the default dashboard are used to generate the exported file, 1 file per entity added to the dashboard.
    3. If the input is a Dataset without a default dashboard, the main entity and all of its fields are used to generate the export file.
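
The selection rules above can be read as a small decision tree. The Python sketch below mirrors that tree for illustration only; `plan_export`, its parameters, and the returned labels are hypothetical names, and the app's actual implementation is not shown in this document.

```python
# Hypothetical sketch of the selection rules; not the app's real code.
def plan_export(input_kind, entity=None, has_default_dashboard=False):
    """Return a short label describing what would be exported.

    input_kind is one of "Dataset", "Cohort", or "Dashboard".
    """
    if entity:
        # Rule 1: an explicit Entity (plus any Field Titles) drives the export.
        return "entity and field titles"
    if input_kind in ("Dashboard", "Cohort"):
        # Rule 2.1: use the cohort table columns, one file per entity.
        return "cohort table columns"
    if input_kind == "Dataset":
        if has_default_dashboard:
            # Rule 2.2: use the default dashboard's cohort table columns.
            return "default dashboard cohort table columns"
        # Rule 2.3: fall back to the main entity and all of its fields.
        return "main entity, all fields"
    raise ValueError(f"Unknown input kind: {input_kind!r}")
```

For example, `plan_export("Dataset")` falls through to the last rule and returns "main entity, all fields".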

Outputs

  • CSV/TSV file - the delimited file generated.
  • Logs - available under Project: .table-exporter/<job-id>-clusterlogs.tar.gz.
    • Spark cluster logs - for advanced troubleshooting.
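
Once downloaded, the delimited file can be read with any standard CSV parser. The sketch below uses Python's standard `csv` module and picks the delimiter from the Output File Format. The function name `read_export`, the column names, and the sample rows are made up for illustration and do not come from a real export.

```python
import csv
import io


def read_export(handle, output_format="CSV"):
    """Parse an exported file into a list of dicts keyed by column header."""
    delimiter = "\t" if output_format == "TSV" else ","
    return list(csv.DictReader(handle, delimiter=delimiter))


# Made-up sample standing in for a downloaded TSV export.
sample = io.StringIO("patient_id\tage\nP1\t54\nP2\t61\n")
rows = read_export(sample, "TSV")
print(len(rows), rows[0]["age"])
```

For a real file, pass an open file handle instead of the in-memory sample, for example `open(path, newline="")`.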

Best Practices

  1. For extremely large entities (thousands of columns with hundreds of thousands of rows), using "Replace" codings will significantly increase runtime and cost. In those cases, export without coding replacement.
  2. If you are exporting from a dataset that has databases in a controlled project where DB UI View Only permission is set, the application must be run in the project with the restricted database to execute successfully.