# Using Dataset Extender

{% hint style="info" %}
An Apollo license is required to use Dataset Extender on the DNAnexus Platform. Org approval may also be required. [Contact DNAnexus Sales](mailto:sales@dnanexus.com) for more information.
{% endhint %}

## Adding Derived Phenotypes to an Existing Entity

1. Identify the dataset you want to extend. If you are using the command line, retrieve its record ID.
2. To add data to an existing entity, ensure the following conditions are met:
   1. The data is related to the entity in a one-to-one relationship.
   2. The data includes the unique keys for the entity you are extending, preferably in the first column.
   3. Your column names do not overlap with any column names in the entity you are extending. The key column is the exception and can overlap.
3. Save the data as a file in your project. DNAnexus recommends comma-delimited data, but tab-delimited data is also supported if you set the **Source Data Delimiter** input.
4. Run the [Dataset Extender application](https://platform.dnanexus.com/app/dataset-extender) with the following inputs:
   1. **Source Data** - Set this to your data file.
   2. **Target Dataset** - The dataset you want to extend.
   3. **Target Entity Name** - Specify this only if you are extending an entity other than the main entity.
   4. **Source Data Delimiter** - Select `"\t"` if you are using a TSV file. The default is `","`.
   5. When running through dx-toolkit, use a command of the following form:

      `dx run dataset-extender -isource_data=<file path> -itarget_dataset=<record id>`
   6. For additional configuration guidance, refer to the [Dataset Extender](/developer/ingesting-data/dataset-extender.md) page.
5. This process generates:
   1. A new dataset with the original data plus your new data
   2. A new database if the original database cannot be written to
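
The conditions in step 2 can be checked locally before you upload the file. The following is a minimal sketch, not part of Dataset Extender: the function name `check_existing_entity_source`, the sample column names, and the idea of passing the entity's column names as a list are all assumptions made for illustration. It uses only the Python standard library.

```python
import csv
import io


def check_existing_entity_source(text, key_column, entity_columns, delimiter=","):
    """Return a list of problems found in delimited source text (empty if none).

    Hypothetical helper: `entity_columns` is the list of column names already
    present in the entity you are extending, supplied by the caller.
    """
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    header = reader.fieldnames or []
    if key_column not in header:
        return [f"key column {key_column!r} is missing"]

    problems = []

    # Non-key column names must not collide with the entity's existing columns.
    overlap = sorted((set(header) - {key_column}) & set(entity_columns))
    if overlap:
        problems.append(f"columns already in entity: {overlap}")

    # One-to-one: each key may appear on at most one row.
    seen, duplicates = set(), set()
    for row in reader:
        key = row[key_column]
        if key in seen:
            duplicates.add(key)
        seen.add(key)
    if duplicates:
        problems.append(f"duplicate keys: {sorted(duplicates)}")
    return problems


# Made-up sample data: a new `bmi_category` column keyed by `eid`.
sample = "eid,bmi_category\n1,normal\n2,high\n"
print(check_existing_entity_source(sample, "eid", ["eid", "age"]))  # prints []
```

For tab-delimited data, pass `delimiter="\t"`. An empty result only means these two conditions hold for the file; it does not guarantee that the application accepts it.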

## Adding a New, Related Entity to a Dataset

1. Identify the dataset you want to extend. If you are using the command line, retrieve its record ID.
2. To add data as a new entity, ensure the following conditions are met:
   1. The data is related to the entity you are extending in a one-to-one or many-to-one relationship.
   2. The data has a column whose values correspond to the keys for the entity you are extending, preferably the first column.
3. Save the data as a file in your project. DNAnexus recommends comma-delimited data, but tab-delimited data is also supported if you set the **Source Data Delimiter** input.
4. Run the [Dataset Extender application](https://platform.dnanexus.com/app/dataset-extender) with the following inputs:
   1. **Source Data** - Set this to your data file.
   2. **Target Dataset** - The dataset you want to extend.
   3. **Build New Entity** - Set this to `true`.
   4. **New Entity Name** - The name of the new entity you are creating. This cannot overlap with any other entity title in the **Target Dataset**.
   5. **Target Entity Name** - Specify this only if you are extending an entity other than the main entity.
   6. **Source Data Delimiter** - Select `"\t"` if you are using a TSV file. The default is `","`.
   7. When running through dx-toolkit, use a command of the following form:

      `dx run dataset-extender -isource_data=<file path> -itarget_dataset=<record id> -ibuild_new_entity=true -inew_entity_name=<entity name>`

      Add `-itarget_entity_name=<entity title the data relates to>` only when you are extending an entity other than the main entity.
   8. For additional configuration guidance, refer to the [Dataset Extender](/developer/ingesting-data/dataset-extender.md) page.
5. This process generates:
   1. A new dataset with the original data plus your new data
   2. A new database if the original database cannot be written to
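
Because a new entity can relate many-to-one to the entity it extends, repeated values in the linking column are expected. What can be checked locally is whether each value corresponds to a known key. The sketch below assumes you already have the target entity's keys as a list of strings. The function name `find_unmatched_links` and the sample data are hypothetical, and this page does not state how the application treats unmatched values.

```python
import csv
import io


def find_unmatched_links(text, link_column, entity_keys, delimiter=","):
    """Return sorted link-column values that match no key in the target entity.

    Hypothetical helper: values are compared as strings, exactly as they
    appear in the file.
    """
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    if link_column not in (reader.fieldnames or []):
        raise ValueError(f"link column {link_column!r} is missing")
    known = set(entity_keys)
    # Repeated values are fine here: many rows may point at the same key.
    return sorted({row[link_column] for row in reader} - known)


# Made-up sample data: two visits for key "1" and one for an unknown key "3".
visits = "eid,visit_date\n1,2020-01-05\n1,2021-03-09\n3,2020-07-11\n"
print(find_unmatched_links(visits, "eid", ["1", "2"]))  # prints ['3']
```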

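If you script these runs, the `dx run` patterns above can be assembled programmatically. This is a sketch only: `build_dx_run_args` is a hypothetical helper, the file path and record ID are placeholders, and it uses only the input names that appear in the commands on this page.

```python
import shlex


def build_dx_run_args(source_data, target_dataset,
                      new_entity_name=None, target_entity_name=None):
    """Assemble the argument list for `dx run dataset-extender` (hypothetical helper)."""
    args = [
        "dx", "run", "dataset-extender",
        f"-isource_data={source_data}",
        f"-itarget_dataset={target_dataset}",
    ]
    # Supplying a new entity name switches to the "build new entity" pattern.
    if new_entity_name is not None:
        args += ["-ibuild_new_entity=true", f"-inew_entity_name={new_entity_name}"]
    # Needed only when extending an entity other than the main entity.
    if target_entity_name is not None:
        args.append(f"-itarget_entity_name={target_entity_name}")
    return args


# Placeholder values; substitute your own file path and record ID.
args = build_dx_run_args("/data/visits.csv", "record-xxxx", new_entity_name="visits")
print(shlex.join(args))
```

To launch the job, the list can be passed to `subprocess.run(args, check=True)`, which requires dx-toolkit to be installed and logged in.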
