# Omics Data Assistant

{% hint style="info" %}
A license is required to use Omics Data Assistant on the DNAnexus Platform. [Contact DNAnexus Sales](mailto:sales@dnanexus.com) for more information.
{% endhint %}

Omics Data Assistant (the assistant) is a GenAI-powered conversational interface that helps you explore and analyze complex biomedical and clinical datasets using natural language. The assistant is integrated directly into [Cohort Browser](https://documentation.dnanexus.com/user/cohort-browser), so you can combine conversational queries with Cohort Browser's visualization tools in a single analysis.

Whether you're new to a dataset or an experienced bioinformatician, the assistant saves you time by understanding your questions in plain English. New users can quickly discover what data their datasets contain without browsing through fields and schemas. Experienced users can define cohorts in seconds by describing criteria in a few sentences, eliminating the need to manually configure multiple filters.

{% hint style="warning" %}
Omics Data Assistant uses generative AI to accelerate your analysis. While powerful, AI models can occasionally produce inaccurate or incomplete results. Always verify generated cohorts and insights against your underlying data. The assistant alone should not be used for clinical diagnosis or treatment decisions.
{% endhint %}

## How It Works

Omics Data Assistant uses the latest Anthropic Claude model for natural language understanding and response generation. This model supports large context windows, allowing the assistant to handle complex queries and maintain context throughout conversations.

The assistant accesses only your Apollo dataset and does not connect to the internet or external data sources. Omics Data Assistant is deployed regionally to meet data residency requirements, keeping your data in your region throughout all operations. Conversations are stored securely and remain private to you.

## Getting Started

### Prerequisites

* Your organization has active Omics Data Assistant and Cohort Browser licenses.
* You have access to an Apollo dataset and its associated databases in a project.

### Opening Omics Data Assistant

Omics Data Assistant works with datasets in Cohort Browser.

1. In the DNAnexus Platform, [open a dataset in Cohort Browser](https://documentation.dnanexus.com/cohort-browser#opening-datasets-using-the-cohort-browser).
2. Open the assistant using one of these methods:
   * Click **✨ Omics Data Assistant** in the lower right corner to query the entire dataset.
   * Click **✨ Ask About Cohort** in the cohort panel to open the assistant with a specific cohort already in context.

![Open Omics Data Assistant in Cohort Browser by clicking either Ask About Cohort in the cohort panel or Omics Data Assistant in the lower right corner.](https://1612471957-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-L_EsL_ie8XyZlLe_yf9%2Fuploads%2Fgit-blob-b36c267da851146c9728f727f4d3f0c90a7ad1b0%2Fcohort-browser-omics-data-assistant-closed.png?alt=media)

By default, the assistant opens in a panel on the right side. You can enlarge the assistant panel by clicking **Enter Full Screen**.

In the assistant's input field, you can explore the data by [**asking questions**](#asking-questions).

Below the input, you can use two controls to get started quickly:

* **Dataset Overview**: Opens an AI-generated overview of the currently open dataset. Use this to learn what the dataset contains without writing a prompt. You can still ask follow-up questions for specific details.
* **Help**: Opens a guide about Omics Data Assistant, its capabilities, and example prompts to try.

![Omics Data Assistant in full screen](https://1612471957-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-L_EsL_ie8XyZlLe_yf9%2Fuploads%2Fgit-blob-fabce855f4ca969bfbd6e18d4a9785e5c357c2da%2Fomics-data-assistant-fullscreen.png?alt=media)

### First-Time Dataset Indexing

The first time Omics Data Assistant is used with a dataset, the dataset must be indexed. This one-time process enables natural language queries by creating vector representations of the dataset structure. Only one person needs to start this indexing process. Subsequent users can query the dataset immediately once indexing is complete.

After indexing starts, it runs in the background. Most datasets complete indexing within 15 minutes. Large datasets like UK Biobank may take over an hour. You cannot ask questions until indexing completes. You can monitor indexing progress through the assistant interface.

{% hint style="info" %}
Index data is stored securely in the same AWS region as your data to maintain data residency requirements.
{% endhint %}
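
To build intuition for what "vector representations of the dataset structure" means, the toy sketch below indexes a few field descriptions as word-count vectors and then picks the field closest to a question by cosine similarity. Everything in it is an assumption for illustration: the field names, the descriptions, and the word-count vectors (a stand-in for learned embeddings). It does not describe how Omics Data Assistant is actually implemented.

```python
import math
from collections import Counter

# Hypothetical field names and descriptions; a real dataset schema will differ.
FIELDS = {
    "patient.age_at_enrollment": "age of the patient at enrollment in years",
    "lab.hemoglobin": "hemoglobin concentration measured in blood lab test",
    "diagnosis.icd10_code": "diagnosis code recorded for the patient",
}


def to_vector(text):
    """Turn text into a word-count vector (a stand-in for a learned embedding)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# "Indexing": compute one vector per field, once, so later questions are fast.
INDEX = {name: to_vector(description) for name, description in FIELDS.items()}


def best_field(question):
    """Return the indexed field whose description is most similar to the question."""
    question_vector = to_vector(question)
    return max(INDEX, key=lambda name: cosine(question_vector, INDEX[name]))


print(best_field("patients with low hemoglobin in blood"))  # prints lab.hemoglobin
```

The point of the sketch is the split between a one-time indexing step and cheap lookups afterward, which mirrors why only one person needs to start indexing for a dataset.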

## Using Omics Data Assistant

### Asking Questions

Omics Data Assistant accepts questions in plain English and translates them into structured database queries.

Example prompts:

* "Find all patients diagnosed with IBD within 6 months of a diabetes diagnosis"
* "Get patients with lower hemoglobin values than the laboratory's recommended value"
* "Create cohort of all patients with exon loss variants in KIAA1109"

![Asking a question in Omics Data Assistant](https://1612471957-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-L_EsL_ie8XyZlLe_yf9%2Fuploads%2Fgit-blob-14780e52e834d26da4cb17b64747be61469e0bde%2Fomics-data-assistant-exon-loss-patients.png?alt=media)

### Understanding Responses

Each response includes the assistant's thinking process, which shows how it interpreted your question and which SQL queries it generated. Use it to verify that the assistant understood your question correctly, to review the SQL queries for accuracy, and to learn how natural language translates to database queries.
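
As a rough illustration of what reviewing a generated query can involve, the sketch below runs a hand-written SQL query for a prompt like "Find all patients diagnosed with IBD within 6 months of a diabetes diagnosis" against a tiny in-memory SQLite table. The `diagnoses` table, its columns, and the sample rows are hypothetical, and the query is not actual assistant output. It also assumes "within 6 months" means at most 183 days in either direction, which is exactly the kind of interpretation worth checking in the assistant's thinking process.

```python
import sqlite3

# Hypothetical schema and rows for illustration only; real Apollo datasets differ.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE diagnoses (patient_id TEXT, diagnosis TEXT, diagnosis_date TEXT)"
)
conn.executemany(
    "INSERT INTO diagnoses VALUES (?, ?, ?)",
    [
        ("P1", "diabetes", "2020-01-10"),
        ("P1", "IBD", "2020-04-02"),  # 83 days after the diabetes diagnosis: match
        ("P2", "diabetes", "2019-03-01"),
        ("P2", "IBD", "2021-03-01"),  # two years apart: no match
        ("P3", "IBD", "2020-06-15"),  # no diabetes diagnosis: no match
    ],
)

# Assumption: "within 6 months" is read as <= 183 days before or after.
QUERY = """
SELECT DISTINCT ibd.patient_id
FROM diagnoses AS ibd
JOIN diagnoses AS dm ON dm.patient_id = ibd.patient_id
WHERE ibd.diagnosis = 'IBD'
  AND dm.diagnosis = 'diabetes'
  AND ABS(julianday(ibd.diagnosis_date) - julianday(dm.diagnosis_date)) <= 183
ORDER BY ibd.patient_id
"""

matching_patients = [row[0] for row in conn.execute(QUERY)]
print(matching_patients)  # prints ['P1']
```

When reviewing real queries, the same checks apply: confirm that the join keys, the condition filters, and the time window match what you meant to ask.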

If your question is unclear, the assistant asks clarifying questions to ensure accurate results.

For each response, you can:

* **Copy responses:** Click **Copy** at the end of any response to copy the markdown-formatted text to your clipboard for use in documents or notes.
* **Provide feedback:** Click the thumbs up or thumbs down buttons on responses to help improve the assistant's accuracy. When giving negative feedback, you can describe the problem in your own words. Your feedback helps the DNAnexus team enhance the assistant for all users.

### Creating Cohorts

The assistant excels at creating cohorts and generating demographic summaries. For complex visualizations, create your cohorts through the assistant, then use Cohort Browser's native [visualization tools](https://documentation.dnanexus.com/user/cohort-browser/creating-visualizations) for detailed analysis.

To create a cohort directly from Omics Data Assistant, phrase your question to specify patient criteria. For example, "Create a cohort of patients with amplifications in HER2".

When the assistant returns the cohort results, click **+ Add to Dashboard** to filter the dataset by the new cohort in Cohort Browser. You can add multiple cohorts from the assistant to your dashboard and [switch between cohorts](https://documentation.dnanexus.com/cohort-browser/defining-cohorts#managing-cohorts).

![Creating cohorts using Omics Data Assistant](https://1612471957-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-L_EsL_ie8XyZlLe_yf9%2Fuploads%2Fgit-blob-d7a1fad6fdc14918d64927a01a41d74b1e25d8bc%2Fomics-data-assistant-create-cohort.png?alt=media)

From Cohort Browser, you can save cohorts to your project as [CohortBrowser records](https://documentation.dnanexus.com/developer/api/introduction-to-data-object-classes/records).

### Analyzing Existing Cohorts

With **✨ Ask About Cohort**, you can open Omics Data Assistant with a specific cohort already loaded as context. You can start asking questions about a population immediately, without re-describing its criteria in the assistant. This works with both saved `CohortBrowser` records and temporary cohorts active on your current dashboard.

1. In Cohort Browser, locate the cohort you want to analyze in the cohort panel.
2. Click **✨ Ask About Cohort**.

The assistant opens with `Context: <Cohort Name>` shown above the chat input, confirming the assistant is focused on that population.

![Omics Data Assistant with cohort context active](https://1612471957-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-L_EsL_ie8XyZlLe_yf9%2Fuploads%2Fgit-blob-cba1c1c3515b8a51d3c81142e3ec7ccb146833e8%2Fomics-data-assistant-context-set.png?alt=media)

{% hint style="info" %}
The assistant holds one cohort in context at a time. Clicking **✨ Ask About Cohort** on a different cohort replaces the current context with the new selection.
{% endhint %}

## Managing Conversations

Your conversation history is stored separately for each dataset. Conversations are private to you, and other users cannot access them.

* To rename or delete a conversation, click the three dots next to its name.
* To view and search through your past conversations, click **See All** at the bottom of the panel.

![Managing conversations in Omics Data Assistant](https://1612471957-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-L_EsL_ie8XyZlLe_yf9%2Fuploads%2Fgit-blob-d1d5fdd9ab8b043f648783a0cd6115f269447330%2Fomics-data-assistant-manage-conversations.png?alt=media)
