Omics Data Assistant
Explore and analyze datasets using natural language queries with Omics Data Assistant, a GenAI-powered interface integrated into Cohort Browser.
A license is required to use Omics Data Assistant on the DNAnexus Platform. Contact DNAnexus Sales for more information.
Use Omics Data Assistant (the assistant) to explore and analyze complex biomedical and clinical datasets using natural language. Because the assistant is integrated directly into Cohort Browser, you can combine conversational queries with Cohort Browser's visualization tools for comprehensive data analysis.
Whether you're new to a dataset or an experienced bioinformatician, the assistant saves you time by understanding your questions in plain English. New users can discover what data their datasets contain without browsing through fields and schemas. Experienced users can define cohorts in seconds by describing criteria in a few sentences, removing the need to manually configure multiple filters.
Omics Data Assistant uses generative AI to accelerate your analysis. While powerful, AI models can occasionally produce inaccurate or incomplete results. Always verify generated cohorts and insights against your underlying data. The assistant alone should not be used for clinical diagnosis or treatment decisions.
How It Works
Omics Data Assistant uses the latest Anthropic Claude model for natural language understanding and response generation. This model supports large context windows, allowing the assistant to handle complex queries and maintain context throughout conversations.
The assistant accesses only your Apollo dataset and does not connect to the internet or external data sources. Omics Data Assistant is deployed regionally to meet data residency requirements, keeping your data in your region throughout all operations. Conversations are stored securely and remain private to you.
Getting Started
Prerequisites
Your organization has active Omics Data Assistant and Cohort Browser licenses.
You have access to an Apollo dataset and its associated databases in a project.
Opening Omics Data Assistant
Omics Data Assistant works with datasets in Cohort Browser.
In the DNAnexus Platform, open a dataset in Cohort Browser.
Open the assistant using one of these methods:
Click ✨ Omics Data Assistant in the lower right corner to query the entire dataset.
Click ✨ Ask About Cohort in the cohort panel to open the assistant with a specific cohort already in context.

By default, the assistant opens in a panel on the right side. You can enlarge the assistant panel by clicking Enter Full Screen.
In the assistant's input field, you can explore the data by asking questions.
Below the input, you can use two controls to get started:
Dataset Overview: Opens an AI-generated overview of the dataset that is currently open. Use this to learn what the dataset contains without writing a prompt. You can still ask follow-up questions for specific details.
Help: Opens a guide about Omics Data Assistant, its capabilities, and example prompts to try.

First-Time Dataset Indexing
The first time Omics Data Assistant is used with a dataset, the dataset must be indexed. This one-time process enables natural language queries by creating vector representations of the dataset structure. Only one person needs to start this indexing process. Subsequent users can query the dataset immediately once indexing is complete.
After indexing starts, it runs in the background. Most datasets complete indexing within 15 minutes. Large datasets like UK Biobank may take over an hour. You cannot ask questions until indexing completes, but you can monitor its progress through the assistant interface.
Index data is stored securely in the same AWS region as your data to meet data residency requirements.
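To give a rough intuition for what "vector representations of the dataset structure" can mean, here is a minimal sketch. It is not the assistant's actual indexing method, which is not described here. The field names and descriptions are hypothetical, and simple word counts stand in for the learned embeddings a production system would typically use.

```python
import math
import re
from collections import Counter

# Hypothetical field names and descriptions; a real dataset defines its own.
FIELDS = {
    "patient.age_at_enrollment": "age of the patient at enrollment in years",
    "lab.hemoglobin": "hemoglobin concentration measured in blood",
    "diagnosis.icd10_code": "diagnosis code recorded for the patient",
}

def embed(text: str) -> Counter:
    """Toy vector: lowercase word counts (an assumption, not a real embedding)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# "Indexing": compute one vector per field, once, ahead of any question.
INDEX = {name: embed(description) for name, description in FIELDS.items()}

def best_field(question: str) -> str:
    """Return the indexed field whose vector is closest to the question."""
    query = embed(question)
    return max(INDEX, key=lambda name: cosine(query, INDEX[name]))

print(best_field("What is the hemoglobin level in blood?"))
```

The point of the sketch is the split between a one-time indexing step and fast lookups afterward, which is why only the first user of a dataset has to wait.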
Using Omics Data Assistant
Asking Questions
Omics Data Assistant accepts questions in plain English and translates them into structured database queries.
Example prompts:
"Find all patients diagnosed with IBD within 6 months of a diabetes diagnosis"
"Get patients with lower hemoglobin values than the laboratory's recommended value"
"Create a cohort of all patients with exon loss variants in KIAA1109"

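As an illustration of how a plain-English prompt can map to a structured query, the sketch below runs one possible SQL reading of the first example prompt against an in-memory SQLite database. The table, columns, and rows are hypothetical and do not reflect the Apollo schema, and the SQL the assistant generates for a real dataset will differ.

```python
import sqlite3

# Hypothetical schema and rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE diagnosis (patient_id TEXT, diagnosis_name TEXT, diagnosed_on TEXT)"
)
conn.executemany(
    "INSERT INTO diagnosis VALUES (?, ?, ?)",
    [
        ("P1", "diabetes", "2021-01-10"),
        ("P1", "IBD", "2021-04-02"),       # 82 days apart: matches
        ("P2", "diabetes", "2020-03-01"),
        ("P2", "IBD", "2022-03-01"),       # two years apart: no match
        ("P3", "IBD", "2021-06-15"),       # no diabetes diagnosis: no match
    ],
)

# Assumption: "within 6 months" means at most 183 days in either direction.
QUERY = """
SELECT DISTINCT ibd.patient_id
FROM diagnosis AS ibd
JOIN diagnosis AS dm ON dm.patient_id = ibd.patient_id
WHERE ibd.diagnosis_name = 'IBD'
  AND dm.diagnosis_name = 'diabetes'
  AND ABS(julianday(ibd.diagnosed_on) - julianday(dm.diagnosed_on)) <= 183
ORDER BY ibd.patient_id
"""

matching_patients = [row[0] for row in conn.execute(QUERY)]
print(matching_patients)  # prints ['P1']
```

Note how many decisions the query encodes that the prompt leaves open, such as the exact length of "6 months" and whether the order of the two diagnoses matters. Stating these in the prompt reduces the chance of a misinterpretation.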
Understanding Responses
Each response includes the assistant's thinking process, which shows how it interpreted your question and the SQL queries it generated. You can verify the assistant understood your question correctly, review the SQL queries for accuracy, and learn how natural language translates to database queries.
If your question is unclear, the assistant asks clarifying questions to ensure accurate results.
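One way to act on this review step is to compare the participant IDs in an assistant-generated cohort with IDs you obtained independently, for example from filters you configured manually. The sketch below assumes you have exported both sets of IDs yourself; the function name and the sample IDs are hypothetical and not part of the product.

```python
def compare_cohorts(assistant_ids, reference_ids):
    """Report how an assistant-generated cohort differs from a reference cohort."""
    assistant, reference = set(assistant_ids), set(reference_ids)
    return {
        "missing": sorted(reference - assistant),     # expected but not returned
        "unexpected": sorted(assistant - reference),  # returned but not expected
        "matches": assistant == reference,
    }

# Made-up IDs for illustration.
report = compare_cohorts(["P1", "P2", "P4"], ["P1", "P2", "P3"])
print(report)
```

A non-empty "missing" or "unexpected" list is a cue to reread the generated SQL and check how the assistant interpreted each criterion.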
For each response, you can:
Copy responses: Click Copy at the end of any response to copy the markdown-formatted text to your clipboard for use in documents or notes.
Provide feedback: Click the thumbs up or thumbs down buttons on responses to help improve the assistant's accuracy. When giving negative feedback, you can describe the problem in your own words. Your feedback helps the DNAnexus team enhance the assistant for all users.
Charts and Visualizations
When visual output better communicates the results of a query, the assistant generates a chart alongside or instead of tabular data. Chart types are chosen automatically to match your data and question.
Charts are interactive. Hover over any bar, point, or data element to see exact values. The assistant starts with a sensible default chart, and you can ask for additional changes in natural language.
You can request a wide range of chart customizations. For example, you can ask the assistant to:
Swap axes, use a log scale, set axis ranges, or sort values.
Change colors, use conditional colors, switch mark types, or adjust marker size and opacity.
Update chart titles, axis labels, value formats, tooltips, and legend placement.
Filter a subset, change aggregation, or adjust binning for histograms.
Example prompts:
"Make this a horizontal bar chart and sort from highest to lowest."
"Use a log scale on the Y-axis and start it at 1."
"Color bars by diagnosis category using shades of blue."
"Set values below 0 to red and values above 0 to green."
"Rename the title to 'Variant burden by gene' and show percentages on the Y-axis."
"Only show data for 2023 and group results by month."
"Show this distribution as a histogram with 10-unit bins."
Some visualization requests are not supported. If a request does not work, rephrase it with simpler instructions.
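To make a binning request such as "10-unit bins" concrete, the following sketch shows one common way fixed-width histogram bins are computed. It illustrates the concept only and is not the assistant's charting code; the function name and the sample values are made up.

```python
import math
from collections import Counter

def histogram(values, bin_width):
    """Count values per bin; each bin covers [start, start + bin_width)."""
    if bin_width <= 0:
        raise ValueError("bin_width must be positive")
    # Map each value to the lower edge of the bin that contains it.
    counts = Counter(math.floor(v / bin_width) * bin_width for v in values)
    return dict(sorted(counts.items()))

ages = [23, 27, 31, 38, 39, 40, 55]  # made-up values
print(histogram(ages, 10))  # prints {20: 2, 30: 3, 40: 1, 50: 1}
```

A smaller bin width shows more detail but produces noisier counts, so it can help to try a few widths when asking the assistant to adjust a histogram.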
Creating Cohorts
The assistant excels at creating cohorts and generating demographic summaries. For complex visualizations, create your cohorts through the assistant, then use Cohort Browser's native visualization tools for detailed analysis.
To create a cohort directly from Omics Data Assistant, phrase your question to specify patient criteria. For example, "Create a cohort of patients with amplifications in HER2".
When the assistant returns the cohort results, click + Add to Dashboard to filter the dataset by the new cohort in Cohort Browser. You can add multiple cohorts from the assistant to your dashboard and switch between cohorts.

From Cohort Browser, you can save cohorts to your project as CohortBrowser records.
Analyzing Existing Cohorts
With ✨ Ask About Cohort, you can open Omics Data Assistant with a specific cohort already loaded as context. You can start asking questions about a population immediately, without re-describing its criteria in the assistant. This works with both saved CohortBrowser records and temporary cohorts active on your current dashboard.
In Cohort Browser, locate the cohort you want to analyze in the cohort panel.
Click ✨ Ask About Cohort.
The assistant opens with Context: <Cohort Name> shown above the chat input, confirming the assistant is focused on that population.

The assistant holds one cohort in context at a time. Clicking ✨ Ask About Cohort on a different cohort replaces the current context with the new selection.
Managing Conversations
Your conversation history is stored separately for each dataset and remains private to you. Other users cannot access it.
To manage your conversations, click the three dots next to a conversation name to either rename or delete the conversation.
To view and search through your past conversations, click See All at the bottom of the panel.
