Omics Data Assistant

Explore and analyze datasets using natural language queries with Omics Data Assistant, a GenAI-powered interface integrated into Cohort Browser.

A license is required to use Omics Data Assistant on the DNAnexus Platform. Contact DNAnexus Sales for more information.

Omics Data Assistant (the assistant) is a GenAI-powered conversational interface that helps you explore and analyze complex biomedical and clinical datasets using natural language. Because the assistant is integrated directly into Cohort Browser, you can combine conversational queries with Cohort Browser's visualization tools for comprehensive data analysis.

Whether you're new to a dataset or an experienced bioinformatician, the assistant saves you time by understanding your questions in plain English. New users can quickly discover what data their datasets contain without browsing through fields and schemas. Experienced users can define cohorts in seconds by describing criteria in a few sentences, eliminating the need to manually configure multiple filters.

How It Works

Omics Data Assistant uses the latest Anthropic Claude model for natural language understanding and response generation. This model supports large context windows, allowing the assistant to handle complex queries and maintain context throughout conversations.

The assistant accesses only your Apollo dataset and does not connect to the internet or external data sources. Omics Data Assistant is deployed regionally to meet data residency requirements, keeping your data in your region throughout all operations. Conversations are stored securely and remain private to you.

Getting Started

Prerequisites

  • Your organization has active Omics Data Assistant and Cohort Browser licenses.

  • You have access to an Apollo dataset and its associated databases in a project.

Opening Omics Data Assistant

Omics Data Assistant works with datasets in Cohort Browser.

  1. In the DNAnexus Platform, open a dataset in Cohort Browser.

  2. Click ✨ Omics Data Assistant in the lower right corner.

Look for Omics Data Assistant in lower right corner in Cohort Browser

By default, the assistant opens in a panel on the right side. You can enlarge the assistant panel by clicking Enter Full Screen.

In the assistant's input field, you can explore the data by asking questions.

Below the input, you can use two controls to get started quickly:

  • Dataset Overview: Opens an AI-generated overview of the opened dataset. Use this to learn what the dataset contains without writing a prompt. You can still ask follow-up questions for specific details.

  • Help: Opens a guide about Omics Data Assistant, its capabilities, and example prompts to try.

Omics Data Assistant in full screen

First-Time Dataset Indexing

The first time Omics Data Assistant is used with a dataset, the dataset must be indexed. This one-time process enables natural language queries by creating vector representations of the dataset structure. Only one person needs to start this indexing process. Subsequent users can query the dataset immediately once indexing is complete.

After indexing starts, it runs in the background. Most datasets complete indexing within 15 minutes. Large datasets like UK Biobank may take over an hour. You cannot ask questions until indexing completes, but you can monitor its progress through the assistant interface.

Index data is stored securely in the same AWS region as your data to meet data residency requirements.
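To make the idea of "vector representations of the dataset structure" more concrete, here is a minimal Python sketch. It is not the assistant's actual indexing method, which is not described here. The field names, descriptions, and the simple word-count embedding are all hypothetical stand-ins chosen for illustration. The sketch shows only the general pattern: compute one vector per field ahead of time, then match a question to the closest field.

```python
import math
import re
from collections import Counter

# Hypothetical field names and descriptions; a real dataset schema will differ.
FIELDS = {
    "patient.age_at_enrollment": "age of the patient at enrollment in years",
    "lab.hemoglobin_value": "hemoglobin concentration measured in a blood lab test",
    "diagnosis.icd10_code": "diagnosis code recorded for the patient",
}


def embed(text):
    """Turn text into a sparse bag-of-words vector (word -> count)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[word] * b[word] for word in a if word in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# "Indexing": build one vector per field once, before any question is asked.
INDEX = {name: embed(name + " " + desc) for name, desc in FIELDS.items()}


def best_field(question):
    """Return the indexed field whose vector is closest to the question."""
    q = embed(question)
    return max(INDEX, key=lambda name: cosine(q, INDEX[name]))


print(best_field("patients with low hemoglobin"))
```

A production system would typically use learned embeddings rather than word counts, but the one-time cost has the same shape: the vectors are computed once per dataset, and every later question reuses them.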

Using Omics Data Assistant

Asking Questions

Omics Data Assistant takes questions written in plain English and translates them into structured database queries.

Example prompts:

  • "Find all patients diagnosed with IBD within 6 months of a diabetes diagnosis"

  • "Get patients with lower hemoglobin values than the laboratory's recommended value"

  • "Create cohort of all patients with exon loss variants in KIAA1109"

Asking a question in Omics Data Assistant

Understanding Responses

Each response includes the assistant's thinking process, which shows how it interpreted your question and the SQL queries it generated. You can verify the assistant understood your question correctly, review the SQL queries for accuracy, and learn how natural language translates to database queries.
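As an illustration of what reviewing a generated query can involve, the following Python sketch runs one possible SQL reading of the example prompt "Find all patients diagnosed with IBD within 6 months of a diabetes diagnosis" against an in-memory SQLite table. The table, column names, and sample rows are hypothetical, and the SQL is not output from the assistant. Real Apollo datasets use their own entities and fields, and the queries the assistant generates may look quite different.

```python
import sqlite3

# Hypothetical schema: one row per diagnosis event.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE diagnosis (patient_id TEXT, condition_name TEXT, diagnosed_on TEXT)"
)
conn.executemany(
    "INSERT INTO diagnosis VALUES (?, ?, ?)",
    [
        ("P1", "diabetes", "2020-01-10"),
        ("P1", "IBD", "2020-05-01"),  # 112 days after diabetes: match
        ("P2", "diabetes", "2019-03-01"),
        ("P2", "IBD", "2021-03-01"),  # two years later: no match
        ("P3", "IBD", "2020-02-02"),  # no diabetes diagnosis: no match
    ],
)

# Assumption: "within 6 months" is read as at most 183 days apart, in either
# direction. A reviewer should check that this matches the intended meaning.
QUERY = """
SELECT DISTINCT ibd.patient_id
FROM diagnosis AS ibd
JOIN diagnosis AS dm
  ON dm.patient_id = ibd.patient_id
WHERE ibd.condition_name = 'IBD'
  AND dm.condition_name = 'diabetes'
  AND ABS(julianday(ibd.diagnosed_on) - julianday(dm.diagnosed_on)) <= 183
ORDER BY ibd.patient_id
"""

matching_patients = [row[0] for row in conn.execute(QUERY)]
print(matching_patients)  # prints ['P1']
```

The date condition is the part most worth checking: "within 6 months" could mean before, after, or either side of the diabetes diagnosis, and the SQL makes the chosen interpretation explicit.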

If your question is unclear, the assistant asks clarifying questions to ensure accurate results.

For each response, you can:

  • Copy responses: Click Copy at the end of any response to copy the markdown-formatted text to your clipboard for use in documents or notes.

  • Provide feedback: Click the thumbs up or thumbs down buttons on responses to help improve the assistant's accuracy. When giving negative feedback, you can describe the problem in your own words. Your feedback helps the DNAnexus team enhance the assistant for all users.

Creating Cohorts

The assistant excels at creating cohorts and generating demographic summaries. For complex visualizations, create your cohorts through the assistant, then use Cohort Browser's native visualization tools for detailed analysis.

To create a cohort directly from Omics Data Assistant, phrase your question to specify patient criteria. For example, "Create a cohort of patients with amplifications in HER2".

When the assistant returns the cohort results, click + Add to Dashboard to filter the dataset by the new cohort in Cohort Browser. You can add multiple cohorts from the assistant to your dashboard and switch between cohorts.

Creating cohorts using Omics Data Assistant

From Cohort Browser, you can save cohorts to your project as CohortBrowser records.

Managing Conversations

Your conversation history is stored separately for each dataset and remains private to you. Other users cannot access it.

  • To rename or delete a conversation, click the three dots next to its name.

  • To view and search through your past conversations, click See All at the bottom of the panel.

Managing conversations in Omics Data Assistant
