Omics Data Catalog
Learn about the Omics Data Catalog API for metadata management, search, and synchronization of structured research data.
A license is required to use the Omics Data Catalog on the DNAnexus Platform. Contact DNAnexus Sales for more information.
The Omics Data Catalog API provides programmatic access to metadata management, search, and synchronization capabilities for structured research data. These APIs complement the standard DNAnexus Platform APIs with specialized endpoints for Omics Data Catalog operations.
Use /system/findDataCatalogs to discover data catalogs available to you.
Data Type Reference
Omics Data Catalog supports specific data types that control which values are accepted, how values are stored and sent through the API, and how they can be used for filtering and searching. Understanding these data types is essential when working with schema definitions, upserting records, and constructing search filters.
For conceptual information about the data catalog data types, see Supported Data Types.
String
  Constraints: Maximum length 255 characters; any characters.
  Examples: "tissue", "mus musculus (mouse)", "RiboFree Total RNA Library Kit"
LongString
  Constraints: Maximum length 10,000 characters; any characters. Values longer than 255 characters are truncated when requesting data.
  Examples: Long protocol descriptions, detailed study summaries
ID
  Constraints: Minimum length 1, maximum length 40 characters; any characters.
  Examples: "Iv3-78", "sample-id-234536", "A1B2C3D4E5F6G7H8I9J0K1L"
Integer
  Constraints: Minimum -9223372036854775808, maximum 9223372036854775807. Must be passed as a string to ensure precision.
  Examples: "2", "4712384", "-840000090000"
Decimal
  Constraints: Maximum precision of 20 significant digits. Must be passed as a string to ensure precision.
  Examples: "-7198.8", "0.0000000012", "5.34E-2", "-0.1e4"
Date
  Constraints: Must be a valid date in YYYY-MM-DD format with values between "0001-01-01" and "9999-12-31".
  Examples: "2024-01-01", "1999-12-12"
Null Value Handling
Most fields can accept null values when making API calls to /dataCatalog-xxxx/upsertRecords. Pass null as a JSON null value (not the string "null").
Fields that cannot accept null values:
System-generated metadata fields, such as created_at, file_name, size
Primary ID fields as defined in primaryIdField for the entity
Required fields where isRequiredInIngestion is true
Field-specific null behavior:
For fields with allowedValues defined: When isRequiredInIngestion is not defined or false, null is implicitly allowed even though it's not included in the allowedValues array.
For optional fields without allowedValues: null is accepted and can be used to clear a previously set value.
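As an illustration of the rules above, the sketch below decides whether null may be sent for a field and shows how a Python `None` serializes to a JSON null. The helper name `null_allowed` and the sample field IDs are hypothetical, and treating `isSystemManaged` as the marker for system-generated fields is an assumption.

```python
import json


def null_allowed(field: dict, primary_id_field: str) -> bool:
    """Apply the null-handling rules to one schema field description."""
    if field.get("isSystemManaged"):        # assumption: marks system-generated fields
        return False
    if field["id"] == primary_id_field:     # primary ID can never be null
        return False
    if field.get("isRequiredInIngestion"):  # required fields can never be null
        return False
    return True


# None becomes the JSON value null, not the string "null".
record = {"/sample/sample_id": "S-001", "/sample/tissue_type": None}
record_json = json.dumps(record)
print(record_json)
```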
Omics Data Catalog API Method Specifications
API Method: /dataCatalog-xxxx/describe
Specification
Gets descriptive information about a specific data catalog.
This method uses the standard DNAnexus Platform API base URL (for example, https://api.dnanexus.com) rather than the data catalog-specific URL.
Inputs
fields mapping (optional) Restricts the output of this method to the keys provided in this field. If not provided, all fields are returned by default.
  key: Desired output field. See the Outputs section below for valid values.
  value boolean: The value true.
Outputs
All fields are included by default when fields is not provided. To return only some of the following fields, name them in fields:
id string ID of the data catalog.
billTo string ID of the organization to which any costs associated with this data catalog are billed.
region string The region this data catalog is in, such as aws:us-east-1.
name string The name of the data catalog (typically organization name + region).
url string Server URL for the data catalog endpoint.
members array of strings IDs of organizations that have been invited to access this data catalog. Only visible to administrators of the billTo organization.
Errors
ResourceNotFound
The specified data catalog does not exist.
InvalidInput
Input is not a hash, or fields, if present, is not a hash or has a non-boolean key.
Additional standard API errors may be returned.
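A minimal sketch of preparing a describe call that returns only selected fields. It assumes the usual DNAnexus Platform convention of a POST request with a JSON body and an `Authorization: Bearer` token header; `build_describe_request` is a hypothetical helper, and the request is only built here, not sent.

```python
import json
import urllib.request

API_BASE = "https://api.dnanexus.com"  # standard Platform API base URL


def build_describe_request(catalog_id, token, fields=None):
    """Build (but do not send) a describe request for one data catalog."""
    body = {}
    if fields:
        # Each requested output field maps to the value true.
        body["fields"] = {name: True for name in fields}
    return urllib.request.Request(
        url=f"{API_BASE}/{catalog_id}/describe",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",  # assumption: bearer-token auth
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_describe_request("dataCatalog-xxxx", "YOUR_TOKEN", ["name", "url"])
# To send it: urllib.request.urlopen(request) and parse the JSON response.
```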
API Method: /dataCatalog-xxxx/invite
Specification
Invites a DNAnexus organization to the data catalog. The invited organization gains access to view, search, update, and delete metadata according to project permissions. If the organization already has access to the data catalog, no change is made.
This method uses the standard DNAnexus Platform API base URL (for example, https://api.dnanexus.com) rather than the data catalog-specific URL.
Inputs
invitee string (required) The organization to receive access to the data catalog. Must be an org ID.
Outputs
id string (nullable) Invite ID, or null if the invite did not need to be created. This happens when the invitee already has access to the data catalog.
state string State of the invite. Always "accepted" because invitations take effect immediately.
Errors
ResourceNotFound
invitee is not a valid organization ID or is not an existing DNAnexus org.
PermissionDenied
The requesting user must be an administrator of the billTo organization and must use a full scope token.
Additional standard API errors may be returned.
API Method: /dataCatalog-xxxx/leave
Specification
Removes an organization's access to the specified data catalog. The billTo organization cannot leave its own data catalog.
This method uses the standard DNAnexus Platform API base URL (for example, https://api.dnanexus.com) rather than the data catalog-specific URL.
Inputs
organization string (required) Organization ID. Removes the organization from the data catalog, revoking all access the organization has to the data catalog.
Outputs
id string ID of the data catalog from which the organization was removed.
Errors
InvalidInput
The billTo organization may not leave its own data catalog.
ResourceNotFound
The specified data catalog does not exist.
PermissionDenied
A full scope token is required.
The requesting user must be an administrator of the organization being removed or an administrator of the billTo organization.
Additional standard API errors may be returned.
API Method: /dataCatalog-xxxx/downloadLoaderTemplates
Specification
Returns zipped CSV files that can be used as templates when creating input for metadata ingestion with the Data Catalog Loader app. The content of these CSV files depends on the schema defined for the data catalog.
This API method uses the data catalog URL returned by the /system/findDataCatalogs API method as the base URL.
Inputs
None.
Outputs
Returns a ZIP file containing CSV template files. Each CSV file in the ZIP corresponds to an entity in the schema and contains column headers for all fields of that entity, providing a template for data ingestion.
The HTTP response includes the following HTTP headers:
Content-Type: application/zip
Content-Disposition: attachment; filename="<file name>.zip"
Errors
This method may return standard API errors.
API Method: /dataCatalog-xxxx/downloadRecords
Specification
Downloads metadata records from the data catalog as a CSV file based on search criteria. The requesting user must have at least VIEW access to the projects containing the metadata. For public entities (where isPublicInDataCatalog is true in the schema), all records are returned regardless of the requesting user's project access. Projects with the downloadRestricted flag are excluded from the results.
This API method uses the data catalog URL returned by the /system/findDataCatalogs API method as the base URL.
Limitations
The CSV contains up to 250 records.
LongString values longer than 255 characters are truncated.
Inputs
entity string (required) The ID of the entity to search.
filters mapping (optional) Specifies the filter criteria that matching records must satisfy. Keys are field IDs obtained from the schema; values are the filter conditions. A condition can be provided in the following ways:
  A string to match a field value exactly, for example, {"/sample/tissue_type": "blood"}.
  An OR condition requiring the field to match any of the provided values, for example, {"/sample/status": {"$or": ["active", "processed"]}}.
  A partial match condition for String, LongString, and ID data types, for example, {"/participant/name": {"$partialMatch": "john"}}.
  An OR partial match condition requiring partial match against any provided value, for example, {"/analysis/tool": {"$orPartialMatch": ["bwa", "bowtie"]}}.
  An AND partial match condition requiring partial match against all provided values, for example, {"/sample/description": {"$andPartialMatch": ["tumor", "primary"]}}.
  Range conditions for Integer, Decimal, Date, and DateTime data types, for example:
    Inclusive range with both bounds: {"/participant/age": {"$from": "18", "$to": "65"}}.
    Exclusive range with both bounds: {"/data/file_size": {"$fromExclusive": "1000", "$toExclusive": "10000"}}.
    Range with only lower bound: {"/participant/age": {"$from": "18"}}.
    Range with mixed bounds: {"/data/file_size": {"$fromExclusive": "1000", "$to": "10000"}}.
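The filter conditions above are plain JSON mappings, so small helper functions can keep the operator names in one place. This is a sketch with hypothetical helper names (`any_of`, `partial`, `in_range`); the field IDs are the illustrative ones used above, and bounds are converted to strings as in those examples.

```python
def any_of(values):
    """OR condition: the field must equal one of the values."""
    return {"$or": list(values)}


def partial(text):
    """Partial match for String, LongString, and ID fields."""
    return {"$partialMatch": text}


def in_range(lower=None, upper=None, lower_inclusive=True, upper_inclusive=True):
    """Range condition; either bound may be omitted, but not both."""
    condition = {}
    if lower is not None:
        condition["$from" if lower_inclusive else "$fromExclusive"] = str(lower)
    if upper is not None:
        condition["$to" if upper_inclusive else "$toExclusive"] = str(upper)
    if not condition:
        raise ValueError("at least one bound is required")
    return condition


request_body = {
    "entity": "/sample",
    "filters": {
        "/sample/tissue_type": "blood",                       # exact match
        "/sample/status": any_of(["active", "processed"]),
        "/participant/age": in_range(18, 65),
        "/data/file_size": in_range(1000, 10000, lower_inclusive=False),
    },
}
```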
Outputs
Returns a CSV file containing the search results with entity fields as columns.
The HTTP response includes the following HTTP headers:
Content-Type: text/csv
Content-Disposition: attachment; filename="<file name>.csv"
Errors
InvalidInput
The filters parameter contains invalid field IDs or malformed filter conditions.
ResourceNotFound
The entity does not exist in the schema.
Additional standard API errors may be returned.
API Method: /dataCatalog-xxxx/downloadSchema
Specification
Exports the schema following the Data Model Loader's Data Dictionary file format.
This API method uses the data catalog URL returned by the /system/findDataCatalogs API method as the base URL.
Inputs
None.
Outputs
Returns a CSV file containing the schema definition in Data Dictionary format. The CSV file contains the following columns: entity, name, type, display_name, is_system_managed, required_in_ingestion, referenced_entity, description.
The HTTP response includes the following HTTP headers:
Content-Type: text/csv
Content-Disposition: attachment; filename="<file name>.csv"
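Once the CSV has been downloaded, it can be read with a standard CSV parser. The sketch below groups field names by entity and checks for the documented columns. The function name `fields_by_entity` is hypothetical and the sample rows are made up for illustration; actual cell values may be formatted differently.

```python
import csv
import io

EXPECTED_COLUMNS = [
    "entity", "name", "type", "display_name", "is_system_managed",
    "required_in_ingestion", "referenced_entity", "description",
]


def fields_by_entity(csv_text: str) -> dict:
    """Group field names by entity from a Data Dictionary CSV."""
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = [c for c in EXPECTED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    grouped = {}
    for row in reader:
        grouped.setdefault(row["entity"], []).append(row["name"])
    return grouped


# Made-up sample content, not real output of the API.
sample_csv = (
    "entity,name,type,display_name,is_system_managed,"
    "required_in_ingestion,referenced_entity,description\n"
    "sample,sample_id,ID,Sample ID,false,true,,Primary identifier\n"
    "sample,tissue_type,String,Tissue Type,false,false,,Tissue of origin\n"
)
print(fields_by_entity(sample_csv))
```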
Errors
This method may return standard API errors.
API Method: /dataCatalog-xxxx/findRecords
Specification
Searches for records in the data catalog that match specified criteria. The requesting user must have at least VIEW access to the projects containing the metadata. For public entities (where isPublicInDataCatalog is true in the schema), all records are returned regardless of the requesting user's project access.
This API method uses the data catalog URL returned by the /system/findDataCatalogs API method as the base URL.
Limitations
LongString fields longer than 255 characters are truncated in results.
Inputs
entity string (required) The ID of the entity to search.
filters mapping (optional) Specifies the filter criteria that matching records must satisfy. Keys are field IDs obtained from the schema; values are the filter conditions. A condition can be provided in the following ways:
  A string to match a field value exactly, for example, {"/sample/tissue_type": "blood"}.
  An OR condition requiring the field to match any of the provided values, for example, {"/sample/status": {"$or": ["active", "processed"]}}.
  A partial match condition for String, LongString, and ID data types, for example, {"/participant/name": {"$partialMatch": "john"}}.
  An OR partial match condition requiring partial match against any provided value, for example, {"/analysis/tool": {"$orPartialMatch": ["bwa", "bowtie"]}}.
  An AND partial match condition requiring partial match against all provided values, for example, {"/sample/description": {"$andPartialMatch": ["tumor", "primary"]}}.
  Range conditions for Integer, Decimal, Date, and DateTime data types, for example:
    Inclusive range with both bounds: {"/participant/age": {"$from": "18", "$to": "65"}}.
    Exclusive range with both bounds: {"/data/file_size": {"$fromExclusive": "1000", "$toExclusive": "10000"}}.
    Range with only lower bound: {"/participant/age": {"$from": "18"}}.
    Range with mixed bounds: {"/data/file_size": {"$fromExclusive": "1000", "$to": "10000"}}.
limit integer (optional) Maximum number of results to return per page. Defaults to 50. Must be between 1 and 200.
starting string (optional) Pagination token to retrieve subsequent results. Use the value of next from the response of a prior call.
Outputs
results array of mappings The matching records, each containing:
  internal_id string The globally unique internal ID of the record.
  describe mapping The found record with field IDs as keys and field values as values. The values can be strings, arrays of strings, or null, representing the schema field data types.
resultSchema array of mappings Description of fields returned in the describe field, ordered for display:
  field string The field ID.
  aggregation string The aggregation type applied to field values. Supports "list" (array of values with consistent sorting across fields).
totalResults integer (nullable) Total number of results matching the input parameters (may exceed returned results). Returns null when the count cannot be determined.
next string (nullable) Pagination token for the next set of results, or null if no more results are available.
previous string (nullable) Pagination token for the previous set of results, or null if no prior results exist.
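The next and starting tokens can be chained to walk through all pages. The generator below is a sketch of that loop. It takes a caller-supplied function `call` (hypothetical) that sends one request body to /dataCatalog-xxxx/findRecords and returns the parsed JSON response, so the HTTP details stay outside the example.

```python
def iter_records(call, entity, filters=None, limit=50):
    """Yield every matching record, following `next` tokens page by page."""
    if not 1 <= limit <= 200:
        raise ValueError("limit must be between 1 and 200")
    starting = None
    while True:
        body = {"entity": entity, "limit": limit}
        if filters:
            body["filters"] = filters
        if starting is not None:
            body["starting"] = starting   # token from the previous page
        page = call(body)
        yield from page["results"]
        starting = page.get("next")
        if starting is None:              # null means no more results
            return
```

Because pagination tokens can expire, a long-running consumer should be prepared for an InvalidInput error and restart from the first page.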
Errors
InvalidInput
The filters parameter contains invalid field IDs or malformed filter conditions.
The limit parameter is not between 1 and 200 (inclusive).
The starting parameter is invalid or expired.
ResourceNotFound
The entity does not exist in the schema.
Additional standard API errors may be returned.
API Method: /dataCatalog-xxxx/findRelatedRecords
Specification
Gets records that are related to the specified record. For each entity, up to 100 records are returned. The chain of related entities is constructed based on the schema relationships. For public entities (where isPublicInDataCatalog is true in the schema), all records are returned regardless of the requesting user's project access.
This API method uses the data catalog URL returned by the /system/findDataCatalogs API method as the base URL.
Inputs
entity string (required) The ID of the entity containing the record.
internal_id string (required) The internal ID of the record for which you want to return related entities. Use the /dataCatalog-xxxx/findRecords API method to look up the internal ID of a record.
Outputs
array of mappings List of entities with related records, each containing:
  entity string The entity ID.
  foundRecords string Indicates result completeness. Must be one of "all", "thereAreMore", or "thereMightBeMore":
    "all": 100 or fewer records were found. All related records were returned.
    "thereAreMore": 101 or more records were found. The first 100 were returned.
    "thereMightBeMore": Records were searched based on an entity with limited results. There might be additional records not found due to the limitation.
  results array of mappings The related records, each containing:
    internal_id string The internal ID of the related record.
    describe mapping The found record with field IDs as keys and field values as values. Includes the fields dx_project_id, name, and the primary ID.
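A caller usually needs to know, per entity, whether the returned list is complete. The sketch below reduces a response to that information. `summarize_related` and the sample response are hypothetical and only mirror the output shape described above.

```python
def summarize_related(response):
    """Map each entity ID to (records returned, whether the list is known to be complete)."""
    summary = {}
    for group in response:
        # Only "all" guarantees that nothing was left out.
        known_complete = group["foundRecords"] == "all"
        summary[group["entity"]] = (len(group["results"]), known_complete)
    return summary


# Made-up response for illustration.
example_response = [
    {"entity": "/participant", "foundRecords": "all",
     "results": [{"internal_id": "p1", "describe": {}}]},
    {"entity": "/file", "foundRecords": "thereAreMore",
     "results": [{"internal_id": "f1", "describe": {}},
                 {"internal_id": "f2", "describe": {}}]},
]
print(summarize_related(example_response))
```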
Errors
ResourceNotFound
The entity does not exist in the schema.
The specified internal_id does not exist in the entity.
Additional standard API errors may be returned.
API Method: /dataCatalog-xxxx/getFilters
Specification
Gets a list of fields that can be used to filter records of a given entity. Useful when searching with the /dataCatalog-xxxx/findRecords or /dataCatalog-xxxx/downloadRecords API methods.
This API method uses the data catalog URL returned by the /system/findDataCatalogs API method as the base URL.
Inputs
entity string (required) The ID of the entity to search.
Outputs
fields array of mappings List of fields that can be used in the filters parameter when searching the entity.
  id string The field ID (use as keys in the filters mapping).
Errors
ResourceNotFound
The entity does not exist in the schema.
Additional standard API errors may be returned.
API Method: /dataCatalog-xxxx/getProjectSyncStatus
Specification
Gets the current synchronization status for a project.
This API method uses the data catalog URL returned by the /system/findDataCatalogs API method as the base URL.
Inputs
projectId string (required) The ID of the project to check.
Outputs
syncState string The current sync state. Must be one of "SYNC_REQUESTED", "SYNCING", or "IDLE".
autoSyncEnabled boolean Whether automatic synchronization is enabled for this project.
lastProjectSync string The time the last project sync completed, in RFC 3339 format, for example, "2025-04-03T22:01:15.000Z".
lastSyncInvokedAt string The time the last sync was initiated, in RFC 3339 format, for example, "2025-04-03T22:01:00.000Z".
Errors
ResourceNotFound
The specified projectId does not exist or is not associated with this data catalog.
Additional standard API errors may be returned.
API Method: /dataCatalog-xxxx/getSchema
Specification
Gets the schema defined for a specific data catalog, that is, its entities, fields, and relationships.
This API method uses the data catalog URL returned by the /system/findDataCatalogs API method as the base URL.
Inputs
None.
Outputs
entities array of mappings List of entities, each with the following fields:
  id string The unique identifier of the entity. Examples: "/sample", "/analysis".
  displayName string The human-readable name used when referring to a single record of this entity type. Examples: "Sample", "Analysis", "Participant".
  displayNamePlural string The human-readable name used when referring to multiple records of this entity type. Examples: "Samples", "Analyses", "Participants".
  description string A human-readable description of the entity.
  isPublicInDataCatalog boolean When true, all records of this entity are visible to all users with access to the data catalog, regardless of project permissions. Record IDs for public entities must be unique across the entire data catalog. When not specified or false, records follow standard project-based access controls.
  primaryIdField string The ID of the field that contains the primary identifier for records of this entity. Example: "/sample/sample_id".
  nameField string The ID of the field that contains the human-readable display name, used together with primaryIdField to describe each record. Example: "/sample/sample_name".
  fields array of mappings The fields belonging to this entity:
    id string The unique identifier of the field across the entire schema. Example: "/analysis/sequencing_method".
    dataType string The data type of the field value. See Schema Field Data Types.
    displayName string The human-readable name of the field. Example: "Sequencing Method".
    description string A human-readable description of the field.
    allowedValues array of strings The permitted values for this field. When the field is not required during ingestion (that is, isRequiredInIngestion is not defined or false), a null value is also allowed but is not included in this array.
    suggestions array of strings Suggested values for use as search filters.
    isHiddenFromResultsTable boolean Whether the field is hidden from result tables in the UI.
    isDataObjectId boolean Whether the field contains a DNAnexus data object ID.
    isDataObjectProjectId boolean Whether the field contains a DNAnexus data object project ID. Applicable only when the entity is a data object and the field is project ID.
    isExecutionId boolean Whether the field contains a DNAnexus execution ID (job or analysis ID).
    isRequiredInIngestion boolean Whether the field value must be provided when ingesting records.
    isSystemManaged boolean Whether the field value is managed by the system and cannot be modified by users.
    linkedField mapping Defines a link to another entity:
      entity string The ID of the referenced entity, such as "/sample".
      field string The ID of the referenced field, such as "/sample/sample_id".
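The schema response can be reduced to what an ingestion script needs most: the primary ID field and the required fields of each entity. The sketch below uses a hypothetical helper `ingestion_requirements` and a made-up schema fragment; the `dataType` strings in it are illustrative and may not match the literal values the API returns.

```python
def ingestion_requirements(schema):
    """For each entity, list its primary ID field and the fields required in ingestion."""
    requirements = {}
    for entity in schema["entities"]:
        required = [
            field["id"]
            for field in entity.get("fields", [])
            if field.get("isRequiredInIngestion")  # missing or false means optional
        ]
        requirements[entity["id"]] = {
            "primaryIdField": entity["primaryIdField"],
            "required": required,
        }
    return requirements


# Made-up schema fragment for illustration.
example_schema = {
    "entities": [
        {
            "id": "/sample",
            "primaryIdField": "/sample/sample_id",
            "fields": [
                {"id": "/sample/sample_id", "dataType": "ID",
                 "isRequiredInIngestion": True},
                {"id": "/sample/tissue_type", "dataType": "String",
                 "allowedValues": ["blood", "saliva"]},
            ],
        }
    ]
}
print(ingestion_requirements(example_schema))
```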
Errors
This method may return standard API errors.
API Method: /dataCatalog-xxxx/removeRecords
Specification
Removes specified records from the data catalog. If a record with the specified primary ID does not exist, it is skipped, not counted, and does not generate an error.
You can remove multiple records from multiple entities in one request. If some record removals fail due to an error, the other records are still removed. The operation is not atomic.
The requesting user must have at least CONTRIBUTE permission for the project the metadata is associated with. To remove records from protected projects, the requesting user must have ADMINISTER access. For details, see Access Control and Permissions.
This API method uses the data catalog URL returned by the /system/findDataCatalogs API method as the base URL.
Limitations
Maximum 5,000 records or 1MB body size per request.
Inputs
projectId string (required) The ID of the project the records are associated with.
dryRun boolean (optional) Whether to only validate the inputs without making changes.
entities mapping (required) The records to be removed, grouped by entity:
  key: The entity ID.
  value mapping: The records to remove for that entity:
    ids array of strings (required) The IDs of records to be removed.
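Large removals have to be split to respect the limit of 5,000 records per request. The generator below is a sketch of one way to do that; `remove_request_bodies` is a hypothetical helper, it counts IDs across all entities in a request, and it does not check the 1MB body size limit.

```python
def remove_request_bodies(project_id, ids_by_entity, dry_run=False, max_records=5000):
    """Yield removeRecords request bodies with at most `max_records` IDs each."""

    def make_body(entities):
        return {"projectId": project_id, "dryRun": dry_run, "entities": entities}

    entities, count = {}, 0
    for entity, ids in ids_by_entity.items():
        for record_id in ids:
            if count == max_records:          # current body is full, start a new one
                yield make_body(entities)
                entities, count = {}, 0
            entities.setdefault(entity, {"ids": []})["ids"].append(record_id)
            count += 1
    if count:
        yield make_body(entities)
```

When requests are split this way, any index reported in the errors output refers to the position within the ids array of that particular request, not the original full list.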
Outputs
Returns a mapping with entity IDs as keys and removal results as values:
  key: The entity ID provided in the input.
  value mapping: Results for each entity:
    removed integer The number of records that were successfully removed (does not include non-existent records).
    errors array of mappings Information about records that failed to be removed:
      index integer The index of the failed record in the input array.
      message string Description of the error.
Errors
InvalidInput
The projectId is not a valid project ID.
The entities parameter contains invalid entity IDs or malformed record identifiers.
PermissionDenied
The requesting user does not have at least CONTRIBUTE permission for the project.
For protected projects, the requesting user does not have ADMINISTER access.
Additional standard API errors may be returned.
API Method: /dataCatalog-xxxx/syncProject
Specification
Triggers a sync process for the specified project. After successful sync, all data objects present in the project are reflected as records in the data object entity in the data catalog with updated system-generated metadata.
This API method uses the data catalog URL returned by the /system/findDataCatalogs API method as the base URL.
Inputs
projectId string (required) The ID of the project to synchronize.
Outputs
accepted string Always contains "ok" when the sync request is successfully accepted (HTTP status code 202).
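Because the sync request is only accepted, not completed, when the method returns, a client that needs the synced records has to poll /dataCatalog-xxxx/getProjectSyncStatus. The sketch below shows one such loop. The `call(method, body)` function is a hypothetical transport supplied by the caller, the polling interval and timeout are arbitrary, and the loop assumes the state leaves "IDLE" as soon as the request is accepted, which the source does not confirm.

```python
import time


def sync_and_wait(call, project_id, poll_seconds=30.0, timeout_seconds=3600.0,
                  sleep=time.sleep, clock=time.monotonic):
    """Trigger a project sync, then poll until syncState returns to "IDLE"."""
    call("syncProject", {"projectId": project_id})
    deadline = clock() + timeout_seconds
    while True:
        status = call("getProjectSyncStatus", {"projectId": project_id})
        if status["syncState"] == "IDLE":
            return status                     # sync finished (or never left IDLE)
        if clock() >= deadline:
            raise TimeoutError(f"sync of {project_id} still {status['syncState']}")
        sleep(poll_seconds)
```

Passing `sleep` and `clock` as parameters keeps the loop easy to exercise without real delays.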
Errors
InvalidInput
The projectId is not a valid project ID.
ResourceNotFound
The specified projectId does not exist or is not associated with this data catalog.
InvalidState
A sync is already in progress for this project.
PermissionDenied
The requesting user does not have permission to sync the project.
Additional standard API errors may be returned.
API Method: /dataCatalog-xxxx/updateProjectSync
Specification
Updates the automatic synchronization setting for a project. This controls whether the project automatically syncs metadata changes with the Omics Data Catalog. When enabled, automatic synchronization runs every 6 hours.
This API method uses the data catalog URL returned by the /system/findDataCatalogs API method as the base URL.
Inputs
projectId string (required) The ID of the project to update.
autoSyncEnabled boolean (required) Whether automatic synchronization should be enabled for this project.
Outputs
projectId string The ID of the project that was updated.
autoSyncEnabled boolean The current status of the automatic synchronization setting.
Errors
InvalidInput
The projectId is not a valid project ID.
The autoSyncEnabled parameter is not a boolean value.
ResourceNotFound
The specified projectId does not exist or is not associated with this data catalog.
PermissionDenied
The requesting user does not have permission to modify the project sync settings.
Additional standard API errors may be returned.
API Method: /dataCatalog-xxxx/upsertRecords
Specification
Inserts or updates records in the data catalog. If a record with the same primary ID and project ID already exists, the values are updated. If an existing record is not found, a new record is inserted using the provided ID.
For public entities (where isPublicInDataCatalog is true in the schema), record IDs must be unique across the entire data catalog, not just within a project. Attempting to insert a record with an ID that already exists in a different project results in an error for that record.
You can upsert multiple records into multiple entities in one request. If some record updates fail due to an error, the other records are still inserted or modified. The operation is not atomic.
The requesting user must have at least UPLOAD permission for the project the metadata is associated with. To update records with null values, the requesting user must have CONTRIBUTE access in normal projects and ADMINISTER access in protected projects. For details, see Access Control and Permissions.
This API method uses the data catalog URL returned by the /system/findDataCatalogs API method as the base URL.
Limitations
Maximum 5,000 records or 1MB body size per request.
Inputs
projectId string (required) The ID of the project the records are associated with.
dryRun boolean (optional) Whether to only validate the inputs without making changes.
entities mapping (required) Specifies the records to be upserted, with entity IDs as keys and arrays of records as values:
  key: The entity ID.
  value array of mappings: Records to be inserted or updated for the entity. The primary field ID (as defined in primaryIdField for the entity) must always be provided to identify the record.
    <field id> Field values as strings that can be parsed as the field's data type, or null. If a field is not provided, the field value is not modified.
Outputs
Returns a mapping with entity IDs as keys and processing results as values:
  key: The entity ID provided in the input.
  value mapping: Results for each entity:
    ok integer The number of records processed successfully.
    errors array of mappings Information about records that failed to process:
      index integer The index of the failed record in the input array. Only present when the error applies to a specific record. If absent, the error applies to the entire entity.
      message string Description of the error.
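Since the operation is not atomic, a client should inspect the errors array of every entity after each call. The sketch below builds a request body and pairs each reported error with the input record it refers to. Both helper names are hypothetical, the sample response is made up, and the index is assumed to be zero-based, which the source does not state.

```python
def build_upsert_body(project_id, records_by_entity, dry_run=False):
    """Assemble an upsertRecords request body from entity ID -> list of records."""
    return {
        "projectId": project_id,
        "dryRun": dry_run,
        "entities": {entity: list(records) for entity, records in records_by_entity.items()},
    }


def failed_records(body, response):
    """Return (entity, record or None, message) for every reported error."""
    failures = []
    for entity, result in response.items():
        for error in result.get("errors", []):
            index = error.get("index")        # absent for entity-level errors
            # Assumption: index is zero-based within the entity's input array.
            record = body["entities"][entity][index] if index is not None else None
            failures.append((entity, record, error["message"]))
    return failures


body = build_upsert_body("project-xxxx", {
    "/sample": [
        {"/sample/sample_id": "S-001", "/sample/collected_on": "2024-01-01"},
        {"/sample/sample_id": "S-002", "/sample/collected_on": "01/02/2024"},
    ]
})
# Made-up response for illustration.
example_response = {"/sample": {"ok": 1, "errors": [{"index": 1, "message": "invalid date"}]}}
print(failed_records(body, example_response))
```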
Errors
InvalidInput
The projectId is not a valid project ID.
The entities parameter contains invalid entity IDs.
A field value cannot be parsed as the field's data type.
A required field is missing from a record.
For public entities, a record ID already exists in a different project (IDs must be unique across the catalog).
The request exceeds the maximum of 5,000 records or 1MB body size.
PermissionDenied
The requesting user does not have at least UPLOAD permission for the project.
To update records with null values, the requesting user does not have CONTRIBUTE access in normal projects or ADMINISTER access in protected projects.
Additional standard API errors may be returned.