# Scatter Plot

{% hint style="info" %}
An Apollo license is required to use Cohort Browser on the DNAnexus Platform. Org approval may also be required. [Contact DNAnexus Sales](mailto:sales@dnanexus.com) for more information.
{% endhint %}

## When to Use Scatter Plots

Use a scatter plot to compare the distribution of values in a field containing numerical data across different groups in a cohort. Each group consists of the cohort members that share the same value in a second field, which must also contain numerical data.

Primary field values are plotted on the *x* axis. Secondary field values are plotted on the *y* axis.

| Primary Field (Supported Data Types)     | Secondary Field (Supported Data Types)   |
| ---------------------------------------- | ---------------------------------------- |
| Numerical (Integer) or Numerical (Float) | Numerical (Integer) or Numerical (Float) |

## Using Scatter Plots in the Cohort Browser

In the scatter plot below, each dot represents a particular combination of values in the fields *Insurance Billed* and *Cost*, found in one or more records in a cohort. The lighter a dot, the fewer records share that combination. The darker a dot, the more records share it.

![Scatter Plot: Insurance Billed x Cost](https://1612471957-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-L_EsL_ie8XyZlLe_yf9%2Fuploads%2Fgit-blob-a5f37db15577f1c22b256c34053f3bc455d0379d%2Fimage.png?alt=media)
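To make the relationship between dots and records concrete, the following is a minimal Python sketch of how records could be grouped into one point per value combination, with a count that would drive the shading. It is an illustration only, not the Cohort Browser's implementation. The record layout, the field keys `insurance_billed` and `cost`, and the function name `count_combinations` are all hypothetical.

```python
from collections import Counter

# Hypothetical sample records; the keys mirror the fields in the example chart.
records = [
    {"insurance_billed": 100, "cost": 120},
    {"insurance_billed": 100, "cost": 120},
    {"insurance_billed": 250, "cost": 300},
]


def count_combinations(records, x_field, y_field):
    """Count how many records share each (x, y) combination of values."""
    return Counter((r[x_field], r[y_field]) for r in records)


# Each key is one dot on the plot; a higher count would mean a darker dot.
points = count_combinations(records, "insurance_billed", "cost")
for (x, y), n in sorted(points.items()):
    print(f"x={x}, y={y}: {n} record(s)")
```

Here the three sample records collapse into two dots, because two of the records share the same combination of values.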

### Non-Numeric Data in Scatter Plots

Fields containing primarily numeric data may also include non-numeric values. These non-numeric values cannot be represented in a scatter plot. When a field includes such values, the message "This field contains non-numeric values" appears below the scatter plot, as in this sample chart:

![Scatter Plot Based on Field or Fields Containing Non-Numeric Values](https://1612471957-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-L_EsL_ie8XyZlLe_yf9%2Fuploads%2Fgit-blob-177a25354b7de36e97f690495331253334781537%2Fimage.png?alt=media)

Clicking the "non-numeric values" link displays details about those values, including the number of records in which each appears.

![Detail on Non-Numeric Values](https://1612471957-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-L_EsL_ie8XyZlLe_yf9%2Fuploads%2Fgit-blob-a50f40eb7f5f59bb736317043ce58e8319de6163%2Fimage.png?alt=media)
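If you want to check a field for non-numeric values before loading it, a tally like the one shown in the detail view can be approximated outside the platform. The sketch below is a hypothetical pre-check in plain Python, not a DNAnexus API. The function name `split_numeric` and the sample values are assumptions, and treating NaN and infinity as non-numeric is a choice made for this example.

```python
import math
from collections import Counter


def split_numeric(values):
    """Separate plottable numbers from other values, counting the latter."""
    numeric = []
    non_numeric = Counter()
    for value in values:
        try:
            number = float(value)
        except (TypeError, ValueError):
            # Covers strings such as "N/A" as well as missing values (None).
            non_numeric[value] += 1
            continue
        if math.isfinite(number):
            numeric.append(number)
        else:
            # Assumption: NaN and infinity are treated as non-numeric here.
            non_numeric[value] += 1
    return numeric, non_numeric


# Hypothetical raw values from a mostly numeric field.
numeric, non_numeric = split_numeric(["12.5", "N/A", "40", "unknown", "N/A"])
print(numeric)
print(non_numeric.most_common())
```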

### Limit on Number of Data Points

In the Cohort Browser, scatter plots can show up to 30,000 distinct data points. If a scatter plot would require more data points than that, this message appears above the chart:

![Scatter Plot with Warning Message about Data Point Limit](https://1612471957-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-L_EsL_ie8XyZlLe_yf9%2Fuploads%2Fgit-blob-6a8470c2473493b61eeebf502bee690d0bb4887f%2Fimage.png?alt=media)

In this scenario, [add a cohort filter](https://documentation.dnanexus.com/user/defining-cohorts#defining-cohort-criteria) to narrow the cohort until the scatter plot can show data for all of its members.
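As a rough way to anticipate this limit for data you hold locally, you can count distinct value combinations yourself. The sketch below assumes that a "distinct data point" is a unique pair of primary and secondary field values, which matches the description of dots above but is not confirmed as the platform's exact counting rule. The names `MAX_POINTS` and `exceeds_point_limit` are hypothetical.

```python
# The documented limit on distinct data points in a scatter plot.
MAX_POINTS = 30_000


def exceeds_point_limit(pairs, limit=MAX_POINTS):
    """Return True if the number of distinct (x, y) pairs is above the limit."""
    return len(set(pairs)) > limit


# 30,001 distinct pairs is one over the limit; repeated pairs count once.
over_limit = exceeds_point_limit((i, i * 2) for i in range(30_001))
many_duplicates = exceeds_point_limit([(1, 2)] * 100_000)
print(over_limit, many_duplicates)
```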

### Cohort Compare

Scatter plots are not supported in Cohort Compare.

## Preparing Data for Visualization in Scatter Plots

When [ingesting data using Data Model Loader](https://documentation.dnanexus.com/developer/ingesting-data/data-model-loader/ingestion-data-types), the following data types can be visualized in scatter plots:

* Integer
* Integer Sparse
* Float
* Float Sparse
