# Searching Data Objects

You can use the [`dx ls`](/user/helpstrings-of-sdk-command-line-utilities.md#ls) command to list the objects in your current project. You can determine the current project and folder you are in by using the command [`dx pwd`](/user/helpstrings-of-sdk-command-line-utilities.md#pwd). Using glob patterns, you can broaden your search for objects by specifying filenames with wildcard characters such as `*` and `?`. An asterisk (`*`) represents zero or more characters in a string, and a question mark (`?`) represents exactly one character.

## Searching Objects with Glob Patterns

### Searching Objects in Your Current Folder

To search the current folder by filename, pass a glob pattern containing the wildcard characters `*` or `?` to `dx ls`. The examples below use the folder "C. Elegans - Ce10/" in the public project ["Reference Genome Files"](https://platform.dnanexus.com/projects/BQpp3Y804Y0xbyG4GJPQ01xv/) (platform login required to access this link).

#### Printing the Current Working Directory

```shell
$ dx select "Reference Genome Files"
$ dx cd "C. Elegans - Ce10/"
$ dx pwd # Print current working directory
Reference Genome Files:/C. Elegans - Ce10
```

#### Listing Folders and/or Objects in a Folder

```shell
$ dx ls
ce10.bt2-index.tar.gz
ce10.bwa-index.tar.gz
ce10.cw2-index.tar.gz
ce10.fasta.fai
ce10.fasta.gz
ce10.hisat2-index.tar.gz
ce10.star-index.tar.gz
ce10.tmap-index.tar.gz
```

#### Listing Objects Named Using a Pattern

```shell
$ dx ls '*.fa*' # List objects with filenames of the pattern "*.fa*"
ce10.fasta.fai
ce10.fasta.gz
$ dx ls 'ce10.???-index.tar.gz' # List objects with filenames of the pattern "ce10.???-index.tar.gz"
ce10.cw2-index.tar.gz
ce10.bt2-index.tar.gz
ce10.bwa-index.tar.gz
```
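
To preview which names a pattern would select, you can test it locally with the shell's own `case` matching, which treats `*` and `?` the same way as described above. This is a minimal sketch that runs without the `dx` CLI. The `matches` function is a hypothetical helper, and the filenames are copied from the folder listing above.

```shell
# Hypothetical helper: print each name that matches a glob pattern.
# In a `case` pattern, * matches zero or more characters and ? matches exactly one.
matches() {
  pattern=$1
  shift
  for name in "$@"; do
    # $pattern is left unquoted so the shell treats it as a pattern, not a literal string.
    case $name in
      $pattern) printf '%s\n' "$name" ;;
    esac
  done
}

# Filenames from the "C. Elegans - Ce10" listing.
set -- ce10.bt2-index.tar.gz ce10.bwa-index.tar.gz ce10.cw2-index.tar.gz \
  ce10.fasta.fai ce10.fasta.gz ce10.hisat2-index.tar.gz \
  ce10.star-index.tar.gz ce10.tmap-index.tar.gz

matches '*.fa*' "$@"                  # prints the two ce10.fasta.* names
matches 'ce10.???-index.tar.gz' "$@"  # prints the bt2, bwa, and cw2 indexes
```

The `star`, `tmap`, and `hisat2` indexes do not match the second pattern because `???` requires exactly three characters.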

### Searching Across Objects in the Current Project

To search the entire project by filename pattern, use the command `dx find data --name` with wildcard characters. Unless `--path` or `--all-projects` is specified, `dx find data` searches the current project. The example below runs `dx find data` in the public project ["Reference Genome Files"](https://platform.dnanexus.com/projects/BQpp3Y804Y0xbyG4GJPQ01xv/) (platform login required to access this link), using the `--name` option to specify the filename pattern to match.

```shell
$ dx find data --name "*.fa*.gz"
closed  2014-10-09 09:50:51 776.72 MB /M. musculus - mm10/mm10.fasta.gz (file-BQbYQPj0Z05ZzPpb1xf000Xy)
closed  2014-10-09 09:50:30 767.47 MB /M. musculus - mm9/mm9.fasta.gz (file-BQbYK6801fFJ9Fj30kf003PB)
closed  2014-10-09 09:49:27 49.04 MB /D. melanogaster - Dm3/dm3.fasta.gz (file-BQbYVf80yf3J9Fj30kf00PPk)
closed  2014-10-09 09:48:55 29.21 MB /C. Elegans - Ce10/ce10.fasta.gz (file-BQbY9Bj015pB7JJVX0vQ7vj5)
closed  2014-10-08 13:52:26 818.96 MB /H. Sapiens - GRCh37 - hs37d5 (1000 Genomes Phase II)/hs37d5.fa.gz (file-B6ZY7VG2J35Vfvpkj8y0KZ01)
closed  2014-10-08 13:51:31 876.79 MB /H. Sapiens - hg19 (UCSC)/ucsc_hg19.fa.gz (file-B6qq93v2J35fB53gZ5G0007K)
closed  2014-10-08 13:50:53 827.95 MB /H. Sapiens - hg19 (Ion Torrent)/ion_hg19.fa.gz (file-B6ZYPQv2J35xX095VZyQBq2j)
closed  2014-10-08 13:50:17 818.88 MB /H. Sapiens - GRCh38/GRCh38.no_alt_analysis_set.fa.gz (file-BFBv6J80634gkvZ6z100VGpp)
closed  2014-10-08 13:49:53 810.45 MB /H. Sapiens - GRCh37 - b37 (1000 Genomes Phase I)/human_g1k_v37.fa.gz (file-B6ZXxfG2J35Vfvpkj8y0KXF5)
```

### Quoting Wildcards in Shell Commands

When using wildcard characters (`*` and `?`) with `dx` commands, enclose the pattern in single `'` or double `"` quotes. Without quotes, the shell may expand the wildcards against files in your local working directory before the `dx` command ever sees the pattern, which produces unexpected results.

Quoting the pattern ensures the shell treats it as a literal string and passes it directly to the `dx` command, where DNAnexus interprets the wildcards to search Platform objects.

```shell
# Correct usage with quotes
dx ls '*.fa*'              # Single quotes prevent shell expansion
dx find data --name "*.gz" # Double quotes also work
```

{% hint style="info" %}
Bash also gives special meaning to other characters, such as `[`, `]`, `{`, and `}`. For complete details about shell expansion and quoting, see the [Bash manual section on expansions](https://www.gnu.org/software/bash/manual/html_node/Shell-Expansions.html).
{% endhint %}
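
The effect of quoting can be observed locally without the `dx` CLI. The sketch below is illustrative only: `show_args` is a hypothetical stand-in for `dx` that reports the arguments it receives, and the two `.fasta` files are made-up local files created in a scratch directory.

```shell
# Hypothetical stand-in for dx: reports the arguments it actually receives.
show_args() {
  printf 'received %d argument(s):' "$#"
  printf ' [%s]' "$@"
  printf '\n'
}

# Scratch directory containing two hypothetical local files.
orig_dir=$PWD
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch local_a.fasta local_b.fasta

unquoted_result=$(show_args *.fa*)    # the shell expands the pattern first
quoted_result=$(show_args '*.fa*')    # the pattern arrives unchanged

printf '%s\n' "$unquoted_result" "$quoted_result"
# received 2 argument(s): [local_a.fasta] [local_b.fasta]
# received 1 argument(s): [*.fa*]

# Clean up the scratch directory.
cd "$orig_dir" || exit 1
rm -rf "$workdir"
```

In the unquoted case the command never sees the pattern at all, only the names of the local files that happened to match it.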

### Escaping Special Characters

To match a special character literally, escape it with a backslash (`\`). This applies to the wildcards `*` and `?`, and also to colons (`:`) and slashes (`/`), because these have special meaning in DNAnexus paths.

The shell processes backslashes before `dx` sees them, so shell behavior affects escaping rules. Outside quotes, the shell consumes a single backslash, so you need to either double-escape (`\\`) or quote the pattern. Inside single quotes, the backslash always reaches `dx` unchanged. In Bash, the same holds inside double quotes when the backslash precedes `:`, `*`, or `?`.

The following examples show proper escaping techniques:

```shell
# Searching for a file with colons in the name
dx find data --name "sample\:123.txt"
# Or alternatively with single quotes
dx find data --name 'sample\:123.txt'

# Searching for a file with a literal asterisk
dx find data --name "experiment\*.fastq"
```
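
To see exactly which string the shell hands to a command for each quoting style, you can assign the pattern to a variable and print it. This sketch runs without the `dx` CLI, and the variable names are illustrative only.

```shell
# Each assignment uses a different quoting style for the same pattern.
double_quoted="sample\:123.txt"   # backslash kept: it does not precede $, `, ", or \
single_quoted='sample\:123.txt'   # backslash kept: single quotes are fully literal
unquoted=sample\:123.txt          # backslash consumed by the shell
double_escaped=sample\\:123.txt   # first backslash escapes the second, so one survives

printf '%s\n' "$double_quoted" "$single_quoted" "$unquoted" "$double_escaped"
# sample\:123.txt
# sample\:123.txt
# sample:123.txt
# sample\:123.txt
```

Only the unquoted, single-backslash form loses the escape before the command sees it.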

## Searching Objects with Other Criteria

`dx find data` also allows you to search data using metadata fields, such as when the data was created, the data tags, or the project the data exists in.

### Searching Objects Created Within a Certain Period of Time

You can use the flags `--created-after` and `--created-before` to search for data objects created within a specific time period.

```shell
$ dx find data --created-after 2017-02-22 --created-before 2017-02-28
closed  2017-02-27 19:14:51 3.90 GB  /H. Sapiens - hg19 (UCSC)/ucsc_hg19.hisat2-index.tar.gz (file-F2pJvF80Vzx54f69K4J8K5xy)
closed  2017-02-27 19:14:21 3.55 GB  /M. musculus - mm10/mm10.hisat2-index.tar.gz (file-F2pJqk00Vq161bzq44Vjvpf5)
closed  2017-02-27 19:13:57 3.51 GB  /M. musculus - mm9/mm9.hisat2-index.tar.gz (file-F2pJpKj0G0JxZxBZ4KJq0Q6B)
closed  2017-02-27 19:13:41 3.85 GB  /H. Sapiens - hg19 (Ion Torrent)/ion_hg19.hisat2-index.tar.gz (file-F2pJkp00BjBk99xz4Jk74V0y)
closed  2017-02-27 19:13:28 3.85 GB  /H. Sapiens - GRCh37 - b37 (1000 Genomes Phase I)/human_g1k_v37.hisat2-index.tar.gz (file-F2pJpy007bGBzj7X446PzxJJ)
closed  2017-02-27 19:13:02 3.90 GB  /H. Sapiens - GRCh37 - hs37d5 (1000 Genomes Phase II)/hs37d5.hisat2-index.tar.gz (file-F2pJpb000vFpzj7X446PzxF0)
closed  2017-02-27 19:12:31 3.91 GB  /H. Sapiens - GRCh38/GRCh38.no_alt_analysis_set.hisat2-index.tar.gz (file-F2pK5y00F8Bp9BYk4KX7Qb4P)
closed  2017-02-27 19:12:18 224.54 MB /D. melanogaster - Dm3/dm3.hisat2-index.tar.gz (file-F2pJP7j0QkbQ3ZqG269589pj)
closed  2017-02-27 19:11:56 139.76 MB /C. Elegans - Ce10/ce10.hisat2-index.tar.gz (file-F2pJK300KKz8bx1126Ky5b3P)
```
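
The example above uses fixed dates. For a rolling window in a script, one option is to compute the boundary at run time. The sketch below assumes that `--created-after` also accepts an integer Unix timestamp in milliseconds. Confirm this with `dx find data --help` for your dx-toolkit version. The `days_ago_ms` function is a hypothetical helper, and the final command is printed rather than executed so that the sketch runs without the `dx` CLI.

```shell
# Hypothetical helper: Unix epoch timestamp, in milliseconds, for N days ago.
# The optional second argument overrides "now" (in seconds), which helps with testing.
days_ago_ms() {
  days=$1
  now_s=${2:-$(date +%s)}
  echo $(( (now_s - days * 86400) * 1000 ))
}

after=$(days_ago_ms 7)

# Assumption: --created-after accepts a millisecond timestamp.
echo "dx find data --created-after $after"
```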

### Searching Objects by Their Metadata

You can search for objects based on their metadata. To set an object's metadata, use [`dx tag`](/user/helpstrings-of-sdk-command-line-utilities.md#tag) to add tags or [`dx set_properties`](/user/helpstrings-of-sdk-command-line-utilities.md#set_properties) to set key-value pairs that describe your data object. You can also set metadata while uploading data to the Platform. To search by object tags, use the option `--tag`. This option can be repeated if the search requires multiple tags.

```shell
$ dx find data --tag sampleABC --tag batch123
closed  2017-01-01 09:00:00 6.08 GB  /Input/SRR504516_1.fastq.gz (file-xxxx)
closed  2017-01-01 09:00:00 5.82 GB  /Input/SRR504516_2.fastq.gz (file-wwww)
```

To search by object properties, use the option `--property`. This option can be repeated if the search requires multiple properties.

```shell
$ dx find data --property sequencing_provider=CRO_XYZ
closed  2017-01-01 09:00:00 8.06 GB  /Input/SRR504555_1.fastq.gz (file-qqqq)
closed  2017-01-01 09:00:00 8.52 GB  /Input/SRR504555_2.fastq.gz (file-rrrr)
```
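
When the list of tags comes from a script variable, the repeated `--tag` options can be assembled in a loop. This is a minimal sketch that assumes tag names contain no spaces. The tag values are taken from the example above, and the command is printed rather than executed so that the sketch runs without the `dx` CLI.

```shell
# Tags to require. Assumes tag names contain no spaces or wildcard characters.
tags="sampleABC batch123"

# Build the argument list one --tag option at a time.
set --
for tag in $tags; do
  set -- "$@" --tag "$tag"
done

# Printed rather than executed; replace echo with the real call once verified.
cmd="dx find data $*"
echo "$cmd"   # dx find data --tag sampleABC --tag batch123
```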

### Searching Objects in Another Project

To search for objects in a project other than your current working project, specify a project and folder path with the flag `--path`. The example below uses the project ID (`project-BQfgzV80bZ46kf6pBGy00J38`) of the public project ["Exome Analysis Demo"](https://platform.dnanexus.com/projects/BQfgzV80bZ46kf6pBGy00J38/data/) (platform login required to access this link).

```shell
$ dx find data --name "*.fastq.gz" \
    --path project-BQfgzV80bZ46kf6pBGy00J38:/Input
closed  2014-10-03 12:04:16 6.08 GB  /Input/SRR504516_1.fastq.gz (file-B40jg7v8KfPy38kjz1vQ001y)
closed  2014-10-03 12:04:16 5.82 GB  /Input/SRR504516_2.fastq.gz (file-B40jgYG8KfPy38kjz1vQ0020)
```

### Searching Objects Across Projects with VIEW and Above Permissions

To search for data objects in all projects where you have VIEW permission or higher, use the `--all-projects` flag. Public projects are not included in this search.

```shell
$ dx find data --name "SRR*_1.fastq.gz" --all-projects
closed  2017-01-01 09:00:00 6.08 GB  /Exome Analysis Demo/Input/SRR504516_1.fastq.gz (project-xxxx:file-xxxx)
closed  2017-07-01 10:00:00 343.58 MB /input/SRR064287_1.fastq.gz (project-yyyy:file-yyyy)
closed  2017-01-01 09:00:00 6.08 GB  /data/exome_analysis_demo/SRR504516_1.fastq.gz (project-zzzz:file-xxxx)
```

### Scoping Within Projects

To describe small numbers of files (typically fewer than 100), scope `findDataObjects` only to the project level.

The following example scopes the search to a single project and returns the `state` field for each object:

```shell
dx api system findDataObjects '{"scope": {"project": "project-xxxx"}, "describe":{"fields":{"state":true}}}'
```

See the [API method `system/findDataObjects`](/developer/api/search.md#api-method-system-finddataobjects) for more information about usage.
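
In a script, the project ID usually comes from a variable, so the JSON input has to be assembled before the call. The sketch below builds the same payload as the example above with `printf`. `project-xxxx` remains a placeholder ID, and the `dx api` call is left as a comment so that the sketch runs without the `dx` CLI.

```shell
# Placeholder project ID, as in the example above.
project_id="project-xxxx"

# Build the JSON input. Single quotes keep the JSON double quotes intact;
# %s is replaced by the project ID.
payload=$(printf '{"scope": {"project": "%s"}, "describe": {"fields": {"state": true}}}' "$project_id")

printf '%s\n' "$payload"

# With the dx CLI installed and logged in, pass the payload as a single quoted argument:
# dx api system findDataObjects "$payload"
```

Quoting `"$payload"` in the final call matters for the same reason as quoting glob patterns: the shell would otherwise split the JSON on spaces.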

