# Searching Data Objects

You can use the [`dx ls`](https://documentation.dnanexus.com/helpstrings-of-sdk-command-line-utilities#ls) command to list the objects in your current project. You can determine the current project and folder you are in by using the command [`dx pwd`](https://documentation.dnanexus.com/helpstrings-of-sdk-command-line-utilities#pwd). Using glob patterns, you can broaden your search for objects by specifying filenames with wildcard characters such as `*` and `?`. An asterisk (`*`) represents zero or more characters in a string, and a question mark (`?`) represents exactly one character.

## Searching Objects with Glob Patterns

### Searching Objects in Your Current Folder

To search your current folder by filename, pass `dx ls` a glob pattern that uses the wildcard characters `*` and `?`. The examples below use the folder "C. Elegans - Ce10/" in the public project ["Reference Genome Files"](https://platform.dnanexus.com/projects/BQpp3Y804Y0xbyG4GJPQ01xv/) (platform login required to access this link).

#### Printing the Current Working Directory

```shell
$ dx select "Reference Genome Files"
$ dx cd "C. Elegans - Ce10/"
$ dx pwd # Print current working directory
Reference Genome Files:/C. Elegans - Ce10
```

#### Listing Folders and/or Objects in a Folder

```shell
$ dx ls
ce10.bt2-index.tar.gz
ce10.bwa-index.tar.gz
ce10.cw2-index.tar.gz
ce10.fasta.fai
ce10.fasta.gz
ce10.hisat2-index.tar.gz
ce10.star-index.tar.gz
ce10.tmap-index.tar.gz
```

#### Listing Objects Named Using a Pattern

```shell
$ dx ls '*.fa*' # List objects with filenames of the pattern "*.fa*"
ce10.fasta.fai
ce10.fasta.gz
$ dx ls 'ce10.???-index.tar.gz' # List objects with filenames of the pattern "ce10.???-index.tar.gz"
ce10.cw2-index.tar.gz
ce10.bt2-index.tar.gz
ce10.bwa-index.tar.gz
```
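To see why these patterns select the files they do, the following sketch applies them to the filenames from the listing above using the shell's own `case` pattern matching. It is only a local stand-in: it assumes the shell's `*` and `?` behave like the wildcards described here, and it never contacts the Platform. The `match_names` helper is hypothetical and assumes bash or a POSIX `sh`.

```shell
# Filenames copied from the `dx ls` listing above.
names="ce10.bt2-index.tar.gz ce10.bwa-index.tar.gz ce10.cw2-index.tar.gz
ce10.fasta.fai ce10.fasta.gz ce10.hisat2-index.tar.gz
ce10.star-index.tar.gz ce10.tmap-index.tar.gz"

# match_names PATTERN: print each name that matches PATTERN (hypothetical helper).
match_names() {
  for name in $names; do
    case "$name" in
      # An unquoted $1 in a case pattern is interpreted as a glob.
      $1) printf '%s\n' "$name" ;;
    esac
  done
}

fa_matches=$(match_names '*.fa*')
index_matches=$(match_names 'ce10.???-index.tar.gz')

echo "$fa_matches"      # ce10.fasta.fai and ce10.fasta.gz
echo "$index_matches"   # only the indexes with a three-character name: bt2, bwa, cw2
```

Because `?` matches exactly one character, `ce10.???-index.tar.gz` skips `hisat2`, `star`, and `tmap`, whose names are longer than three characters.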

### Searching Across Objects in the Current Project

To search the entire project by filename pattern, use the command `dx find data --name` with wildcard characters. Unless `--path` or `--all-projects` is specified, `dx find data` searches the current project. The example below runs `dx find data` in the public project ["Reference Genome Files"](https://platform.dnanexus.com/projects/BQpp3Y804Y0xbyG4GJPQ01xv/) (platform login required to access this link), using the `--name` option to specify the filename pattern to search for.

```shell
$ dx find data --name "*.fa*.gz"
closed  2014-10-09 09:50:51 776.72 MB /M. musculus - mm10/mm10.fasta.gz (file-BQbYQPj0Z05ZzPpb1xf000Xy)
closed  2014-10-09 09:50:30 767.47 MB /M. musculus - mm9/mm9.fasta.gz (file-BQbYK6801fFJ9Fj30kf003PB)
closed  2014-10-09 09:49:27 49.04 MB /D. melanogaster - Dm3/dm3.fasta.gz (file-BQbYVf80yf3J9Fj30kf00PPk)
closed  2014-10-09 09:48:55 29.21 MB /C. Elegans - Ce10/ce10.fasta.gz (file-BQbY9Bj015pB7JJVX0vQ7vj5)
closed  2014-10-08 13:52:26 818.96 MB /H. Sapiens - GRCh37 - hs37d5 (1000 Genomes Phase II)/hs37d5.fa.gz (file-B6ZY7VG2J35Vfvpkj8y0KZ01)
closed  2014-10-08 13:51:31 876.79 MB /H. Sapiens - hg19 (UCSC)/ucsc_hg19.fa.gz (file-B6qq93v2J35fB53gZ5G0007K)
closed  2014-10-08 13:50:53 827.95 MB /H. Sapiens - hg19 (Ion Torrent)/ion_hg19.fa.gz (file-B6ZYPQv2J35xX095VZyQBq2j)
closed  2014-10-08 13:50:17 818.88 MB /H. Sapiens - GRCh38/GRCh38.no_alt_analysis_set.fa.gz (file-BFBv6J80634gkvZ6z100VGpp)
closed  2014-10-08 13:49:53 810.45 MB /H. Sapiens - GRCh37 - b37 (1000 Genomes Phase I)/human_g1k_v37.fa.gz (file-B6ZXxfG2J35Vfvpkj8y0KXF5)
```

### Quoting Wildcards in Shell Commands

When using wildcard characters (`*` and `?`) with `dx` commands, enclose the pattern in single `'` or double `"` quotes. Without quotes, the shell expands the wildcards against files in your local filesystem before passing the pattern to the `dx` command, which produces unexpected results.

Quoting the pattern ensures the shell treats it as a literal string and passes it directly to the `dx` command, where DNAnexus interprets the wildcards to search Platform objects.

```shell
# Correct usage with quotes
dx ls '*.fa*'              # Single quotes prevent shell expansion
dx find data --name "*.gz" # Double quotes also work
```

{% hint style="info" %}
In addition to `*` and `?`, Bash expands other special characters such as `[`, `]`, `{`, and `}`. For complete details about shell expansion and quoting, see the [Bash manual section on expansions](https://www.gnu.org/software/bash/manual/html_node/Shell-Expansions.html).
{% endhint %}
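The effect of shell expansion can be observed locally without running `dx` at all. The sketch below is a minimal illustration under stated assumptions: `show_args` is a hypothetical stand-in for `dx` that prints each argument it receives, and the two `.gz` files are created in a scratch directory only for the demonstration.

```shell
# Create a scratch directory containing two local files.
demo_dir=$(mktemp -d)
touch "$demo_dir/local_a.gz" "$demo_dir/local_b.gz"

# show_args: print every argument on its own line (hypothetical stand-in for dx).
show_args() { printf '<%s>\n' "$@"; }

# Unquoted: the shell replaces *.gz with the matching local filenames.
unquoted=$(cd "$demo_dir" && show_args --name *.gz)
# Quoted: the pattern reaches the command unchanged.
quoted=$(cd "$demo_dir" && show_args --name '*.gz')

echo "$unquoted"   # <--name> <local_a.gz> <local_b.gz>, one per line
echo "$quoted"     # <--name> <*.gz>, one per line

rm -rf "$demo_dir" # clean up the scratch directory
```

In the unquoted case the command never sees the pattern, which is why an unquoted `dx` search can silently look for local filenames instead.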

### Escaping Special Characters

Escape special characters in filenames with a backslash (`\`) when you want to search for them literally. This applies to the wildcards `*` and `?`, and also to colons (`:`) and slashes (`/`), because these have special meaning in DNAnexus paths.

Shell behavior affects escaping rules. In many shells, you need to either double-escape (`\\`) or use single quotes to prevent the shell from interpreting the backslash.

The following examples show proper escaping techniques:

```shell
# Searching for a file with colons in the name
dx find data --name "sample\:123.txt"
# Or alternatively with single quotes
dx find data --name 'sample\:123.txt'

# Searching for a file with a literal asterisk
dx find data --name "experiment\*.fastq"
```
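To check which string a given quoting style hands to the command, you can print the argument locally. The sketch below assumes bash or a POSIX `sh`; `show_arg` is a hypothetical stand-in for `dx` that prints its first argument exactly as received.

```shell
# show_arg: print the first argument exactly as the command receives it
# (hypothetical stand-in for dx).
show_arg() { printf '%s\n' "$1"; }

double_quoted=$(show_arg "sample\:123.txt")  # backslash kept inside double quotes
single_quoted=$(show_arg 'sample\:123.txt')  # backslash kept inside single quotes
unquoted=$(show_arg sample\:123.txt)         # shell consumes the backslash
double_escaped=$(show_arg sample\\:123.txt)  # \\ leaves one backslash

printf 'double quoted:  %s\n' "$double_quoted"   # sample\:123.txt
printf 'single quoted:  %s\n' "$single_quoted"   # sample\:123.txt
printf 'unquoted:       %s\n' "$unquoted"        # sample:123.txt
printf 'double escaped: %s\n' "$double_escaped"  # sample\:123.txt
```

Only the unquoted, single-backslash form loses the escape before the command runs, which is the case the double-escape advice addresses.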

## Searching Objects with Other Criteria

`dx find data` also allows you to search data using metadata fields, such as when the data was created, the data tags, or the project the data exists in.

### Searching Objects Created Within a Certain Period of Time

You can use the flags `--created-after` and `--created-before` to search for data objects created within a specific time period.

```shell
$ dx find data --created-after 2017-02-22 --created-before 2017-02-28
closed  2017-02-27 19:14:51 3.90 GB  /H. Sapiens - hg19 (UCSC)/ucsc_hg19.hisat2-index.tar.gz (file-F2pJvF80Vzx54f69K4J8K5xy)
closed  2017-02-27 19:14:21 3.55 GB  /M. musculus - mm10/mm10.hisat2-index.tar.gz (file-F2pJqk00Vq161bzq44Vjvpf5)
closed  2017-02-27 19:13:57 3.51 GB  /M. musculus - mm9/mm9.hisat2-index.tar.gz (file-F2pJpKj0G0JxZxBZ4KJq0Q6B)
closed  2017-02-27 19:13:41 3.85 GB  /H. Sapiens - hg19 (Ion Torrent)/ion_hg19.hisat2-index.tar.gz (file-F2pJkp00BjBk99xz4Jk74V0y)
closed  2017-02-27 19:13:28 3.85 GB  /H. Sapiens - GRCh37 - b37 (1000 Genomes Phase I)/human_g1k_v37.hisat2-index.tar.gz (file-F2pJpy007bGBzj7X446PzxJJ)
closed  2017-02-27 19:13:02 3.90 GB  /H. Sapiens - GRCh37 - hs37d5 (1000 Genomes Phase II)/hs37d5.hisat2-index.tar.gz (file-F2pJpb000vFpzj7X446PzxF0)
closed  2017-02-27 19:12:31 3.91 GB  /H. Sapiens - GRCh38/GRCh38.no_alt_analysis_set.hisat2-index.tar.gz (file-F2pK5y00F8Bp9BYk4KX7Qb4P)
closed  2017-02-27 19:12:18 224.54 MB /D. melanogaster - Dm3/dm3.hisat2-index.tar.gz (file-F2pJP7j0QkbQ3ZqG269589pj)
closed  2017-02-27 19:11:56 139.76 MB /C. Elegans - Ce10/ce10.hisat2-index.tar.gz (file-F2pJK300KKz8bx1126Ky5b3P)
```

### Searching Objects by Their Metadata

You can search for objects based on their metadata. To set an object's metadata, run [`dx tag`](https://documentation.dnanexus.com/helpstrings-of-sdk-command-line-utilities#tag) to add tags, or [`dx set_properties`](https://documentation.dnanexus.com/helpstrings-of-sdk-command-line-utilities#set_properties) to set key-value pairs that describe your data object. You can also set metadata while uploading data to the Platform. To search by object tags, use the option `--tag`. This option can be repeated if the search requires multiple tags.

```shell
$ dx find data --tag sampleABC --tag batch123
closed  2017-01-01 09:00:00 6.08 GB  /Input/SRR504516_1.fastq.gz (file-xxxx)
closed  2017-01-01 09:00:00 5.82 GB  /Input/SRR504516_2.fastq.gz (file-wwww)
```

To search by object properties, use the option `--property`. This option can be repeated if the search requires multiple properties.

```shell
$ dx find data --property sequencing_provider=CRO_XYZ
closed  2017-01-01 09:00:00 8.06 GB  /Input/SRR504555_1.fastq.gz (file-qqqq)
closed  2017-01-01 09:00:00 8.52 GB  /Input/SRR504555_2.fastq.gz (file-rrrr)
```
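Tags and properties have to be set before a search can find them. The following sketch wraps the two steps, setting metadata with `dx tag` and `dx set_properties` and then searching with `--tag` and `--property`, in one function. The function name `annotate_and_find`, the file ID `file-xxxx`, and the property `library_type=paired_end` are illustrative assumptions, not values from a real project.

```shell
# annotate_and_find FILE: tag FILE, set a property on it, then search by both.
# Hypothetical helper; the tag and property values are placeholders.
annotate_and_find() {
  dx tag "$1" sampleABC batch123                   # add two tags to the object
  dx set_properties "$1" library_type=paired_end   # set one key-value pair
  # Search the current project using the same tags and property.
  dx find data --tag sampleABC --tag batch123 --property library_type=paired_end
}

# Example call (requires a logged-in dx session and a real file ID):
# annotate_and_find file-xxxx
```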

### Searching Objects in Another Project

To search for objects in a project other than your current one, specify a project and folder path with the flag `--path`. The example below uses the project ID (`project-BQfgzV80bZ46kf6pBGy00J38`) of the public project ["Exome Analysis Demo"](https://platform.dnanexus.com/projects/BQfgzV80bZ46kf6pBGy00J38/data/) (platform login required to access this link).

```shell
$ dx find data --name "*.fastq.gz" \
    --path project-BQfgzV80bZ46kf6pBGy00J38:/Input
closed  2014-10-03 12:04:16 6.08 GB  /Input/SRR504516_1.fastq.gz (file-B40jg7v8KfPy38kjz1vQ001y)
closed  2014-10-03 12:04:16 5.82 GB  /Input/SRR504516_2.fastq.gz (file-B40jgYG8KfPy38kjz1vQ0020)
```

### Searching Objects Across Projects with VIEW and Above Permissions

To search for data objects in all projects where you have VIEW and above permissions, use the `--all-projects` flag. Public projects are not shown in this search.

```shell
$ dx find data --name "SRR*_1.fastq.gz" --all-projects
closed  2017-01-01 09:00:00 6.08 GB  /Exome Analysis Demo/Input/SRR504516_1.fastq.gz (project-xxxx:file-xxxx)
closed  2017-07-01 10:00:00 343.58 MB /input/SRR064287_1.fastq.gz (project-yyyy:file-yyyy)
closed  2017-01-01 09:00:00 6.08 GB  /data/exome_analysis_demo/SRR504516_1.fastq.gz (project-zzzz:file-xxxx)
```

### Scoping Within Projects

When describing a small number of files (typically fewer than 100), scope `findDataObjects` to a single project.

The following example scopes the search to one project:

```shell
dx api system findDataObjects '{"scope": {"project": "project-xxxx"}, "describe":{"fields":{"state":true}}}'
```

See the [API method `system/findDataObjects`](https://documentation.dnanexus.com/developer/api/search#api-method-system-finddataobjects) for more information about usage.
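For scripts that run the same scoped query against several projects, the JSON input can be generated from a project ID. This is a minimal sketch: `build_scope_payload` and `find_in_project` are hypothetical helper names, and the payload simply mirrors the example above. It assumes the project ID contains no characters that need JSON escaping.

```shell
# build_scope_payload PROJECT_ID: print the JSON input for findDataObjects,
# scoped to one project (hypothetical helper).
build_scope_payload() {
  printf '{"scope": {"project": "%s"}, "describe": {"fields": {"state": true}}}' "$1"
}

# find_in_project PROJECT_ID: run the scoped search.
# Requires the dx toolkit and a logged-in session.
find_in_project() {
  dx api system findDataObjects "$(build_scope_payload "$1")"
}

# Inspect the payload before sending it.
payload=$(build_scope_payload "project-xxxx")
echo "$payload"
```

Printing the payload first makes it easy to confirm that the quoting is correct before calling `find_in_project`.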
