# Archiving Files

{% hint style="info" %}
A license is required to use the DNAnexus Archive Service. Contact [DNAnexus Sales](mailto:sales@dnanexus.com) for more information.
{% endhint %}

Archiving in DNAnexus is file-based. You can archive individual files, all files in a folder, or all files in a project to save on storage costs. You can also unarchive one or more files, folders, or projects when you need to make the data available for further analysis.

The DNAnexus Archive Service is available via the API in Amazon AWS and Microsoft Azure regions.

## Overview

### File Archival States

To understand the archival life cycle as well as which operations can be performed on files and how billing works, it's helpful to understand the different file states associated with archival. A file in a project can assume one of four archival states:

| Archival states | Details                                                                                                                                                                                      |
| --------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `live`          | The file is in standard storage, such as AWS S3 or Azure Blob.                                                                                                                               |
| `archival`      | Archival requested on the current file, but other copies of the same file are in the `live` state in multiple projects with the same `billTo` entity. The file is still in standard storage. |
| `archived`      | The file is in archival storage, such as AWS S3 Glacier or Azure Blob ARCHIVE.                                                                                                               |
| `unarchiving`   | Restore requested on the current file. The file is in transition from archival storage to standard storage.                                                                                  |

A file's archival state determines which operations can be performed on it. The table below lists the operations available in each state.

| Archival states | Download | Clone | Compute | Archive | Unarchive            |
| --------------- | -------- | ----- | ------- | ------- | -------------------- |
| `live`          | Yes      | Yes   | Yes     | Yes     | No                   |
| `archival`      | No       | Yes\* | No      | No      | Yes (Cancel archive) |
| `archived`      | No       | Yes   | No      | No      | Yes                  |
| `unarchiving`   | No       | No    | No      | No      | No                   |

\* The clone operation fails if the file is actively transitioning from `archival` to `archived`.
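
The table above can be encoded as a small lookup, for example to guard a script before it attempts an operation. This is a minimal sketch: `can_perform` is a hypothetical helper name, and the platform enforces these rules itself regardless of any client-side check.

```shell
# Return 0 if the operation is permitted for a file in the given archival
# state, per the table above; return 1 otherwise.
# $1: archival state (live, archival, archived, unarchiving)
# $2: operation (download, clone, compute, archive, unarchive)
# Note: cloning an `archival` file can still fail while the file is actively
# moving to `archived`; this lookup cannot detect that case.
can_perform() {
  case "$1:$2" in
    live:download | live:clone | live:compute | live:archive) return 0 ;;
    archival:clone | archival:unarchive) return 0 ;;
    archived:clone | archived:unarchive) return 0 ;;
    *) return 1 ;;
  esac
}

# Example: an archived file must be unarchived before it can be downloaded.
if can_perform archived download; then
  echo "download allowed"
else
  echo "unarchive the file first"
fi
```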

### File Archival Life Cycle

When the `project-xxxx/archive` API is called on a file object, the file transitions from the `live` state to the `archival` state. The platform automatically transitions the file to the `archived` state only when all copies of the file in all projects with the same `billTo` organization are in the `archival` state.

Likewise, when the `project-xxxx/unarchive` API is called on a file in the `archived` state, the file transitions from the `archived` state to the `unarchiving` state. While the file is in the `unarchiving` state, it is being restored by the third-party storage platform, such as AWS or Azure. The `unarchiving` process can take some time, depending on the retrieval option selected for the specific platform. When unarchiving is complete and the file is available in standard storage, the file transitions to the `live` state.

![](https://1612471957-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-L_EsL_ie8XyZlLe_yf9%2Fuploads%2Fgit-blob-1c58b1fb1e2620d8e6c481f285bfd0b111e8416b%2Ffile-archival-life-cycle.png?alt=media)
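
The two API calls that drive this life cycle can be sketched with the `dx api` command. In the sketch below, `json_array`, `request_transition`, and the `DX_BIN` variable are hypothetical names introduced for illustration, and the `files` input for `unarchive` is assumed to have the same shape as the one for `archive`.

```shell
# Build a JSON array of quoted strings from the arguments.
# Assumes the IDs contain no characters that need JSON escaping.
json_array() {
  out=""
  for item in "$@"; do
    out="${out:+$out, }\"$item\""
  done
  printf '[%s]' "$out"
}

# Request a state change for one or more files in a project.
# $1: project ID, $2: "archive" or "unarchive", remaining arguments: file IDs.
# DX_BIN is a hypothetical override: set DX_BIN=echo to print the command
# instead of sending it with the dx CLI.
request_transition() {
  project="$1"
  action="$2"
  shift 2
  "${DX_BIN:-dx}" api "$project" "$action" "{\"files\": $(json_array "$@")}"
}

# live -> archival (the platform later moves the file to archived):
#   request_transition project-xxxx archive file-xxxx
# archived -> unarchiving (the file becomes live once the restore completes):
#   request_transition project-xxxx unarchive file-xxxx
```

Setting `DX_BIN=echo` before calling `request_transition` prints the command that would be sent, which is a convenient way to check the payload first.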

### Archive Service Operations

Users with CONTRIBUTE or ADMINISTER permission on a project can use the file-based Archive Service to archive or unarchive files that reside in that project.

Using the API, users can archive or unarchive files, folders, or entire projects, although the archiving process itself happens at the file level. A single API call accepts a list of up to 1000 files for archiving or unarchiving.
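
To stay within the 1000-file limit when working from a longer list, the file IDs can be split into batches, with one request per batch. The following is a minimal sketch: `batch_payloads` and the input file `file_ids.txt` are hypothetical names, and the IDs are assumed to arrive one per line.

```shell
# Read file IDs (one per line) on stdin and print one JSON payload per batch.
# $1: maximum number of files per payload (defaults to 1000).
# Assumes the IDs contain no characters that need JSON escaping.
batch_payloads() {
  awk -v n="${1:-1000}" '
    NF == 0 { next }                      # skip blank lines
    {
      ids = ids (count ? ", " : "") "\"" $1 "\""
      count++
    }
    count == n {                          # batch is full: emit and reset
      print "{\"files\": [" ids "]}"
      ids = ""
      count = 0
    }
    END { if (count) print "{\"files\": [" ids "]}" }
  '
}

# Usage sketch: send one archive request per batch.
#   batch_payloads < file_ids.txt | while IFS= read -r payload; do
#     dx api project-xxxx archive "$payload"
#   done
```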

When archiving or unarchiving folders or projects, the API by default archives or unarchives all the files at the root level and those in the subfolders recursively. If you archive a folder or a project that includes files in different states, the Service only archives files that are in the `live` state and skips files that are in other states. Likewise, if you unarchive a folder or a project that includes files in different states, the Service only unarchives files that are in the `archived` state, transitions `archival` files back to the `live` state, and skips files in other states.
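
A folder-level request might look like the sketch below. The `folder` and `recurse` input names are assumptions that this page does not state, so confirm them against the API reference before use. `folder_payload` is a hypothetical helper.

```shell
# Build the input for archiving or unarchiving a folder.
# $1: folder path within the project
# $2: "true" (default) to include subfolders, "false" to skip them
# ASSUMPTION: the "folder" and "recurse" field names are not confirmed here.
folder_payload() {
  printf '{"folder": "%s", "recurse": %s}\n' "$1" "${2:-true}"
}

# Preview the request; remove "echo" to submit it with the dx CLI.
echo dx api project-xxxx archive "$(folder_payload /raw_data)"
```

Because files in other states are skipped, the same request can be repeated safely on a folder that is only partly archived.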

## Archival Billing

The archival process incurs the following charges, all billed to the `billTo` organization of the project:

* **Standard storage charge:** The monthly storage charge for files located in standard storage on the platform. Files in the `live` and `archival` states incur this charge. The `archival` state indicates that the file is waiting to be archived or that other copies of the same file in other projects are still in the `live` state, so the file remains in standard storage, such as AWS S3. The standard storage charge continues to be billed until archival has been requested for all copies of the file and the file has been moved to archival storage and transitioned into the `archived` state.
* **Archival storage charge:** The monthly storage charge for files that are located in archival storage on the platform. Files in the `archived` state incur a monthly archival storage charge.
* **Retrieval fee:** The retrieval fee is a one-time charge at the time of unarchiving based on the volume of data being unarchived.
* **Early retrieval fee:** If you retrieve or delete data from archival storage before the required retention period is met, an early retrieval fee applies. The retention period is 90 days for AWS regions and 180 days for Microsoft Azure regions. You are charged a pro-rated fee equivalent to the archival storage charges for any remaining days within that period.
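
The pro-rating in the early retrieval fee can be illustrated with a short calculation. This is a sketch under stated assumptions: the function names are hypothetical, and the per-GB daily rate passed in is a placeholder rather than actual DNAnexus pricing.

```shell
# Print the number of days left in the retention period.
# $1: provider ("aws" or "azure"), $2: days the file has been archived.
remaining_retention_days() {
  case "$1" in
    aws) retention=90 ;;
    azure) retention=180 ;;
    *) echo "unknown provider: $1" >&2; return 1 ;;
  esac
  remaining=$(( retention - $2 ))
  [ "$remaining" -gt 0 ] || remaining=0
  echo "$remaining"
}

# Estimate the fee as: remaining days x size in GB x archival rate per GB-day.
# $1: provider, $2: days archived, $3: size in GB,
# $4: HYPOTHETICAL archival storage rate per GB per day (not real pricing).
early_retrieval_fee() {
  days=$(remaining_retention_days "$1" "$2") || return 1
  awk -v d="$days" -v gb="$3" -v rate="$4" \
    'BEGIN { printf "%.2f\n", d * gb * rate }'
}

# Example: 100 GB retrieved from an AWS region after 30 days leaves 60 days
# of the 90-day retention period to be charged.
early_retrieval_fee aws 30 100 0.001
```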

## Best Practices

When using the Archive Service, we recommend the following best practices.

* The Archive Service does not work on [sponsored projects](https://documentation.dnanexus.com/getting-started/key-concepts/projects#project-sponsorship). If you want to archive files within a sponsored project, then you must move files into a different project or end the project sponsorship before archival.
* If a file is shared in multiple projects, archiving one copy in one of the projects only transitions the file into the `archival` state, which still incurs the standard storage cost. To achieve the lower archival storage cost, you need to ensure that all copies of the file in all projects with the same `billTo` org are being archived. When all copies of the file reach the `archival` state, the Service moves the files from `archival` to `archived` state. Consider using the `allCopies` option of the API to archive all copies of the file. You must be the org ADMIN of the `billTo` org of the current project to use the `allCopies` option.

  Consider the following example: `file-xxxx` has copies in `project-xxxx`, `project-yyyy`, and `project-zzzz`, which share the same `billTo` org (`org-xxxx`). You have `ADMINISTER` access to `project-xxxx` and `CONTRIBUTE` access to `project-yyyy`, but no role in `project-zzzz`. You are the org ADMIN of the projects' `billTo` org, and you want to archive all copies of the file in all projects with the same `billTo` org using [/project-xxxx/archive](https://documentation.dnanexus.com/developer/api/data-containers/projects#api-method-class-xxxx-archive):

  1. List all the copies of the file in `org-xxxx`.

     ```shell
     dx api file-xxxx listProjects '{"archivalInfoForOrg":"org-xxxx"}'
     {
     "project-xxxx": "ADMINISTER",
     "project-yyyy": "CONTRIBUTE",
     "liveProjects": [
      "project-xxxx",
      "project-yyyy",
      "project-zzzz"
     ]
     }
     ```
  2. Force archiving of all the copies of `file-xxxx`.

     ```shell
     dx api project-xxxx archive '{"files": ["file-xxxx"], "allCopies": true}'
     {
     "id": "project-xxxx",
     "count": 1
     }
     ```
  3. All copies of `file-xxxx` transition into the `archived` state.
