Importing Data

How do I upload my data files?

Through the web UI

From the home screen, click any project for which you have UPLOAD, CONTRIBUTE, or ADMIN access, then click Add Data. Follow the instructions in the Add Data dialog.

Through the command line

Download and run the Upload Agent to upload your files. For example, the following command will upload a set of paired-end FASTQ files:

ua sample_1_left.fastq.gz sample_1_right.fastq.gz
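If you want the files to land in a particular project and folder, you can pass both to the Upload Agent. The following is a minimal sketch rather than a definitive recipe: my_project and /raw_reads are placeholder values, upload_pair is a hypothetical helper name, and the --project and --folder options should be confirmed with ua --help for your Upload Agent version.

```shell
# Placeholder destination; replace with your own project and folder.
PROJECT="my_project"
FOLDER="/raw_reads"

# Upload one pair of FASTQ files to the destination above.
# $1 = left reads file, $2 = right reads file
upload_pair() {
    ua --project "$PROJECT" --folder "$FOLDER" "$1" "$2"
}

# Usage (requires the Upload Agent on your PATH and valid credentials):
#   upload_pair sample_1_left.fastq.gz sample_1_right.fastq.gz
```

Keeping the destination in variables makes it easy to reuse the same function for many samples.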

How do I upload data from my MiSeq instrument into DNAnexus?

Your MiSeq instrument writes run results to the folder D:\Illumina\MiSeqOutput or to a network path that you specify. After on-instrument secondary analysis is complete, the reads are available as *.fastq.gz files in the Data\Intensities\BaseCalls subfolder of the run folder. For example, the first file may be called D:\Illumina\MiSeqOutput\Run1\Data\Intensities\BaseCalls\sample1_L001_R1_001.fastq.gz.

To upload these files into DNAnexus, follow the instructions in "How do I upload my data files?" above.
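If you upload from the command line, a short loop can pick up every read file of a run. This is a sketch under a few assumptions: it needs a POSIX-style shell (on a Windows machine, for instance, one provided by a Unix-like environment), the run folder path must be written the way that shell sees it, and upload_run is a hypothetical helper name.

```shell
# Upload every *.fastq.gz file found in a MiSeq run folder.
# $1 = path to the run folder (the folder that contains Data/)
upload_run() {
    for f in "$1"/Data/Intensities/BaseCalls/*.fastq.gz; do
        [ -e "$f" ] || continue   # skip if the pattern matched nothing
        ua "$f"
    done
}

# Usage with a hypothetical network mount of the run folder:
#   upload_run /mnt/miseq/Run1
```

Only files directly inside the BaseCalls subfolder are uploaded; other run files are left untouched.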

How do I import data from my Ion Torrent PGM instrument into DNAnexus?

If you have *.fastq.gz files from your Ion Torrent PGM instrument, follow the instructions in "How do I upload my data files?" above.

To upload your Ion Torrent flowgram data and map it with TMAP, follow these steps:

  1. Upload your .SFF file(s) with the Upload Agent or the dx upload tool.

  2. Find the TMAP aligner, available in the Developer Applets public project (platform login required) under Resources on the right side of the home screen. Copy this applet into a project for which you have CONTRIBUTE (or ADMIN) permissions.

  3. Select the applet in your project, click the Run button, and choose the uploaded SFF file as input. The applet expects the reads to be gzip-compressed (*.sff.gz) and produces a SAM file as output.
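Because the applet expects a gzip-compressed file, you may need to compress the SFF file yourself if your upload tool does not already do so. The sketch below assumes a standard gzip is available; compress_sff is a hypothetical helper name and reads.sff is a placeholder file name.

```shell
# Compress an SFF file to <name>.sff.gz and keep the original file.
# $1 = path to the .sff file; prints the path of the compressed copy.
compress_sff() {
    gzip -c "$1" > "$1.gz" && echo "$1.gz"
}

# Usage with a placeholder file name:
#   gz=$(compress_sff reads.sff)
#   dx upload "$gz"
```

Check the options of your upload tool first; if it compresses files during upload, this step is unnecessary.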

How do I import reads from the Sequence Read Archive (SRA)?

Visit the European Nucleotide Archive (ENA) and locate the SRA dataset (sequencing run) you want. For example, if you search for "SRR001662", you will reach the result page for sequencing run SRR001662. On that page, the "Read Files" panel contains FTP links to FASTQ files (under the "Fastq files (ftp)" column). In your project, click "Add Data", choose "Transfer from another server", and supply these links.

For example, the "SRR001662" run consists of two downloadable files: a left reads file and a right reads file. To import them, supply both link addresses using the method above.

How do I import a track from the UCSC Genome Browser?

If the track has an associated BED, GTF, or WIG file, use the link to that file as input to the URL Fetcher app (platform login required). This app will launch a job that will download the file and upload it to your DNAnexus project.

For example, the default UCSC Genome Browser view for hg19 includes a track called "Digital DNaseI Hypersensitivity Clusters from ENCODE". If you click the gray rectangle to the left of the track to configure it, you will find a "downloads" link that leads to a gzipped BED file with the track data. The link to that file can be given as the input to the URL Fetcher app (platform login required).

Not all UCSC Genome Browser tracks include links to downloadable data. However, you can still export most tracks to BED using the UCSC "Table Browser". Select "Table Browser" from the "Tools" menu at the top, and choose the track of interest. In the "output format" field, select BED, and type in a filename. Click the "Get Output" button to download a BED file.
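Before uploading an exported BED file, it can help to glance at its first few records. The sketch below assumes an uncompressed BED file whose data lines start with chromosome, start, and end columns, and that header lines begin with "track", "browser", or "#"; bed_head is a hypothetical helper name.

```shell
# Print chrom, start, and end for the first N data lines of a BED file.
# $1 = path to an uncompressed BED file, $2 = number of records to show
bed_head() {
    awk -v n="$2" '
        /^(track|browser|#)/ { next }   # skip header and comment lines
        NF >= 3 {
            print $1, $2, $3
            if (++seen >= n) exit       # stop after n records
        }
    ' "$1"
}

# Usage with a placeholder file name:
#   bed_head my_track.bed 5
```

If the output is empty, the export probably did not produce BED records, and it is worth checking the Table Browser settings before uploading.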

I get the following error: schannel: failed to setup extended errors when running the Upload Agent on Windows. What can I do?

This is a known issue with Microsoft Windows 7/2008. Please install this hotfix from Microsoft to fix it.

If the problem persists, please email DNAnexus customer support.