The load_fastq_directory function

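The load_fastq_directory function is one of the main ways to get data into NGLess. It takes the name of a directory and loads the FastQ files inside it as a single sample. For example, consider the following directory: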

$ find sample1
sample1/SRR8053346.pair.1.fq.gz
sample1/SRR8053346.pair.2.fq.gz
sample1/SRR8053346.single.fq.gz
sample1/SRR8053355.pair.1.fq.bz2
sample1/SRR8053355.pair.2.fq.bz2

Calling load_fastq_directory("sample1") returns a sample that contains both paired-end and single-end data:

  1. The paired-end dataset sample1/SRR8053346.pair.1.fq.gz - sample1/SRR8053346.pair.2.fq.gz
  2. The paired-end dataset sample1/SRR8053355.pair.1.fq.bz2 - sample1/SRR8053355.pair.2.fq.bz2
  3. The single-end dataset sample1/SRR8053346.single.fq.gz

Currently (as of version 1.4), NGLess applies the following rules to file names:

  • Extensions .gz and .bz2 are handled transparently
  • The extension (prior to the compression extension) must be either .fq or .fastq
  • Before the .fq/.fastq extension, a suffix of .1/.2, _1/_2, or _F/_R identifies the two mates of a paired-end dataset
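
The naming rules above can be illustrated with a short Python sketch. This is not NGLess's implementation, only a reading of the rules as stated here; the function names classify and group_files are hypothetical, and the mapping of _F to the first mate and _R to the second is an assumption.

```python
# Illustrative sketch of the file-name rules; not NGLess's actual code.
COMPRESSION_EXTS = (".gz", ".bz2")
FASTQ_EXTS = (".fq", ".fastq")
# Assumption: _F is treated as mate 1 and _R as mate 2.
MATE_MARKERS = {".1": 1, ".2": 2, "_1": 1, "_2": 2, "_F": 1, "_R": 2}


def classify(filename):
    """Return (stem, mate) for a FastQ file name, or None if it is not FastQ.

    mate is 1 or 2 for paired-end files and None for single-end files.
    """
    name = filename
    # Strip an optional compression extension.
    for ext in COMPRESSION_EXTS:
        if name.endswith(ext):
            name = name[: -len(ext)]
            break
    # Require a .fq or .fastq extension.
    for ext in FASTQ_EXTS:
        if name.endswith(ext):
            stem = name[: -len(ext)]
            break
    else:
        return None
    # Look for a mate marker immediately before the extension.
    for marker, mate in MATE_MARKERS.items():
        if stem.endswith(marker):
            return stem[: -len(marker)], mate
    return stem, None


def group_files(filenames):
    """Group file names into paired-end datasets and single-end files."""
    pairs = {}
    singles = []
    for filename in sorted(filenames):
        result = classify(filename)
        if result is None:
            continue  # not a FastQ file under these rules
        stem, mate = result
        if mate is None:
            singles.append(filename)
        else:
            pairs.setdefault(stem, {})[mate] = filename
    return pairs, singles
```

Applied to the file names in the sample1 listing, this sketch yields two paired-end datasets (one for SRR8053346, one for SRR8053355) and one single-end file, matching the sample described earlier.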

If your data does not conform to these rules, we recommend that you use symlinks to build a directory that does conform to them.
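
One possible way to build such a directory is sketched below in Python. The helper link_into_sample_dir and the file names in the usage note are hypothetical; the sketch assumes a platform where creating symlinks is permitted.

```python
from pathlib import Path


def link_into_sample_dir(renames, dest_dir):
    """Create symlinks in dest_dir that follow the expected naming rules.

    renames maps each existing file path to the conforming name it should
    have inside dest_dir. Returns the sorted names present in dest_dir.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    for source, new_name in renames.items():
        link = dest / new_name
        # Skip links that already exist so the helper can be re-run safely.
        if link.is_symlink() or link.exists():
            continue
        # Link to the absolute path so the link works from any directory.
        link.symlink_to(Path(source).resolve())
    return sorted(p.name for p in dest.iterdir())
```

For instance, with hypothetical input files raw/sampleA_R1.fastq.gz and raw/sampleA_R2.fastq.gz, passing the mapping {"raw/sampleA_R1.fastq.gz": "sampleA_1.fastq.gz", "raw/sampleA_R2.fastq.gz": "sampleA_2.fastq.gz"} with "sampleA" as the destination would produce a directory sampleA whose file names follow the _1/_2 convention, and that directory name could then be given to load_fastq_directory.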
