qc-seq

The qc-seq module removes adapter sequences and low-quality reads, and generates read-level statistics. It also merges the FastQ files per sample so that the other modules can use them. Each set of FastQ files can be analysed in parallel.

Tools

This module uses cutadapt to remove adapter sequences and low-quality bases. Sequali is used to generate detailed quality statistics.

Input

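The input for this module is one or more pairs of FastQ files per sample, specified in a PEP configuration file as shown below. A sample with more than one pair of FastQ files is listed on multiple rows that share the same sample_name.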

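Example input for the qc-seq module

===========  ==========================================  ==========================================
sample_name  R1                                          R2
===========  ==========================================  ==========================================
TestSample1  test/data/fastq/R1.fq.gz                    test/data/fastq/R2.fq.gz
TestSample2  test/data/fastq/R1.fq.gz                    test/data/fastq/R2.fq.gz
TestSample2  test/data/fastq/SRR8615409 chrM_1.fastq.gz  test/data/fastq/SRR8615409 chrM_2.fastq.gz
TestSample3  test/data/fastq/R1.fq.gz                    test/data/fastq/R2.fq.gz
TestSample3  test/data/fastq/SRR8615409 chrM_1.fastq.gz  test/data/fastq/SRR8615409 chrM_2.fastq.gz
TestSample3  test/data/fastq/SRR8615687_flt3_1.fastq.gz  test/data/fastq/SRR8615687_flt3_2.fastq.gz
===========  ==========================================  ==========================================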
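To make the "one or more pairs per sample" layout concrete, the sketch below groups the rows of the example input by sample_name. It assumes the sample table is stored as a CSV file with the columns sample_name, R1 and R2; the function name group_read_pairs is hypothetical and is not part of the module.

```python
import csv
import io
from collections import defaultdict

# Contents of the example input above, assumed here to be stored as CSV.
SAMPLE_TABLE = """\
sample_name,R1,R2
TestSample1,test/data/fastq/R1.fq.gz,test/data/fastq/R2.fq.gz
TestSample2,test/data/fastq/R1.fq.gz,test/data/fastq/R2.fq.gz
TestSample2,test/data/fastq/SRR8615409 chrM_1.fastq.gz,test/data/fastq/SRR8615409 chrM_2.fastq.gz
TestSample3,test/data/fastq/R1.fq.gz,test/data/fastq/R2.fq.gz
TestSample3,test/data/fastq/SRR8615409 chrM_1.fastq.gz,test/data/fastq/SRR8615409 chrM_2.fastq.gz
TestSample3,test/data/fastq/SRR8615687_flt3_1.fastq.gz,test/data/fastq/SRR8615687_flt3_2.fastq.gz
"""


def group_read_pairs(table_text):
    """Return {sample_name: [(R1, R2), ...]}, keeping the row order."""
    pairs = defaultdict(list)
    for row in csv.DictReader(io.StringIO(table_text)):
        pairs[row["sample_name"]].append((row["R1"], row["R2"]))
    return dict(pairs)


if __name__ == "__main__":
    # Print how many FastQ pairs each sample contributes.
    for sample, fastq_pairs in group_read_pairs(SAMPLE_TABLE).items():
        print(sample, len(fastq_pairs))
```

Each list of pairs is one unit of work per sample: the pairs can be processed independently and merged afterwards.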

Output

The output of this module is one set of merged FastQ files per sample, as well as a JSON file with statistics.
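How the module merges the files internally is not described here. As a rough illustration of what merging gzip-compressed FastQ files can look like, the sketch below concatenates the compressed files byte for byte, which works because concatenated gzip members form a valid gzip stream. The function name merge_fastq and all file names are hypothetical.

```python
import gzip
import shutil
import tempfile
from pathlib import Path


def merge_fastq(inputs, output):
    """Concatenate gzip-compressed FastQ files into a single file.

    The inputs are copied as-is, without decompressing or recompressing.
    """
    with open(output, "wb") as merged:
        for path in inputs:
            with open(path, "rb") as handle:
                shutil.copyfileobj(handle, merged)


if __name__ == "__main__":
    # Build two tiny FastQ files in a temporary directory and merge them.
    records = ["@read1\nACGT\n+\nIIII\n", "@read2\nTTGA\n+\nIIII\n"]
    with tempfile.TemporaryDirectory() as tmp:
        parts = []
        for index, record in enumerate(records):
            part = Path(tmp) / f"part{index}_R1.fq.gz"
            with gzip.open(part, "wt") as handle:
                handle.write(record)
            parts.append(part)
        merged_path = Path(tmp) / "merged_R1.fq.gz"
        merge_fastq(parts, merged_path)
        with gzip.open(merged_path, "rt") as handle:
            print(handle.read())
```

For paired-end data, the R1 and R2 files have to be merged in the same order so that the read pairs stay in sync.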

Configuration

The only configurable options for this module are the adapter sequences that cutadapt removes.

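Configuration options

===============  ============================  ========
Option           Description                   Required
===============  ============================  ========
forward_adapter  The forward adapter sequence  yes
reverse_adapter  The reverse adapter sequence  yes
===============  ============================  ========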
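The two options map naturally onto cutadapt's paired-end adapter flags: -a for the adapter on the first read and -A for the adapter on the second read. The sketch below builds such a command line. The exact command that the module runs is not shown in this document, so treat this as an assumption. The function name cutadapt_command, the file names and the adapter values are hypothetical placeholders.

```python
import shlex


def cutadapt_command(forward_adapter, reverse_adapter, r1_in, r2_in, r1_out, r2_out):
    """Build a paired-end cutadapt call as an argument list (illustrative only)."""
    return [
        "cutadapt",
        "-a", forward_adapter,  # 3' adapter to remove from the first read
        "-A", reverse_adapter,  # 3' adapter to remove from the second read
        "-o", r1_out,           # trimmed output for the first read
        "-p", r2_out,           # trimmed output for the second read
        r1_in,
        r2_in,
    ]


if __name__ == "__main__":
    # Placeholder adapter sequences; use the ones that match your library prep.
    config = {"forward_adapter": "AGATCGGAAGAG", "reverse_adapter": "AGATCGGAAGAG"}
    command = cutadapt_command(
        config["forward_adapter"],
        config["reverse_adapter"],
        "test/data/fastq/R1.fq.gz",
        "test/data/fastq/R2.fq.gz",
        "trimmed_R1.fq.gz",
        "trimmed_R2.fq.gz",
    )
    # shlex.join quotes each argument so the line can be pasted into a shell.
    print(shlex.join(command))
```

Returning a list rather than a single string avoids quoting problems when a file path contains spaces.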