qc-seq module
The qc-seq module is responsible for removing adapter sequences and
low-quality reads, and for generating read-level statistics. It also merges the FastQ
files per sample, so they can be used by the other modules. Every set of FastQ
files can be analysed in parallel.
Tools
This module uses cutadapt to remove adapter sequences and low-quality bases, and Sequali to generate detailed quality statistics.
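As a rough illustration of how adapter sequences might be passed to cutadapt for paired-end data, the sketch below builds a command line in Python. The function name `cutadapt_command`, the file paths, and the quality cutoff of 20 are assumptions made for this example. The options this module actually passes to cutadapt are not documented here. The flags `-a`, `-A`, `-q`, `-o` and `-p` are standard cutadapt options.

```python
def cutadapt_command(r1, r2, out_r1, out_r2,
                     forward_adapter, reverse_adapter, quality_cutoff=20):
    """Build a paired-end cutadapt command as an argument list.

    Hypothetical helper: the cutoff and argument order are illustrative only.
    """
    return [
        "cutadapt",
        "-a", forward_adapter,       # 3' adapter to remove from R1
        "-A", reverse_adapter,       # 3' adapter to remove from R2
        "-q", str(quality_cutoff),   # trim low-quality 3' ends (assumed cutoff)
        "-o", out_r1,                # trimmed R1 output
        "-p", out_r2,                # trimmed R2 output
        r1, r2,
    ]


cmd = cutadapt_command(
    "test/data/fastq/R1.fq.gz", "test/data/fastq/R2.fq.gz",
    "trimmed_R1.fq.gz", "trimmed_R2.fq.gz",      # hypothetical output names
    "AGATCGGAAGAG", "AGATCGGAAGAG",
)
print(" ".join(cmd))
```

The resulting list could be handed to `subprocess.run(cmd, check=True)` on a system where cutadapt is installed.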
Input
The input for this module is one or more pairs of FastQ files per sample, specified in a PEP configuration file, as shown below. A sample with multiple pairs of FastQ files is listed on multiple rows, one row per pair.
| sample_name | R1 | R2 |
|---|---|---|
| TestSample1 | test/data/fastq/R1.fq.gz | test/data/fastq/R2.fq.gz |
| TestSample2 | test/data/fastq/R1.fq.gz | test/data/fastq/R2.fq.gz |
| TestSample2 | test/data/fastq/SRR8615409 chrM_1.fastq.gz | test/data/fastq/SRR8615409 chrM_2.fastq.gz |
| TestSample3 | test/data/fastq/R1.fq.gz | test/data/fastq/R2.fq.gz |
| TestSample3 | test/data/fastq/SRR8615409 chrM_1.fastq.gz | test/data/fastq/SRR8615409 chrM_2.fastq.gz |
| TestSample3 | test/data/fastq/SRR8615687_flt3_1.fastq.gz | test/data/fastq/SRR8615687_flt3_2.fastq.gz |
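To make the "multiple rows per sample" layout concrete, the following sketch groups the FastQ pairs by sample name. It assumes the sample table is stored as a CSV file with the columns `sample_name`, `R1` and `R2`. The function name `group_fastq_pairs` is hypothetical, and the rows are a subset of the table above.

```python
import csv
import io
from collections import defaultdict


def group_fastq_pairs(handle):
    """Map each sample_name to its list of (R1, R2) FastQ pairs."""
    pairs = defaultdict(list)
    for row in csv.DictReader(handle):
        pairs[row["sample_name"]].append((row["R1"], row["R2"]))
    return dict(pairs)


# A subset of the sample table, written as CSV (assumed file format)
sample_table = io.StringIO(
    "sample_name,R1,R2\n"
    "TestSample1,test/data/fastq/R1.fq.gz,test/data/fastq/R2.fq.gz\n"
    "TestSample3,test/data/fastq/R1.fq.gz,test/data/fastq/R2.fq.gz\n"
    "TestSample3,test/data/fastq/SRR8615687_flt3_1.fastq.gz,"
    "test/data/fastq/SRR8615687_flt3_2.fastq.gz\n"
)

groups = group_fastq_pairs(sample_table)
for sample, fastq_pairs in groups.items():
    print(sample, len(fastq_pairs))
```

Each entry in `groups` corresponds to one set of FastQ files that can be processed independently of the others.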
Output
The output of this module is one set of merged FastQ files per sample, as well as a JSON file with statistics.
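One simple way to merge gzip-compressed FastQ files is to concatenate them, since a sequence of gzip members is itself a valid gzip stream. The sketch below shows that approach. Whether this module merges files this way is an assumption, and `merge_fastq` is a hypothetical helper name.

```python
import shutil


def merge_fastq(inputs, output):
    """Concatenate gzip-compressed FastQ files into a single gzip file.

    The inputs are copied byte for byte in the order given, so the R1 and
    R2 files of a sample must be merged in the same order to keep read
    pairs in sync.
    """
    with open(output, "wb") as out:
        for path in inputs:
            with open(path, "rb") as src:
                shutil.copyfileobj(src, out)
```

For a sample with several pairs, the function would be called once with all R1 files and once with all R2 files.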
Configuration
You can automatically generate a configuration for this module using the utilities/create-config.py script.
Example
{
  "forward_adapter": "AGATCGGAAGAG",
  "reverse_adapter": "AGATCGGAAGAG"
}
Configuration options
The only configurable options for this module are the adapter sequences that cutadapt removes.
| Option | Description | Required |
|---|---|---|
| forward_adapter | The forward adapter sequence | yes |
| reverse_adapter | The reverse adapter sequence | yes |
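Since both options are required, a configuration can be checked before a run. The sketch below is a minimal illustration of such a check and is not part of the pipeline. The names `REQUIRED_OPTIONS` and `load_config` are hypothetical.

```python
import json

# Both adapter options are marked as required in the table above
REQUIRED_OPTIONS = ("forward_adapter", "reverse_adapter")


def load_config(text):
    """Parse a JSON configuration and verify the required options exist."""
    config = json.loads(text)
    missing = [key for key in REQUIRED_OPTIONS if key not in config]
    if missing:
        raise ValueError(f"Missing required option(s): {', '.join(missing)}")
    return config


config = load_config(
    '{"forward_adapter": "AGATCGGAAGAG", "reverse_adapter": "AGATCGGAAGAG"}'
)
print(config["forward_adapter"], config["reverse_adapter"])
```

A configuration that lacks either adapter raises a `ValueError` naming the missing option.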