itd
The itd module is responsible for finding Internal Tandem Duplications in select genes, specifically FLT3 and KMT2A.
Tools
First, this module uses bwa to align the trimmed reads to a custom reference, which contains the transcript sequences of FLT3 and KMT2A. Next, a custom tool, rose-dt, is used to detect and visualise Internal Tandem Duplications, using evidence from soft-clipped reads.
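To illustrate what "evidence from soft-clipped reads" can look like, the sketch below extracts soft-clip lengths from a CIGAR string and checks whether the clip breakpoint falls inside a region of interest. This is not the rose-dt implementation; the function names, the `min_clip` threshold and the use of 1-based, inclusive coordinates are assumptions made for this example only.

```python
import re

# Matches one CIGAR operation: a length followed by an operation code.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def soft_clips(cigar):
    """Return (left, right) soft-clip lengths for a CIGAR string."""
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]
    # Hard clips (H) may sit outside soft clips, so strip them first.
    while ops and ops[0][1] == "H":
        ops.pop(0)
    while ops and ops[-1][1] == "H":
        ops.pop()
    left = ops[0][0] if ops and ops[0][1] == "S" else 0
    right = ops[-1][0] if len(ops) > 1 and ops[-1][1] == "S" else 0
    return left, right


def clipped_in_region(pos, cigar, start, end, min_clip=5):
    """Report whether a read has a soft clip whose breakpoint lies in [start, end].

    pos is the 1-based leftmost aligned position, as in the SAM POS field.
    min_clip is a hypothetical threshold, not a rose-dt setting.
    """
    left, right = soft_clips(cigar)
    # Reference bases consumed by the alignment (M, D, N, = and X).
    ref_len = sum(int(n) for n, op in CIGAR_OP.findall(cigar) if op in "MDN=X")
    left_break = pos                 # first aligned base
    right_break = pos + ref_len - 1  # last aligned base
    return (left >= min_clip and start <= left_break <= end) or (
        right >= min_clip and start <= right_break <= end
    )
```

For example, `clipped_in_region(1800, "20S80M", 1787, 2024)` reports a clipped read inside the default FLT3 region, while the same read aligned at position 100 does not count as evidence.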
Input
The input for this module is a single pair of FastQ files per sample, specified in a PEP configuration file, as shown below.
| sample_name | R1 | R2 |
| --- | --- | --- |
| SRR8615687 | test/data/fastq/SRR8615687_flt3_1.fastq.gz | test/data/fastq/SRR8615687_flt3_2.fastq.gz |
| SRR8616218 | test/data/fastq/SRR8616218_KMT2A_1.fastq.gz | test/data/fastq/SRR8616218_KMT2A_2.fastq.gz |
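A quick way to check a sample table before running the module is sketched below. It assumes the PEP sample table is stored as a CSV file with the columns sample_name, R1 and R2; the `read_samples` helper is hypothetical and not part of the pipeline.

```python
import csv
import io

REQUIRED_COLUMNS = ("sample_name", "R1", "R2")


def read_samples(handle):
    """Parse a sample table and return {sample_name: (R1, R2)}."""
    reader = csv.DictReader(handle)
    missing = [c for c in REQUIRED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"sample table lacks column(s): {', '.join(missing)}")
    samples = {}
    for row in reader:
        name = row["sample_name"]
        # The module expects a single pair of FastQ files per sample.
        if name in samples:
            raise ValueError(f"sample {name} has more than one pair of FastQ files")
        if not row["R1"] or not row["R2"]:
            raise ValueError(f"sample {name} needs both R1 and R2")
        samples[name] = (row["R1"], row["R2"])
    return samples


# Assumed CSV form of the sample table used in this documentation.
example = io.StringIO(
    "sample_name,R1,R2\n"
    "SRR8615687,test/data/fastq/SRR8615687_flt3_1.fastq.gz,"
    "test/data/fastq/SRR8615687_flt3_2.fastq.gz\n"
    "SRR8616218,test/data/fastq/SRR8616218_KMT2A_1.fastq.gz,"
    "test/data/fastq/SRR8616218_KMT2A_2.fastq.gz\n"
)
samples = read_samples(example)
```

A sample name that appears twice, or a row without R2, raises a `ValueError` instead of silently overwriting an earlier entry.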
Output
The output of this module is a JSON file with an overview of the most important results, as well as a number of other output files:
For both FLT3 and KMT2A, a .csv file with the detected tandem duplications.
For both FLT3 and KMT2A, a figure to visualise the detected tandem duplications.
Configuration
The configuration for this module is tailored to the provided reference files, so be very careful if you modify any of these settings. You can automatically generate a configuration for this module using the utilities/create-config.py script.
{
  "fasta": "utilities/deps/small-files/itd/itd_genes.fa",
  "flt3_name": "FLT3-001",
  "flt3_start": 1787,
  "flt3_end": 2024,
  "kmt2a_name": "KMT2A-213",
  "kmt2a_start": 406,
  "kmt2a_end": 4769
}
Configuration options
| Option | Description | Required |
| --- | --- | --- |
| fasta | The fasta file containing the transcript sequences for FLT3 and KMT2A | yes |
| flt3_name | The name of the FLT3 sequence in the fasta file | yes |
| flt3_start | The start of the FLT3 region to investigate | yes |
| flt3_end | The end of the FLT3 region to investigate | yes |
| kmt2a_name | The name of the KMT2A sequence in the fasta file | yes |
| kmt2a_start | The start of the KMT2A region to investigate | yes |
| kmt2a_end | The end of the KMT2A region to investigate | yes |
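Since every option is required, a hand-edited configuration can be sanity-checked before use. The sketch below is a hypothetical helper, not part of the pipeline or of utilities/create-config.py. It assumes that names are strings, that coordinates are integers, and that each start must be smaller than the matching end; the last point is an assumption of this example rather than a documented rule.

```python
# Expected type per required option (an assumption based on the example above).
REQUIRED = {
    "fasta": str,
    "flt3_name": str,
    "flt3_start": int,
    "flt3_end": int,
    "kmt2a_name": str,
    "kmt2a_start": int,
    "kmt2a_end": int,
}


def check_itd_config(config):
    """Return a list of problems found in an itd configuration dict."""
    problems = []
    for key, expected in REQUIRED.items():
        if key not in config:
            problems.append(f"missing required option: {key}")
            continue
        value = config[key]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected):
            problems.append(f"{key} should be of type {expected.__name__}")
    for gene in ("flt3", "kmt2a"):
        start = config.get(f"{gene}_start")
        end = config.get(f"{gene}_end")
        # Assumed rule: the region must have a positive length.
        if isinstance(start, int) and isinstance(end, int) and start >= end:
            problems.append(f"{gene}_start must be smaller than {gene}_end")
    return problems


config = {
    "fasta": "utilities/deps/small-files/itd/itd_genes.fa",
    "flt3_name": "FLT3-001",
    "flt3_start": 1787,
    "flt3_end": 2024,
    "kmt2a_name": "KMT2A-213",
    "kmt2a_start": 406,
    "kmt2a_end": 4769,
}
problems = check_itd_config(config)
```

With the configuration shown in this section, `problems` is an empty list. Removing an option or swapping a start and end coordinate adds one message per issue.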