fusion module

This module uses Arriba to call fusion genes, based on the BAM file produced by the snv-indels module.

Tools

This module passes the BAM file produced by STAR to Arriba, which calls the fusion events.

The fusion events are filtered using the blacklist that ships with Arriba. Only fusions where at least one of the involved genes is listed in the report_genes file are included in the final output.
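The report_genes filter described above can be sketched in a few lines of Python. This is a minimal illustration, not the module's actual implementation. It assumes the Arriba output is a tab-separated file whose first two columns are named `#gene1` and `gene2`, and that gene names are compared by exact match. The function name `filter_fusions` and the example rows are hypothetical.

```python
import csv
import io

# Made-up rows for illustration only; real Arriba output has many more columns.
EXAMPLE_TSV = (
    "#gene1\tgene2\tconfidence\n"
    "BCR\tABL1\thigh\n"
    "GENE_A\tGENE_B\tlow\n"
    "RUNX1\tRUNX1T1\tmedium\n"
)


def filter_fusions(handle, report_genes):
    """Return the rows where at least one fusion partner is in report_genes."""
    reader = csv.DictReader(handle, delimiter="\t")
    return [
        row
        for row in reader
        # Assumption: exact match on the gene name in either column.
        if row["#gene1"] in report_genes or row["gene2"] in report_genes
    ]


if __name__ == "__main__":
    kept = filter_fusions(io.StringIO(EXAMPLE_TSV), {"ABL1", "RUNX1"})
    for row in kept:
        print(row["#gene1"], row["gene2"], sep="\t")
```

A fusion is kept as soon as one partner matches, so a fusion between a reported gene and an unlisted gene still appears in the output.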

For each fusion event that remains after filtering, we also generate a figure using the draw_fusions.R script provided by Arriba.

Input

The input for this module is a single BAM file per sample, generated by STAR and specified in a PEP configuration file, as shown below.

Example input for the fusion module

sample_name   bam
-----------   --------------------------------------
SRR8615409    test/data/bam/SRR8615409.snv-indel.bam
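As a rough sketch of the one-BAM-per-sample requirement, the snippet below reads a sample table and checks that every sample has exactly one BAM path. It assumes the sample table is a CSV file with the columns `sample_name` and `bam`. The helper `read_samples` is hypothetical and is not part of the module or of any PEP tooling.

```python
import csv
import io

# Assumed CSV form of the sample table shown in this section.
EXAMPLE_TABLE = (
    "sample_name,bam\n"
    "SRR8615409,test/data/bam/SRR8615409.snv-indel.bam\n"
)


def read_samples(handle):
    """Map each sample name to its single BAM path."""
    samples = {}
    for row in csv.DictReader(handle):
        name, bam = row["sample_name"], row["bam"]
        if not bam:
            raise ValueError(f"sample {name!r} has no BAM file")
        if name in samples:
            # A second row for the same sample would mean more than one BAM.
            raise ValueError(f"sample {name!r} is listed more than once")
        samples[name] = bam
    return samples


if __name__ == "__main__":
    for name, bam in read_samples(io.StringIO(EXAMPLE_TABLE)).items():
        print(name, bam, sep="\t")
```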

Output

The output of this module is a JSON file with an overview of the most important results, as well as a number of other output files:

- The final Arriba output file, after filtering.
- One figure per fusion event.

Configuration

You can automatically generate a configuration for the fusion module using the utilities/create-config.py script.

Example

{
  "genome_fasta": "test/data/reference/hamlet-ref.fa",
  "gtf": "test/data/reference/hamlet-ref.gtf",
  "blacklist": "test/data/reference/arriba/blacklist_hg38_GRCh38_v2.4.0.head.tsv.gz",
  "known_fusions": "test/data/reference/arriba/known_fusions_hg38_GRCh38_v2.4.0.tsv.gz",
  "report_genes": "utilities/deps/small-files/report_genes.txt",
  "cytobands": "utilities/deps/small-files/cytobands_hg38_GRCh38_v2.4.0.tsv",
  "protein_domains": "test/data/reference/arriba/protein_domains_hg38_GRCh38_v2.4.0.gff3"
}

Configuration options


Option            Description                                                  Required
---------------   ----------------------------------------------------------   --------
genome_fasta      Reference genome, in FASTA format                            yes
gtf               GTF file with transcript information                         yes
blacklist         File of blacklisted fusion events                            yes
known_fusions     A file of known fusion events                                yes
report_genes      Only report fusions involving genes specified in this file   yes
cytobands         A file with cytoband information for visualization           yes
protein_domains   A file with protein domains                                  yes
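Since every option is required, a configuration can be checked for completeness before the module is run. The snippet below is a minimal sketch of such a check, not something the pipeline is known to provide. The helper `missing_options` is hypothetical, and the example configuration is deliberately incomplete.

```python
import json

# Option names taken from the configuration options listed above.
REQUIRED_OPTIONS = (
    "genome_fasta",
    "gtf",
    "blacklist",
    "known_fusions",
    "report_genes",
    "cytobands",
    "protein_domains",
)

# Deliberately incomplete configuration, for illustration only.
EXAMPLE_CONFIG = json.loads(
    """
    {
      "genome_fasta": "test/data/reference/hamlet-ref.fa",
      "gtf": "test/data/reference/hamlet-ref.gtf"
    }
    """
)


def missing_options(config):
    """Return the required options that are absent or empty in config."""
    return [name for name in REQUIRED_OPTIONS if not config.get(name)]


if __name__ == "__main__":
    missing = missing_options(EXAMPLE_CONFIG)
    if missing:
        print("Missing required options:", ", ".join(missing))
```

The check only looks at the presence of each key. Whether the referenced files exist is a separate question and is not tested here.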