Workflow for Illumina Quality Control and Filtering
Version 1

Workflow Type: Common Workflow Language
Maturity: Stable

Multiple paired datasets are merged into a single paired dataset.
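
The merge can be pictured as a plain concatenation of the per-dataset FASTQ files. The sketch below is not the workflow's own merge tool; the function names `merge_fastq` and `merge_paired` are hypothetical. It assumes every part ends with a newline and that the parts are either all uncompressed or all gzip-compressed (concatenated gzip members still form a valid gzip file). Forward and reverse files must be given in the same dataset order so that mates stay at matching positions.

```python
import shutil
from pathlib import Path


def merge_fastq(parts, merged):
    """Concatenate FASTQ files byte for byte, in the order given."""
    with open(merged, "wb") as out:
        for part in parts:
            with open(part, "rb") as src:
                shutil.copyfileobj(src, out)
    return Path(merged)


def merge_paired(forward_parts, reverse_parts, out_fwd, out_rev):
    """Merge forward and reverse parts in the same order to keep read pairs in sync."""
    if len(forward_parts) != len(reverse_parts):
        raise ValueError("each forward file needs a matching reverse file")
    return merge_fastq(forward_parts, out_fwd), merge_fastq(reverse_parts, out_rev)
```

Refusing unequal list lengths catches the most common way of breaking pairing before any file is written.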

Summary:

  • FastQC on raw data files
  • fastp for read quality trimming
  • BBDuk for phiX and (optional) rRNA filtering
  • Kraken2 for taxonomic classification of reads (optional)
  • BBMap for (contamination) filtering using given references (optional)
  • FastQC on filtered (merged) data

Other UNLOCK workflows on WorkflowHub: https://workflowhub.eu/projects/16/workflows?view=default

All tool CWL files and other workflows can be found here:
https://gitlab.com/m-unlock/cwl

How to set up and use an UNLOCK workflow:
https://m-unlock.gitlab.io/docs/setup/setup.html

Inputs

ID                           Name                      Description                                                 Type
identifier                   identifier used           Identifier for this dataset used in this workflow           string
threads                      Number of threads         Number of threads to use for computational processes        int?
memory                       Maximum memory in MB      Maximum memory usage in megabytes                           int?
filter_rrna                  filter rRNA               Optionally remove rRNA sequences from the reads             boolean
forward_reads                Forward reads             Local forward-read FASTQ file(s)                            File[]
reverse_reads                Reverse reads             Local reverse-read FASTQ file(s)                            File[]
filter_references            Filter reference file(s)  Reference FASTA file(s) for filtering                       File[]?
deduplicate                  Deduplicate reads         Remove exact duplicate reads with fastp                     boolean?
kraken_database              Kraken2 database          Kraken2 database location; multiple databases are possible  Directory[]?
keep_reference_mapped_reads  Keep mapped reads         Keep reads that map to the given reference                  boolean
step                         Output Step number        Step number for output folder numbering                     int?
destination                  Output Destination        Optional output destination used for cwl-prov reporting     string?
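
A job file for these inputs can be generated rather than written by hand. The following is a minimal sketch, assuming the inputs without a trailing `?` are required and the rest may be omitted. The helper names (`cwl_file`, `cwl_dir`, `build_job`) and all file paths are hypothetical; only the input IDs and the CWL `File`/`Directory` object layout come from the table and the CWL standard.

```python
import json


def cwl_file(path):
    """CWL input object for a local file."""
    return {"class": "File", "path": path}


def cwl_dir(path):
    """CWL input object for a local directory."""
    return {"class": "Directory", "path": path}


def build_job(identifier, forward, reverse, *, filter_rrna=False,
              keep_reference_mapped_reads=False, threads=None, memory=None,
              deduplicate=None, filter_references=None, kraken_databases=None):
    """Build a job dictionary keyed by the workflow's input IDs."""
    if len(forward) != len(reverse):
        raise ValueError("forward_reads and reverse_reads must pair up")
    job = {
        "identifier": identifier,
        "forward_reads": [cwl_file(p) for p in forward],
        "reverse_reads": [cwl_file(p) for p in reverse],
        "filter_rrna": filter_rrna,
        "keep_reference_mapped_reads": keep_reference_mapped_reads,
    }
    # Optional scalar inputs are left out entirely when not set.
    optional = {"threads": threads, "memory": memory, "deduplicate": deduplicate}
    job.update({key: value for key, value in optional.items() if value is not None})
    if filter_references:
        job["filter_references"] = [cwl_file(p) for p in filter_references]
    if kraken_databases:
        job["kraken_database"] = [cwl_dir(p) for p in kraken_databases]
    return job


if __name__ == "__main__":
    # Hypothetical sample name and paths, for illustration only.
    job = build_job("sample1", ["s1_R1.fastq.gz"], ["s1_R2.fastq.gz"], threads=4)
    with open("job.json", "w") as handle:
        json.dump(job, handle, indent=2)
```

CWL runners such as cwltool accept a JSON job file as the second argument, after the path to the workflow's CWL file.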

Steps

ID                              Name                    Description
fastqc_illumina_before          FastQC before           Quality assessment and report of reads
fastq_merge_fwd                 Merge forward reads     Merge multiple forward FASTQ files into a single file
fastq_merge_rev                 Merge reverse reads     Merge multiple reverse FASTQ files into a single file
fastp                           fastp                   Read quality filtering and (barcode) trimming
rrna_filter                     rRNA filter (bbduk)     Filters rRNA sequences from reads using BBDuk
phix_filter                     PhiX filter (bbduk)     Filters Illumina spike-in PhiX sequences from reads using BBDuk
illumina_quality_kraken2        Kraken2                 Taxonomic classification of FASTQ reads
illumina_quality_kraken2_krona  Krona                   Visualization of Kraken2 classification with Krona
prepare_bbmap_db                Prepare references      Combine references into a single FASTA file with unique headers
reference_filter_illumina       Reference read mapping  Map reads against references using BBMap
fastqc_illumina_after           FastQC after            Quality assessment and report of reads
reports_files_to_folder         Reports to folder       Place fastp output files in a specific output folder
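
Which of the steps above run depends on the optional inputs. The sketch below models that as a simple function over a job dictionary. It is an assumption, not the workflow's actual CWL logic: the step order follows the table, and the conditions (rRNA filtering only when `filter_rrna` is true, Kraken2 and Krona only when `kraken_database` is given, reference filtering only when `filter_references` is given) are inferred from the "(optional)" notes in the summary. The function name `planned_steps` is hypothetical.

```python
def planned_steps(job):
    """Return the step IDs expected to run for a given job dictionary."""
    steps = ["fastqc_illumina_before", "fastq_merge_fwd", "fastq_merge_rev", "fastp"]
    if job.get("filter_rrna"):
        steps.append("rrna_filter")
    # PhiX filtering is listed without an "(optional)" note, so it always runs here.
    steps.append("phix_filter")
    if job.get("kraken_database"):
        steps += ["illumina_quality_kraken2", "illumina_quality_kraken2_krona"]
    if job.get("filter_references"):
        steps += ["prepare_bbmap_db", "reference_filter_illumina"]
    steps += ["fastqc_illumina_after", "reports_files_to_folder"]
    return steps
```

For example, a job with only the required inputs and `filter_rrna` set to false would skip `rrna_filter`, both Kraken2 steps, and both reference-filtering steps.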

Outputs

ID                Name                      Description                                                     Type
reports_folder    Filtering reports folder  Folder containing all reports of filtering and quality control  Directory
QC_forward_reads  Filtered forward read     Filtered forward read                                           File
QC_reverse_reads  Filtered reverse read     Filtered reverse read                                           File

Version History

Version 1 (earliest) Created 21st Apr 2022 at 14:00 by Bart Nijsse

Initial commit


Source: master branch, commit 5c2e0e5

Created: 21st Apr 2022 at 14:00

Last updated: 7th Apr 2023 at 15:02
