Align reads to the reference genome, call variants, and calculate quality metrics using Sentieon.
This workflow implements the Sentieon® Genomics software, a set of software tools that perform highly accurate and computationally efficient analysis of genomic data. This workflow performs read alignment, duplicate marking, base quality score recalibration (BQSR), and variant calling steps. The workflow is designed for use with a variety of reference genomes, which are downloaded as part of workflow execution. The workflow also computes quality metrics on the deduplicated alignments and produces various plots which can be used to quickly inspect sample quality.
The workflow can optionally output a gVCF rather than a VCF file, which can be combined with other sample gVCFs for use in joint genotyping.
The workflow can be run using either paired FASTQ or aligned BAM/CRAM files. If using the FASTQ entrypoint, read_groups must be defined. If using the BAM/CRAM entrypoint, input_aln_idx must be defined.
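The entrypoint rule above can be checked before a run is submitted. The following is a minimal sketch in Python, not part of the workflow itself. Only read_groups and input_aln_idx are names taken from this description; the keys r1_fastq, r2_fastq, and input_aln are hypothetical placeholders for the FASTQ and alignment inputs. Rejecting a mix of both input types is also an assumption, based on the "either ... or" wording.

```python
from typing import Any, Mapping


def check_entrypoint(inputs: Mapping[str, Any]) -> str:
    """Return "fastq" or "alignment", or raise ValueError for inconsistent inputs.

    r1_fastq, r2_fastq, and input_aln are hypothetical key names;
    read_groups and input_aln_idx come from the workflow description.
    """
    has_r1 = bool(inputs.get("r1_fastq"))
    has_r2 = bool(inputs.get("r2_fastq"))
    has_aln = bool(inputs.get("input_aln"))

    # Assumption: the two entrypoints are mutually exclusive.
    if (has_r1 or has_r2) and has_aln:
        raise ValueError("provide FASTQ files or BAM/CRAM files, not both")

    if has_r1 or has_r2:
        # The FASTQ entrypoint expects paired files and read groups.
        if not (has_r1 and has_r2):
            raise ValueError("both R1 and R2 FASTQ files are required")
        if not inputs.get("read_groups"):
            raise ValueError("read_groups must be defined for the FASTQ entrypoint")
        return "fastq"

    if has_aln:
        # The BAM/CRAM entrypoint expects an index for the alignment files.
        if not inputs.get("input_aln_idx"):
            raise ValueError("input_aln_idx must be defined for the BAM/CRAM entrypoint")
        return "alignment"

    raise ValueError("no FASTQ or BAM/CRAM inputs were provided")
```

Running a check like this on the inputs before submission surfaces a missing read group or index file early, rather than after the run has started.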
R1 fastq files
R2 fastq files
Sample read groups
Input alignment (BAM/CRAM) files
Input alignment (BAM/CRAM) index files
The name of the human reference genome build. (‘hg38_alt’, ‘hg38_gatk’, ‘hg38’, ‘hg38_noalt’, ‘hs38’, ‘b37_gatk’, ‘b37’, ‘hs37d5’, ‘hg19’, ‘ucsc_hg19’)
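As an illustration, the reference build name can be validated against the values listed above before submitting a run. This is a sketch only: the function name check_reference_name is hypothetical, and the assumption that names are matched exactly (case-sensitive) is not confirmed by this description.

```python
# Build names copied from the parameter description above.
SUPPORTED_BUILDS = frozenset({
    "hg38_alt", "hg38_gatk", "hg38", "hg38_noalt", "hs38",
    "b37_gatk", "b37", "hs37d5", "hg19", "ucsc_hg19",
})


def check_reference_name(name: str) -> str:
    """Return the name unchanged if it is a listed build, else raise ValueError.

    Assumption: matching is exact and case-sensitive.
    """
    if name not in SUPPORTED_BUILDS:
        allowed = ", ".join(sorted(SUPPORTED_BUILDS))
        raise ValueError(f"unsupported reference build {name!r}; expected one of: {allowed}")
    return name
```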
Output variant calls in the gVCF format instead of VCF [false]
The Sentieon DNAscope variant calling model
Your account’s AWS canonical user ID. Used to acquire a Sentieon license
Sentieon docker image
Number of vCPUs to allocate for the task 
Memory to allocate for the task [64 GiB]
The workflow produces variant calls in either VCF or gVCF format. Other outputs depend on the options selected.
Variant calls in VCF or gVCF format
Variant calls index
Metrics and reads files are produced if
Sample recal table output by running BQSR. Base quality score recalibration will run if no custom
The latest version of the Sentieon Docker image can be run by following the instructions listed here.