Sentieon Long Read Germline Whole Genome Sequencing Analysis

Perform alignment and variant calling for SNPs, small indels, and structural variants using Sentieon.

This workflow uses the Sentieon® Genomics software, a set of tools for highly accurate and computationally efficient analysis of genomic data. It performs read alignment, SNP and small indel calling, and structural variant calling on long read data. The workflow supports a variety of reference genomes, which are downloaded as part of workflow execution.

The workflow outputs both small and structural variant calls.

This workflow was developed by the Sentieon development team and is written in Workflow Description Language (WDL). Further documentation can be found here.

Sentieon Long Read Germline Whole Genome Sequencing Analysis workflow diagram

Workflow Inputs

The long read pipeline is capable of processing either PacBio or Oxford Nanopore long reads. Either read type may be used as the fastq input.

| Input | Description |
| --- | --- |
| fastq | Long read FASTQ files, generated by either PacBio or Oxford Nanopore sequencers |
| read_groups | Sample read groups |
| reference_name | The name of the human reference genome build. (‘hg38_alt’, ‘hg38_gatk’, ‘hg38’, ‘hg38_noalt’, ‘hs38’, ‘b37_gatk’, ‘b37’, ‘hs37d5’, ‘hg19’, ‘ucsc_hg19’) |
| dnascope_lr_model | The DNAscope LongRead model file. If provided, small variant calls will be output |
| longreadsv_model | The LongReadSV model file |
| canonical_user_id | Your account’s AWS canonical user ID. Used to acquire a Sentieon license |
| sentieon_docker | Sentieon docker image |
| n_threads | Number of vCPUs to allocate for the task [32] |
| memory | Memory to allocate for the task [64 GiB] |
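
As an illustration of how these inputs fit together, the following minimal Python sketch assembles an inputs document and checks reference_name against the builds listed above. The input names come from the table. Everything else is an assumption: the build_inputs helper is hypothetical, the file paths and read group string are placeholders, and the types (for example, whether fastq and read_groups are lists) should be checked against the workflow's own parameter definitions. n_threads and memory are omitted here and left at whatever values the workflow uses when they are not set.

```python
import json

# Reference builds listed in the reference_name row of the input table.
SUPPORTED_REFERENCES = {
    "hg38_alt", "hg38_gatk", "hg38", "hg38_noalt", "hs38",
    "b37_gatk", "b37", "hs37d5", "hg19", "ucsc_hg19",
}


def build_inputs(fastq, read_groups, reference_name, longreadsv_model,
                 canonical_user_id, sentieon_docker, dnascope_lr_model=None):
    """Return a dict of workflow inputs keyed by the documented input names.

    Hypothetical helper: the value types are assumptions, not the
    workflow's confirmed parameter types.
    """
    if reference_name not in SUPPORTED_REFERENCES:
        raise ValueError(f"unsupported reference_name: {reference_name!r}")

    inputs = {
        "fastq": list(fastq),              # assumption: a list of FASTQ locations
        "read_groups": list(read_groups),  # assumption: a list of read group strings
        "reference_name": reference_name,
        "longreadsv_model": longreadsv_model,
        "canonical_user_id": canonical_user_id,
        "sentieon_docker": sentieon_docker,
    }
    # Small variant calls are only output when a DNAscope LongRead model is given.
    if dnascope_lr_model is not None:
        inputs["dnascope_lr_model"] = dnascope_lr_model
    return inputs


if __name__ == "__main__":
    example = build_inputs(
        fastq=["s3://example-bucket/sample1.fastq.gz"],        # placeholder path
        read_groups=["<read group for sample1>"],              # placeholder value
        reference_name="hg38",
        longreadsv_model="s3://example-bucket/longreadsv.model",  # placeholder path
        canonical_user_id="<your canonical user ID>",
        sentieon_docker="<Sentieon image URI>",
    )
    print(json.dumps(example, indent=2))
```

Validating reference_name before submission catches a mistyped build name locally rather than after the run has started.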

Workflow Outputs

The Sentieon long read germline pipeline outputs aligned reads and a long read structural variant VCF. If a DNAscope long read model is provided, it will also output a VCF containing SNPs and small indels.

| Output | Description |
| --- | --- |
| aligned_reads | Aligned and duplicate marked reads |
| aligned_index | Index for aligned_reads |
| sv_vcf | Long read structural variant VCF |
| sv_vcf_tbi | Index for sv_vcf |
| calls_vcf | SNP and small indel call VCF; output if dnascope_lr_model is provided |
| calls_vcf_tbi | Index for calls_vcf; output if dnascope_lr_model is provided |
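
The dependence of the outputs on dnascope_lr_model can be sketched as a small check, for example to confirm after a run that every expected file is present. This is only an illustration: the output names come from the table above, while the expected_outputs and missing_outputs helpers are hypothetical and not part of the workflow.

```python
# Outputs that are always produced, per the output table.
ALWAYS_OUTPUT = ["aligned_reads", "aligned_index", "sv_vcf", "sv_vcf_tbi"]

# Outputs produced only when dnascope_lr_model is provided.
SMALL_VARIANT_OUTPUT = ["calls_vcf", "calls_vcf_tbi"]


def expected_outputs(dnascope_lr_model_provided):
    """Return the output names expected for a run (hypothetical helper)."""
    outputs = list(ALWAYS_OUTPUT)
    if dnascope_lr_model_provided:
        outputs.extend(SMALL_VARIANT_OUTPUT)
    return outputs


def missing_outputs(produced, dnascope_lr_model_provided):
    """Return the expected output names that are absent from `produced`."""
    produced = set(produced)
    return [name for name in expected_outputs(dnascope_lr_model_provided)
            if name not in produced]


if __name__ == "__main__":
    # A run without a DNAscope LongRead model should not report calls_vcf as missing.
    print(missing_outputs(ALWAYS_OUTPUT, dnascope_lr_model_provided=False))
```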

Containers

The latest version of the Sentieon Docker image can be run by following the instructions listed here.