01. GATK Best Practices SnakeMake Workflow for Tumor Variant Calling: Workflow Overview

Original article
Author: 生信探索
Published: 2023-05-29 16:07:56
Column: 生信探索

Code repository

https://jihulab.com/BioQuest/smkhss
https://github.com/BioQuestX/smkhss

GATK best practices workflow: pipeline summary

SnakeMake workflow for human somatic short variants (SNVs and indels)

Expected fastq inputs

Matched normal and tumor samples.
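
As a minimal sketch of what "matched" means in practice, the snippet below groups FASTQ files into tumor/normal pairs per patient. It assumes the `<patient>.<N|T>.fastq.gz` naming seen in the `raw` directory of the output tree (`P1.N.fastq.gz`, `P1.T.fastq.gz`); the function name `pair_samples` is hypothetical, and the workflow itself most likely reads this information from `config/samples.tsv` instead.

```python
from collections import defaultdict
from pathlib import PurePosixPath

SUFFIX = ".fastq.gz"


def pair_samples(fastq_paths):
    """Group files named <patient>.<N|T>.fastq.gz into matched pairs."""
    pairs = defaultdict(dict)
    for path in fastq_paths:
        name = PurePosixPath(path).name
        if not name.endswith(SUFFIX):
            continue  # ignore anything that is not a gzipped FASTQ
        # "P1.N" -> patient "P1", kind "N"
        patient, _, kind = name[: -len(SUFFIX)].rpartition(".")
        if kind not in ("N", "T") or not patient:
            raise ValueError(f"unexpected file name: {name!r}")
        pairs[patient]["normal" if kind == "N" else "tumor"] = path
    # A patient is usable only when both halves of the pair are present.
    incomplete = sorted(p for p, d in pairs.items() if len(d) != 2)
    if incomplete:
        raise ValueError(f"missing matched sample for: {incomplete}")
    return dict(pairs)


pairs = pair_samples(["raw/P1.N.fastq.gz", "raw/P1.T.fastq.gz"])
print(pairs)
```

Failing early on an unpaired sample is a design choice of this sketch: the calling steps further down need both the tumor and the normal file for every patient.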

Reference

  1. Reference genome related files and GATK bundle files (GATK)
  2. VEP variation annotation files (VEP)

Prepare

  1. Adapter trimming (Fastp)
  2. Alignment (BWA-MEM2)
  3. Mark duplicates (samblaster)
  4. Generate the recalibration table for Base Quality Score Recalibration (BaseRecalibrator)
  5. Apply base quality score recalibration (ApplyBQSR)
  6. Merge the CRAM files of each sample separately (Picard)
  7. Create the CRAM index (samtools)
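
Steps 4, 5 and 7 of the list above can be sketched as plain command lines. The helper below only builds the argument lists; it does not run anything. All file paths and the function name `bqsr_commands` are assumptions made for illustration and are not taken from the repository. The merge in step 6 is left out.

```python
import shlex


def bqsr_commands(sample, ref, known_sites):
    """Build BaseRecalibrator, ApplyBQSR and samtools index commands for one sample."""
    dedup = f"results/prepared/{sample}.dedup.bam"    # hypothetical input path
    table = f"results/prepared/{sample}.recal.table"  # hypothetical path
    cram = f"results/prepared/{sample}.cram"          # hypothetical output path

    # Step 4: model systematic base-quality errors against known variant sites.
    recal = ["gatk", "BaseRecalibrator", "-R", ref, "-I", dedup]
    for vcf in known_sites:
        recal += ["--known-sites", vcf]
    recal += ["-O", table]

    # Step 5: rewrite base qualities using the table from step 4.
    apply_bqsr = ["gatk", "ApplyBQSR", "-R", ref, "-I", dedup,
                  "--bqsr-recal-file", table, "-O", cram]

    # Step 7: index the resulting CRAM.
    index = ["samtools", "index", cram]
    return [recal, apply_bqsr, index]


commands = bqsr_commands("P1.T", "ref/genome.fa",
                         ["ref/dbsnp.vcf.gz", "ref/known_indels.vcf.gz"])
for cmd in commands:
    print(shlex.join(cmd))
```

Keeping each command as a list rather than a formatted string avoids quoting problems if a path contains spaces; `shlex.join` is used only for display.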

Quality control report

  1. Fastp report (MultiQC)
  2. Alignment report (MultiQC)

Call

  1. Call somatic SNVs and indels via local assembly of haplotypes (Mutect2)
  2. Tabulate pileup metrics for inferring contamination (GetPileupSummaries)
  3. Calculate the fraction of reads coming from cross-sample contamination (CalculateContamination)
  4. Get the maximum likelihood estimates of artifact prior probabilities in the orientation bias mixture model filter (LearnReadOrientationModel)
  5. Filter somatic SNVs and indels called by Mutect2 (FilterMutectCalls)
  6. Merge all the VCF files (Picard)
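
To show how the outputs of steps 1 to 4 feed into the filter in step 5, here is a sketch that builds the GATK command lines for one tumor/normal pair. The file paths, the function name `call_commands`, and the assumption that the normal sample's read-group `SM` tag is `<patient>.N` are all hypothetical; the repository's rules may use different names and may scatter Mutect2 over intervals, which is what the VCF merge in step 6 suggests.

```python
import shlex


def call_commands(patient, ref, germline, common_sites, regions):
    """Build the Mutect2 calling and filtering commands for one tumor/normal pair."""
    tumor = f"results/prepared/{patient}.T.cram"   # hypothetical path
    normal = f"results/prepared/{patient}.N.cram"  # hypothetical path
    out = f"results/called/{patient}"              # hypothetical output prefix

    cmds = {}
    cmds["Mutect2"] = [
        "gatk", "Mutect2", "-R", ref, "-L", regions,
        "-I", tumor, "-I", normal,
        "-normal", f"{patient}.N",  # assumed SM tag of the normal sample
        "--germline-resource", germline,
        "--f1r2-tar-gz", f"{out}.f1r2.tar.gz",  # raw data for the orientation model
        "-O", f"{out}.unfiltered.vcf.gz",
    ]
    for kind, cram in (("T", tumor), ("N", normal)):
        cmds[f"GetPileupSummaries.{kind}"] = [
            "gatk", "GetPileupSummaries", "-R", ref, "-I", cram,
            "-V", common_sites, "-L", common_sites,
            "-O", f"{out}.{kind}.pileups.table",
        ]
    cmds["CalculateContamination"] = [
        "gatk", "CalculateContamination",
        "-I", f"{out}.T.pileups.table",
        "-matched", f"{out}.N.pileups.table",
        "-O", f"{out}.contamination.table",
    ]
    cmds["LearnReadOrientationModel"] = [
        "gatk", "LearnReadOrientationModel",
        "-I", f"{out}.f1r2.tar.gz",
        "-O", f"{out}.orientation.tar.gz",
    ]
    # The filter consumes the raw calls, the contamination estimate and the priors.
    cmds["FilterMutectCalls"] = [
        "gatk", "FilterMutectCalls", "-R", ref,
        "-V", f"{out}.unfiltered.vcf.gz",
        "--contamination-table", f"{out}.contamination.table",
        "--ob-priors", f"{out}.orientation.tar.gz",
        "-O", f"{out}.filtered.vcf.gz",
    ]
    return cmds


cmds = call_commands("P1", "ref/genome.fa", "ref/af-only-gnomad.vcf.gz",
                     "ref/common_sites.vcf.gz", "config/captured_regions.bed")
for name, cmd in cmds.items():
    print(f"# {name}\n{shlex.join(cmd)}")
```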

Annotation

Annotate variant calls with VEP (VEP)

SnakeMake Report

Outputs

├── config
│  ├── captured_regions.bed
│  ├── config.yaml
│  └── samples.tsv
├── dag.svg
├── logs
│  ├── annotate
│  ├── call
│  ├── prepare
│  ├── qc
│  ├── ref
│  └── trim
├── raw
│  ├── P1.N.fastq.gz
│  └── P1.T.fastq.gz
├── report
│  ├── fastp_multiqc_data
│  ├── fastp_multiqc.html
│  ├── P1.N.fastp.html
│  ├── P1.N.fastp.json
│  ├── P1.T.fastp.html
│  ├── P1.T.fastp.json
│  ├── prepare_multiqc_data
│  ├── prepare_multiqc.html
│  └── vep_report.html
├── results
│  ├── annotated
│  ├── called
│  ├── prepared
│  └── trimmed
└── workflow
    ├── envs
    ├── report
    ├── rules
    ├── schemas
    ├── scripts
    └── Snakefile
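
The `report` directory in the tree follows a regular pattern: two fastp files per sample plus three workflow-level reports. A small sketch, with the hypothetical helper name `expected_reports`, that derives the expected report files from a list of sample names:

```python
from pathlib import PurePosixPath


def expected_reports(samples, report_dir="report"):
    """List the report files shown in the output tree for the given samples."""
    root = PurePosixPath(report_dir)
    files = []
    for sample in samples:
        # One HTML and one JSON fastp report per sample.
        files.append(root / f"{sample}.fastp.html")
        files.append(root / f"{sample}.fastp.json")
    # Workflow-level reports, independent of the sample list.
    files += [
        root / "fastp_multiqc.html",
        root / "prepare_multiqc.html",
        root / "vep_report.html",
    ]
    return [str(p) for p in files]


reports = expected_reports(["P1.N", "P1.T"])
print("\n".join(reports))
```

After a run, comparing this list against the files actually present (for example with `pathlib.Path.exists`) is a quick way to spot a step that did not finish.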

Directed Acyclic Graph
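
The step lists above imply an ordering that can be written down as a dependency graph and sorted topologically. The sketch below uses Python's standard `graphlib` module (Python 3.9 or later). The edges are an assumption read off the step descriptions, not extracted from the repository's `dag.svg`, and the node names are illustrative labels rather than actual rule names.

```python
from graphlib import TopologicalSorter

# Each key depends on the steps in its value set (assumed edges).
dependencies = {
    "fastp": set(),
    "bwa_mem2": {"fastp"},
    "samblaster": {"bwa_mem2"},
    "BaseRecalibrator": {"samblaster"},
    "ApplyBQSR": {"samblaster", "BaseRecalibrator"},
    "merge_crams": {"ApplyBQSR"},
    "index_cram": {"merge_crams"},
    "Mutect2": {"index_cram"},
    "GetPileupSummaries": {"index_cram"},
    "CalculateContamination": {"GetPileupSummaries"},
    "LearnReadOrientationModel": {"Mutect2"},
    "FilterMutectCalls": {
        "Mutect2",
        "CalculateContamination",
        "LearnReadOrientationModel",
    },
    "merge_vcfs": {"FilterMutectCalls"},
    "VEP": {"merge_vcfs"},
}

# static_order() yields every step after all of its dependencies;
# it raises graphlib.CycleError if the graph is not acyclic.
order = list(TopologicalSorter(dependencies).static_order())
for position, step in enumerate(order, start=1):
    print(position, step)
```

Snakemake derives the same kind of graph automatically from the input and output files of its rules, so a hand-written graph like this is useful only as a mental model of the workflow.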

References

https://gatk.broadinstitute.org/hc/en-us/articles/360035894731-Somatic-short-variant-discovery-SNVs-Indels-

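Original content statement: this article is published on the Tencent Cloud Developer Community with the author's authorization and may not be reproduced without permission.

In case of infringement, contact cloudcommunity@tencent.com for removal.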
