
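For a workflow, a directed acyclic graph (DAG) is a very good way to visualize what is going on.
It shows at a glance how each file is processed: which format it starts in and which format it finally ends up in.
First, define our rules (Snakemake reads them from a file named Snakefile by default):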
rule bwa_map:
    input:
        "data/genome.fa",
        "data/samples/{sample}.fastq"
    output:
        "mapped_reads/{sample}.bam"
    shell:
        "bwa mem {input} | samtools view -Sb - > {output}"

rule samtools_sort:
    input:
        "mapped_reads/{sample}.bam"
    output:
        "sorted_reads/{sample}.bam"
    shell:
        "samtools sort -T sorted_reads/{wildcards.sample} "
        "-O bam {input} > {output}"

rule samtools_index:
    input:
        "sorted_reads/{sample}.bam"
    output:
        "sorted_reads/{sample}.bam.bai"
    shell:
        "samtools index {input}"
Then create some empty placeholder input files:
mkdir -p data/samples
touch data/genome.fa data/samples/{A..D}.fastq
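Before asking Snakemake for the graph, it can help to see how a requested target is traced back through the rules. The following is only a simplified sketch of that idea, not Snakemake's actual implementation: the `RULES` table, `match_output` and `plan` are hypothetical names made up for this illustration, and the matching assumes each wildcard appears at most once per pattern.

```python
import re

# Simplified copies of the three rules above: (name, input patterns, output pattern).
RULES = [
    ("bwa_map",
     ["data/genome.fa", "data/samples/{sample}.fastq"],
     "mapped_reads/{sample}.bam"),
    ("samtools_sort",
     ["mapped_reads/{sample}.bam"],
     "sorted_reads/{sample}.bam"),
    ("samtools_index",
     ["sorted_reads/{sample}.bam"],
     "sorted_reads/{sample}.bam.bai"),
]


def match_output(pattern, target):
    """Return the wildcard values if target matches pattern, else None."""
    # re.split with a capturing group puts wildcard names at odd indices.
    parts = re.split(r"\{(\w+)\}", pattern)
    regex = "".join(
        f"(?P<{p}>.+)" if i % 2 else re.escape(p)
        for i, p in enumerate(parts)
    )
    m = re.fullmatch(regex, target)
    return m.groupdict() if m else None


def plan(target, jobs=None):
    """Walk backwards from target and collect the jobs needed to build it."""
    jobs = [] if jobs is None else jobs
    for name, inputs, output in RULES:
        wildcards = match_output(output, target)
        if wildcards is None:
            continue
        # Fill the wildcard values into the inputs and resolve those first.
        for path in (p.format(**wildcards) for p in inputs):
            plan(path, jobs)
        jobs.append((name, wildcards))
        break
    # Files that no rule produces (e.g. data/genome.fa) add no job.
    return jobs


if __name__ == "__main__":
    for name, wildcards in plan("sorted_reads/A.bam.bai"):
        print(name, wildcards)
```

For the target `sorted_reads/A.bam.bai` the sketch lists `bwa_map`, `samtools_sort` and `samtools_index` in that order, each with `sample` set to `A`, which is the same chain the DAG displays.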
Now try the --dag option:
snakemake --dag sorted_reads/{A,B}.bam.bai
Run on its own, the command prints the graph as text in Graphviz DOT format:
$ snakemake --dag sorted_reads/{A,B}.bam.bai
Building DAG of jobs...
digraph snakemake_dag {
graph[bgcolor=white, margin=0];
node[shape=box, style=rounded, fontname=sans, fontsize=10, penwidth=2];
edge[penwidth=2, color=grey];
0[label = "samtools_index", color = "0.44 0.6 0.85", style="rounded"];
1[label = "samtools_sort", color = "0.22 0.6 0.85", style="rounded"];
2[label = "bwa_map\nsample: A", color = "0.00 0.6 0.85", style="rounded"];
3[label = "samtools_index", color = "0.44 0.6 0.85", style="rounded"];
4[label = "samtools_sort", color = "0.22 0.6 0.85", style="rounded"];
5[label = "bwa_map\nsample: B", color = "0.00 0.6 0.85", style="rounded"];
1 -> 0
2 -> 1
4 -> 3
5 -> 4
}
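The DOT text above is easy to read by machine as well. As a rough sketch (assuming Python 3.9+ for `graphlib`, and with regular expressions tailored only to the output shown here rather than to DOT in general), the nodes and edges can be extracted and put into one valid execution order:

```python
import re
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Node and edge lines copied from the output above (raw string keeps "\n" literal).
DOT = r'''
0[label = "samtools_index", color = "0.44 0.6 0.85", style="rounded"];
1[label = "samtools_sort", color = "0.22 0.6 0.85", style="rounded"];
2[label = "bwa_map\nsample: A", color = "0.00 0.6 0.85", style="rounded"];
3[label = "samtools_index", color = "0.44 0.6 0.85", style="rounded"];
4[label = "samtools_sort", color = "0.22 0.6 0.85", style="rounded"];
5[label = "bwa_map\nsample: B", color = "0.00 0.6 0.85", style="rounded"];
1 -> 0
2 -> 1
4 -> 3
5 -> 4
'''

# Map node id -> rule name (the part of the label before the literal "\n").
labels = {
    int(node): label.split("\\n")[0]
    for node, label in re.findall(r'(\d+)\[label = "([^"]*)"', DOT)
}

# An edge "a -> b" means job a must finish before job b can start.
edges = [(int(a), int(b)) for a, b in re.findall(r"(\d+) -> (\d+)", DOT)]

sorter = TopologicalSorter()
for node in labels:
    sorter.add(node)
for src, dst in edges:
    sorter.add(dst, src)  # dst depends on src

order = list(sorter.static_order())

if __name__ == "__main__":
    for node in order:
        print(node, labels[node])
```

Any order it prints runs each `bwa_map` before its `samtools_sort` and each `samtools_sort` before its `samtools_index`; the two sample chains are independent, so their relative order may vary.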
We can render this text as an image with the dot command from Graphviz.
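# if dot is not installed: conda install -y graphviz
mkdir -p output  # the redirect below fails if this directory does not exist
snakemake --dag sorted_reads/{A,B}.bam.bai | dot -Tpng > output/dag.png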

The resulting picture looks pretty clean and fresh.