First download and install conda. The installer is a .sh file, so you can run it
directly with bash. Then create the environment:
conda create -n wes python=2 bwa
conda info --envs
source activate wes  # activate the wes environment
conda install sra-tools
conda install samtools
conda install bcftools
conda install vep
conda install snpEFF
conda install multiqc
conda install qualimap
You can also install several packages in one go:
conda install -y vep bcftools multiqc
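The separate installs above can be collapsed into a single command. The sketch below is only an illustration under stated assumptions: it prints the command and installs nothing unless RUN=1 is set (RUN is a hypothetical switch, not a conda feature), and it reuses the package names exactly as written above. Depending on your channel configuration, some of them may need a different name or an extra channel.

```shell
# Package names copied from the install steps above.
pkgs="sra-tools samtools bcftools vep snpEFF multiqc qualimap"

# Build one install command instead of seven separate ones.
cmd="conda install -y $pkgs"
echo "$cmd"

# RUN is a hypothetical opt-in switch: nothing is installed unless RUN=1
# is set and conda is actually on PATH.
if [ "${RUN:-0}" = "1" ] && command -v conda >/dev/null 2>&1; then
    $cmd
fi
```

Printing first makes it easy to review the package list before committing to a long dependency solve.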
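Download GATK4. A download manager such as Xunlei is faster (see the GATK4 download page), or you can fetch it directly with wget (slow on my connection):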
wget https://github.com/broadinstitute/gatk/releases/download/4.1.2.0/gatk-4.1.2.0.zip
Once the download finishes (or after copying the Xunlei download into the gatk4 folder), unzip it:
$ unzip gatk-4.1.2.0.zip
$ cd gatk-4.1.2.0
$ ./gatk
Note: you need to download about 100 GB of required resource data before you can use GATK.
$ echo 'export PATH=/home/kelly/biosoft/gatk/gatk-4.1.2.0/:$PATH' >>~/.bashrc
$ source ~/.bashrc
$ gatk --help
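Before moving on, it can help to confirm that everything installed so far is actually reachable from the shell. The following is a minimal sketch; `check_tools` is a hypothetical helper name, and the list of tools is simply the ones mentioned in the steps above.

```shell
# Report which of the given commands are missing from PATH.
# Returns 0 when all are found, 1 otherwise.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "ok: $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# A non-zero status just means something still needs installing
# or adding to PATH (for example via the ~/.bashrc line above).
check_tools gatk bwa samtools bcftools || echo "some tools are not on PATH yet"
```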
I downloaded the data on a server and then copied it to my local machine.
Download option 1: download directly from the GATK site at ftp://ftp.broadinstitute.org/bundle/
Download option 2: also the official site, but via anonymous FTP login:
location: ftp.broadinstitute.org/bundle
username: gsapubftp-anonymous
password:
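Option 2 can also be scripted rather than done through an FTP client. The sketch below builds the URL for a bundle file and prints a wget command that logs in with the anonymous user shown above and an empty password. The `hg38` subdirectory and the helper name `bundle_url` are assumptions for illustration; check the actual directory layout on the server, and that the server is reachable, before relying on it.

```shell
# Login details from above; the password is left empty.
FTP_HOST="ftp.broadinstitute.org"
FTP_USER="gsapubftp-anonymous"

# bundle_url is a hypothetical helper; the hg38 subdirectory is an assumption.
bundle_url() {
    echo "ftp://$FTP_HOST/bundle/hg38/$1"
}

# Print (rather than run) one wget command per file.
# -c resumes a partial download; remove the leading "echo" to actually fetch.
for f in dbsnp_146.hg38.vcf.gz dbsnp_146.hg38.vcf.gz.tbi; do
    echo wget -c --ftp-user="$FTP_USER" --ftp-password="" "$(bundle_url "$f")"
done
```

Resuming with -c matters here because several of the bundle files are multiple gigabytes.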
After downloading, the hg38 folder (including its bwa_index subfolder) contains the following files:
kelly@DESKTOP-MRA1M1F:/mnt/f/kelly/bioTree/server/wesproject/hg38$ tree -h
.
├── [1.8G] 1000G_phase1.snps.high_confidence.hg38.vcf.gz
├── [2.0M] 1000G_phase1.snps.high_confidence.hg38.vcf.gz.tbi
├── [4.0K] bwa_index
│ ├── [ 20K] gatk_hg38.amb
│ ├── [445K] gatk_hg38.ann
│ ├── [3.0G] gatk_hg38.bwt
│ ├── [767M] gatk_hg38.pac
│ ├── [1.5G] gatk_hg38.sa
│ ├── [6.2K] hg38.bwa_index.log
│ ├── [ 0] index.129248.err
│ ├── [ 0] index.129248.out
│ └── [ 566] run.sh
├── [4.0K] cnv
│ ├── [5.8M] hg38.chr.bed
│ ├── [6.7M] list.interval_list
│ ├── [ 843] readme.txt
│ └── [5.7M] targets.preprocessed.interval.list
├── [3.2G] dbsnp_146.hg38.vcf.gz
├── [3.0M] dbsnp_146.hg38.vcf.gz.tbi
├── [ 59M] hapmap_3.3.hg38.vcf.gz
├── [1.5M] hapmap_3.3.hg38.vcf.gz.tbi
├── [568K] Homo_sapiens_assembly38.dict
├── [3.0G] Homo_sapiens_assembly38.fasta
├── [157K] Homo_sapiens_assembly38.fasta.fai
├── [ 20M] Mills_and_1000G_gold_standard.indels.hg38.vcf.gz
├── [1.4M] Mills_and_1000G_gold_standard.indels.hg38.vcf.gz.tbi
└── [1018] run.sh
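A quick completeness check can catch an interrupted download or copy before the pipeline does. The sketch below assumes the layout in the listing above: each .vcf.gz has a matching .tbi index, the reference FASTA has its .fai and .dict, and bwa_index holds the five gatk_hg38.* files. `check_bundle` is a hypothetical helper name, and the directory argument is whatever path you copied the data to.

```shell
# check_bundle DIR: print every expected file that is absent.
# Returns 0 when nothing is missing, 1 otherwise.
check_bundle() {
    dir=$1
    status=0
    # Every compressed VCF should sit next to its tabix index.
    for vcf in "$dir"/*.vcf.gz; do
        [ -e "$vcf" ] || continue    # the glob matched nothing
        [ -e "$vcf.tbi" ] || { echo "missing: $vcf.tbi"; status=1; }
    done
    # Reference FASTA plus its .fai index and sequence dictionary.
    for f in Homo_sapiens_assembly38.fasta \
             Homo_sapiens_assembly38.fasta.fai \
             Homo_sapiens_assembly38.dict; do
        [ -e "$dir/$f" ] || { echo "missing: $dir/$f"; status=1; }
    done
    # The five BWA index files with the prefix gatk_hg38.
    for ext in amb ann bwt pac sa; do
        f="$dir/bwa_index/gatk_hg38.$ext"
        [ -e "$f" ] || { echo "missing: $f"; status=1; }
    done
    return "$status"
}

# Example: check_bundle /mnt/f/kelly/bioTree/server/wesproject/hg38
```

The check only tests for existence, not integrity; comparing file sizes against the listing is a reasonable next step if a transfer looks suspicious.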