前往小程序,Get更优阅读体验!
立即前往
首页
学习
活动
专区
工具
TVP
发布
社区首页 >专栏 >根据vcf文件计算群体间Fst;构建进化树;网络图;PCA

根据vcf文件计算群体间Fst;构建进化树;网络图;PCA

作者头像
用户7010445
发布2020-03-03 15:00:11
5.9K1
发布2020-03-03 15:00:11
举报

Fst:群体间固定系数(Fixation index),用来衡量种群分化程度,取值从0到1,为0则认为两个种群间是随机交配的,基因型完全相似;为1则表示是完全隔离的,完全不相似。其是一种以哈迪温伯格定律为前提的种群遗传学统计方法。

  • Fst详解(具体计算步骤)
  • 使用vcftools或者gcta计算群体间固定指数(Fixation index,FST)

本文使用的示例文件是 文献笔记四十五:基于全基因组重测序技术的中国猕猴桃溃疡病菌遗传多样性分析 文章中提到的vcf 文件

使用R语言的```hierfstat```包计算
代码语言:javascript
复制
library(vcfR)
library(adegenet)
library(hierfstat)
kiwipang<-read.vcfR("KiwifruitPathogenFiltered.recode.vcf")
df<-vcfR2genind(kiwipang)
ploidy(df)<-1
pop.Kiwipang<-read.table("KiwifruitPathogen.indv",sep="\t",header=F)
all(colnames(kiwipang@gt)[-1] == pop.Kiwipang$V1)
pop(df)<-pop.Kiwipang$V3
df$pop
mydf<-genind2hierfstat(df)
basic.stats(mydf)
popFst1<-genet.dist(mydf,diploid=F,method = "Ds")
popFst<-as.matrix(popFst1)
colnames(popFst)<-c("Chongqing","Guizhou","Hunan",
                    "Hubei","Henan","Sichuan","Shaanxi")
rownames(popFst)<-c("Chongqing","Guizhou","Hunan",
                    "Hubei","Henan","Sichuan","Shaanxi")
pheatmap::pheatmap(popFst,cluster_rows = F,cluster_cols = F)

image.png

基于距离的进化树
代码语言:javascript
复制
gl.rubi<-vcfR2genlight(kiwipang)
ploidy(gl.rubi)<-1
pop(gl.rubi)<-pop.Kiwipang$V3
library(poppr)
library(RColorBrewer)
library(ape)
tree<-poppr::aboot(gl.rubi,tree="upgma",distance=bitwise.dist, 
            sample=100,showtree=F,cutoff = 50,quiet=T)
cols <- brewer.pal(n = nPop(gl.rubi), name = "Dark2")
plot.phylo(tree, cex = 0.8, font = 2, adj = 0, 
           tip.color =  cols[pop(gl.rubi)])
nodelabels(tree$node.label, adj = c(1.3, -0.5), 
           frame = "n", cex = 0.8,font = 3, xpd = TRUE)
legend('topleft', legend = c("Chongqing","Guizhou","Hunan",
                             "Hubei","Henan","Sichuan","Shaanxi"),
       fill = cols, border = FALSE, bty = "n", cex = 0.7)
axis(side = 1)
title(xlab = "Genetic distance (proportion of loci that are different)")

image.png

基于距离的网络图
代码语言:javascript
复制
library(igraph)
rubi.dist <- bitwise.dist(gl.rubi)
rubi.msn <- poppr.msn(gl.rubi, rubi.dist, showplot = FALSE, include.ties = T)
node.size <- rep(2, times = nInd(gl.rubi))
names(node.size) <- indNames(gl.rubi)
vertex.attributes(rubi.msn$graph)$size <- node.size
plot_poppr_msn(gl.rubi, rubi.msn , palette = brewer.pal(n = nPop(gl.rubi), name = "Dark2"), gadj = 70)

image.png

主成分分析

代码语言:javascript
复制
rubi.pca <- glPca(gl.rubi, nf = 3)
barplot(100*rubi.pca$eig/sum(rubi.pca$eig), col = heat.colors(50), main="PCA Eigenvalues")
title(ylab="Percent of variance\nexplained", line = 2)
title(xlab="Eigenvalues", line = 1)
rubi.pca.scores <- as.data.frame(rubi.pca$scores)
rubi.pca.scores$pop <- pop(gl.rubi)
library(ggplot2)
set.seed(9)
p <- ggplot(rubi.pca.scores, aes(x=PC1, y=PC2, colour=pop)) 
p <- p + geom_point(size=2)
p <- p + stat_ellipse(level = 0.95, size = 1)
p <- p + scale_color_manual(values = cols) 
p <- p + geom_hline(yintercept = 0) 
p <- p + geom_vline(xintercept = 0) 
p <- p + theme_bw()
p

image.png

参考文章

  • https://grunwaldlab.github.io/Population_Genetics_in_R/gbs_analysis.html
  • https://popgen.nescent.org/DifferentiationSNP.html
  • https://popgen.nescent.org/StartSNP.html
本文参与 腾讯云自媒体分享计划,分享自微信公众号。
原始发表:2019-11-27,如有侵权请联系 cloudcommunity@tencent.com 删除

本文分享自 小明的数据分析笔记本 微信公众号,前往查看

如有侵权,请联系 cloudcommunity@tencent.com 删除。

本文参与 腾讯云自媒体分享计划  ,欢迎热爱写作的你一起参与!

评论
登录后参与评论
0 条评论
热度
最新
推荐阅读
目录
  • 主成分分析
  • 参考文章
领券
问题归档专栏文章快讯文章归档关键词归档开发者手册归档开发者手册 Section 归档