Literature Reading: Comparing ABLUP, GBLUP, and SSGBLUP on Simulated Data

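Editor's note:

How large does the reference population need to be for genomic selection? Using both real and simulated data, this paper shows that the genotyped reference population needs at least 500 animals before it pays off. In addition, multi-trait SSGBLUP outperforms single-trait SSGBLUP when the traits are highly genetically correlated. So a solid grounding in traditional quantitative genetics helps with genomic selection too.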

Paper and data downloads

paper download

data download

1 Abstract

We simulated two traits with heritabilities 0.1, 0.3, and with high genetic correlation 0.7, our results also showed that the prediction accuracies were low for GBLUP compared with other three methods with different genotyped reference population sizes and the accuracies were improved with increasing the genotyped reference population size. However, the increase was small for ssGBLUP compared with BLUP when the genotyped reference population size was <500. Our results also demonstrated that the accuracies of genomic prediction can be further improved by implementing two-trait ssGBLUP model, the maximum gain on accuracy was 2 and 2.6% for trait of chest width compared to single-trait ssGBLUP and traditional BLUP, while the gain was decreased with the weakness of genetic correlation. Two-trait ssGBLUP even performed worse than single trait analysis in the scenario of low genetic correlation.

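  • In genomic prediction, single-step methods have many advantages over multi-step methods. The study used 7 body measurement traits in Yorkshire pigs, together with simulated data, to compare the efficiency of SSGBLUP.
  • In the Yorkshire population, 592 animals were genotyped with the PorcineSNP80 chip.
  • Conventional ABLUP, GBLUP, single-trait SSGBLUP, and multi-trait SSGBLUP were compared.
  • GBLUP was less accurate than ABLUP.
  • SSGBLUP improved accuracy by 1% over conventional ABLUP. The gain is small mainly because the reference population is small (592 animals).
  • In simulations with a small number of genotyped animals, the gain of SSGBLUP over ABLUP was also small, at 0.6%. When the two traits had a high genetic correlation (0.7), two-trait SSGBLUP improved accuracy further, by up to 2.6% over ABLUP. When the genetic correlation was low, two-trait SSGBLUP had no advantage over single-trait SSGBLUP and could even do worse.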

2 Methods

2.1 GBLUP vs. Bayesian methods

Since the historic work of Meuwissen et al. (2001), combining genome data with corresponding statistical models has been successfully applied to genome selection. The key issue of genomic selection is to predict individual genomic breeding values (GEBV) using genome-wide marker information. Many statistical methods have been developed to predict GEBV, which are basically different in the assumption of distribution of SNP effects. The linear BLUP models (at either the SNP level or the individual animal level) assume that effects of all SNP are normally distributed with same variance (Meuwissen et al., 2001; VanRaden, 2008). On the other hand, the Bayesian Alphabet methods (e.g., BayesA, BayesB, and BayesCpi) (Meuwissen et al., 2001; Habier et al., 2011) allow each SNP effect to have its own variance. Many studies have reported that Bayesian methods performed similar to genomic BLUP (GBLUP) model in real data (Hayes et al., 2009a) and GBLUP is also simpler and lower computation-demanding than the Bayesian Alphabet methods.

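  • The key step in genomic selection is predicting genomic breeding values (GEBV), and many statistical models can do this.
  • BLUP methods (GBLUP, SSGBLUP) assume that all marker effects share the same variance.
  • Bayesian methods (BayesA, BayesB, BayesCpi, and so on) allow each SNP effect to have its own variance.
  • Studies report that Bayesian methods perform about as well as GBLUP on real data, while GBLUP is simpler and computationally cheaper.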
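The equal-variance assumption is what allows marker-level BLUP to be written at the animal level through a genomic relationship matrix G. The sketch below shows one common construction (VanRaden, 2008, first method). The function name `vanraden_g`, the 0/1/2 genotype coding, and the random toy genotypes are illustrative assumptions, not the paper's actual pipeline.

```python
import numpy as np

def vanraden_g(genotypes):
    """Genomic relationship matrix, VanRaden (2008) method 1.

    genotypes: (n_animals, n_snps) array coded 0/1/2 (copies of one allele).
    """
    M = np.asarray(genotypes, dtype=float)
    p = M.mean(axis=0) / 2.0             # allele frequency per SNP, estimated from the sample
    Z = M - 2.0 * p                      # center each SNP column at twice its allele frequency
    scale = 2.0 * np.sum(p * (1.0 - p))  # expected variance summed over SNPs
    return Z @ Z.T / scale

# Toy genotypes (random, NOT data from the paper): 5 animals, 200 SNPs.
rng = np.random.default_rng(0)
toy_genotypes = rng.integers(0, 3, size=(5, 200))
G = vanraden_g(toy_genotypes)
```

Because allele frequencies are estimated from the same animals, each row of this G sums to zero, so G is singular; in practice it is usually blended with pedigree relationships before inversion.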

2.2 Single-step and multi-trait single-step methods

Generally, genomic prediction utilizes information of genotyped animals. In practice, however, only a subset of individuals can be genotyped. Furthermore, in order to make use of phenotype information of non-genotyped individuals, a single-step GBLUP (ssGBLUP) has been developed by constructing H matrix using marker genotypes and pedigree jointly instead of G matrix or pedigree-based relationship matrix alone (Legarra et al., 2009; Christensen and Lund, 2010). Field data of cattle, pigs and chickens indicated that single-step method leads to higher accuracy and much simpler than multi-step genomic selection methods (Aguilar et al., 2011; Chen et al., 2011; Forni et al., 2011; Christensen et al., 2012; Simeone et al., 2012; Li et al., 2014; Song et al., 2017).

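  • Genotyping every animal is too expensive. Instead, a subset of animals is genotyped, and pedigree and genotypes are combined into an H matrix for single-step evaluation (SSGBLUP), which is more practical.
  • Field data from cattle, pigs, and chickens show that the single-step method is more accurate and much simpler than multi-step genomic selection methods.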
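In the single-step literature cited above, the inverse of H is usually written as the pedigree inverse plus a correction on the genotyped block: H^-1 = A^-1 + [[0, 0], [0, G^-1 - A22^-1]], where A22 is the pedigree relationship among genotyped animals. The sketch below applies that formula to a toy pedigree; the function name `h_inverse`, the four-animal pedigree, and the G values are hypothetical.

```python
import numpy as np

def h_inverse(A, G, genotyped_idx):
    """Inverse of the single-step H matrix.

    A: pedigree relationship matrix for all animals.
    G: genomic relationship matrix for the genotyped animals (must be invertible).
    genotyped_idx: positions of the genotyped animals within A.
    """
    A = np.asarray(A, dtype=float)
    idx = np.asarray(genotyped_idx)
    A22 = A[np.ix_(idx, idx)]                  # pedigree block of genotyped animals
    H_inv = np.linalg.inv(A)
    H_inv[np.ix_(idx, idx)] += np.linalg.inv(G) - np.linalg.inv(A22)
    return H_inv

# Toy pedigree: animals 0 and 1 are unrelated parents, 2 and 3 are their full sibs.
A_toy = np.array([[1.0, 0.0, 0.5, 0.5],
                  [0.0, 1.0, 0.5, 0.5],
                  [0.5, 0.5, 1.0, 0.5],
                  [0.5, 0.5, 0.5, 1.0]])
G_toy = np.array([[1.02, 0.46],
                  [0.46, 0.98]])               # made-up genomic relationships of animals 2 and 3
H_inv = h_inverse(A_toy, G_toy, [2, 3])
```

If G equals A22, the correction vanishes and H^-1 reduces to A^-1, which is a quick sanity check on any implementation.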

Genomic selection usually handles a single trait only. However, many traits are genetically correlated. As in traditional genetic evaluation, a multi-trait model is expected to increase the accuracy of the GEBV by making use of information from genetically correlated traits which will be more profound for traits with low heritability or with a small number of phenotypic records (Jia and Jannink, 2012; Guo et al., 2014). Many studies report multi-trait model for genetically correlated traits could lead to more accurate predictions than single trait genomic prediction (Calus and Veerkamp, 2011; Jia and Jannink, 2012; Guo et al., 2014; Wang et al., 2017).

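2.3 When multi-trait SSGBLUP applies

  • The traits are genetically correlated, especially when heritability is low.
  • Phenotypic records are scarce, in which case a multi-trait model does better.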

2.4 ABLUP

  • Fixed effects: barn + herd-year-season + sex
  • Random effect: additive genetic effect
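An animal model with these effects, y = Xb + Za + e with a ~ N(0, A * sigma_a^2), is typically solved through Henderson's mixed model equations. The sketch below is a toy illustration only: it keeps a single fixed effect (sex), assumes a known heritability of 0.3, and uses made-up records. The function name `solve_mme` is hypothetical.

```python
import numpy as np

def solve_mme(y, X, Z, A, h2):
    """Solve Henderson's mixed model equations for an animal model.

    Returns (fixed-effect estimates, predicted breeding values).
    """
    lam = (1.0 - h2) / h2                      # variance ratio sigma_e^2 / sigma_a^2
    A_inv = np.linalg.inv(A)
    lhs = np.block([[X.T @ X, X.T @ Z],
                    [Z.T @ X, Z.T @ Z + lam * A_inv]])
    rhs = np.concatenate([X.T @ y, Z.T @ y])
    sol = np.linalg.solve(lhs, rhs)
    n_fixed = X.shape[1]
    return sol[:n_fixed], sol[n_fixed:]

# Toy data: two unrelated parents (animals 0, 1) and two full sibs (animals 2, 3).
A = np.array([[1.0, 0.0, 0.5, 0.5],
              [0.0, 1.0, 0.5, 0.5],
              [0.5, 0.5, 1.0, 0.5],
              [0.5, 0.5, 0.5, 1.0]])
X = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 0.0],
              [0.0, 1.0]])                     # sex as the only fixed effect (assumption)
Z = np.eye(4)                                  # one record per animal
y = np.array([10.2, 9.1, 11.0, 9.6])           # made-up phenotypes
b_hat, a_hat = solve_mme(y, X, Z, A, h2=0.3)
```

Replacing A with G or H in the same equations gives the GBLUP and SSGBLUP counterparts of this model.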

2.5 GBLUP

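2.6 Single-trait SSGBLUP

Adjusting the H matrix

  • For some traits G cannot explain all of the genetic variation, so G is set to explain 95% of the variation and the pedigree the remaining 5%.
  • alpha and beta are computed from two equations that match the diagonal and off-diagonal elements of G to those of A22.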

2.7 Two-trait SSGBLUP

The model is similar to the conventional multi-trait ABLUP model.
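In a standard multi-trait animal model, the breeding values of both traits are modelled jointly with covariance G0 ⊗ K, where G0 is the 2 x 2 genetic covariance matrix between traits and K is the relationship matrix (A for ABLUP, H for SSGBLUP). The sketch below builds that structure from the simulated heritabilities (0.1 and 0.3) and the high genetic correlation (0.7) quoted in the abstract; a phenotypic variance of 1 for both traits, the toy H, and the function name are assumptions.

```python
import numpy as np

def two_trait_covariance(h2_a, h2_b, r_g, H):
    """Genetic covariance of stacked breeding values [trait A; trait B].

    Assumes a phenotypic variance of 1 for both traits, so the additive
    variances equal the heritabilities.
    """
    cov_ab = r_g * np.sqrt(h2_a * h2_b)        # genetic covariance between traits
    G0 = np.array([[h2_a, cov_ab],
                   [cov_ab, h2_b]])
    return G0, np.kron(G0, H)                  # trait-major ordering of animals

H_toy = np.array([[1.0, 0.5],
                  [0.5, 1.0]])                 # made-up relationships for two animals
G0, V_a = two_trait_covariance(0.1, 0.3, 0.7, H_toy)
```

The off-diagonal block of `V_a` is what lets records on one trait inform breeding values for the other; as the genetic correlation approaches zero that block vanishes and the model gains nothing over two single-trait analyses.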

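3 Conclusions

3.1 Heritability, genetic correlation, and phenotypic correlation

  • Diagonal: heritability
  • Upper triangle: genetic correlation
  • Lower triangle: phenotypic correlation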
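A table with this layout can be assembled from a genetic and a phenotypic covariance matrix. The sketch below is illustrative: the function name `summary_table` is hypothetical, and the example covariances are made up (they reuse the simulated heritabilities 0.1 and 0.3 and genetic correlation 0.7, with an assumed phenotypic correlation of 0.25).

```python
import numpy as np

def summary_table(Ga, P):
    """Heritability on the diagonal, genetic correlations above it,
    phenotypic correlations below it."""
    Ga = np.asarray(Ga, dtype=float)
    P = np.asarray(P, dtype=float)
    h2 = np.diag(Ga) / np.diag(P)
    sd_g = np.sqrt(np.diag(Ga))
    sd_p = np.sqrt(np.diag(P))
    r_g = Ga / np.outer(sd_g, sd_g)            # genetic correlation matrix
    r_p = P / np.outer(sd_p, sd_p)             # phenotypic correlation matrix
    return np.triu(r_g, k=1) + np.tril(r_p, k=-1) + np.diag(h2)

cov_g = 0.7 * np.sqrt(0.1 * 0.3)
Ga_toy = np.array([[0.1, cov_g],
                   [cov_g, 0.3]])
P_toy = np.array([[1.0, 0.25],
                  [0.25, 1.0]])                # assumed phenotypic covariances
table = summary_table(Ga_toy, P_toy)
```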

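3.2 ABLUP vs. GBLUP vs. SSGBLUP

This part compares accuracy and reliability across methods and traits. GBLUP is less accurate than ABLUP, and the gain from SSGBLUP is very limited, mainly because the reference population is too small.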

3.3 Single-trait SSGBLUP vs. multi-trait SSGBLUP

Compared to single-trait model, as shown in Table 4, the accuracies of genomic prediction for CW from two-trait ssGBLUP were increased from 0.684 (Table 3) to 0.703 and 0.697 in the scenarios of high and medium correlations but slightly decreased to 0.676 in low genetic correlation. The gain on accuracy was 2% in situation with high genetic correlation, 1.3% in medium genetic correlation.

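  • For traits with high or medium genetic correlation, two-trait analysis improves accuracy.
  • For traits with low genetic correlation, two-trait analysis brings no clear improvement, and accuracy drops slightly.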
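The gains quoted above are absolute differences in accuracy expressed in percentage points, not relative improvements. A short check using only the accuracies reported in the quoted passage for CW:

```python
# Accuracies for CW quoted from the paper (Tables 3 and 4).
single_trait = 0.684
two_trait = {"high": 0.703, "medium": 0.697, "low": 0.676}

# Gain of two-trait over single-trait ssGBLUP, in percentage points.
gains = {scenario: round((acc - single_trait) * 100, 1)
         for scenario, acc in two_trait.items()}
print(gains)
```

This gives about 1.9 points for high correlation (reported as 2%), 1.3 for medium, and a loss of 0.8 for low correlation.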

3.4 Effect of reference population size on accuracy

As shown in Figure 2, the same validation population was predicted through different genotyped reference population sizes using BLUP, GBLUP, ssGBLUP, and two-trait ssGBLUP methods. The accuracies of prediction were low for GBLUP compared with other three methods in all scenarios, while the accuracy of genomic prediction from GBLUP was rapidly increased with increasing the reference population size, especially when the reference population size was enlarged over 500. The accuracy of traditional BLUP with the same reference and validation population as ssGBLUP was also shown in Figure 2. Generally, ssGBLUP provided higher accuracies of predictions than traditional BLUP in different genotyped reference population sizes for Trait A and Trait B, however, the increase was tiny especially in Trait A with low heritability of 0.1 when the genotyped reference population size was below 500, this was consistent with the results of real pig data in this study. In all scenarios, two-trait ssGBLUP produced the highest accuracy for Trait A and Trait B with high genetic correlation of 0.7, but the scope of improvement was low for Trait B with heritability of 0.3.

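  • In the simulated data, GBLUP was less accurate than ABLUP and SSGBLUP, but its accuracy rose rapidly as the reference population grew, especially beyond 500 animals.
  • Overall, the accuracy ranking was two-trait SSGBLUP > single-trait SSGBLUP > ABLUP.
  • With fewer than 500 genotyped reference animals, SSGBLUP improved little over ABLUP, which matches the real-data results.
  • Across all simulated reference sizes, two-trait SSGBLUP beat single-trait SSGBLUP when the genetic correlation was high (0.7), and two-trait SSGBLUP did better for highly correlated traits than for weakly correlated ones.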

Originally published on the WeChat official account 育种数据分析之放飞自我 (R-breeding)

Original publication date: 2019-04-24
