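For genomic selection, how large does the reference population need to be? This paper uses real and simulated data to show that the genotyped reference population needs at least 500 individuals before genomic selection pays off. In addition, multi-trait ssGBLUP outperforms single-trait ssGBLUP when the traits are strongly genetically correlated. So a solid grounding in traditional quantitative genetics also helps with genomic selection.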
We simulated two traits with heritabilities of 0.1 and 0.3 and a high genetic correlation of 0.7. Our results also showed that prediction accuracies were lower for GBLUP than for the other three methods at every genotyped reference population size, and that accuracies improved as the genotyped reference population grew. However, the gain of ssGBLUP over BLUP was small when the genotyped reference population size was <500. Our results also demonstrated that the accuracy of genomic prediction can be further improved by a two-trait ssGBLUP model: for chest width, the maximum gain in accuracy was 2% over single-trait ssGBLUP and 2.6% over traditional BLUP, and the gain shrank as the genetic correlation weakened. Two-trait ssGBLUP even performed worse than single-trait analysis in the scenario of low genetic correlation.
Since the historic work of Meuwissen et al. (2001), combining genome data with corresponding statistical models has been successfully applied to genomic selection. The key issue in genomic selection is to predict individual genomic breeding values (GEBV) using genome-wide marker information. Many statistical methods have been developed to predict GEBV, and they differ mainly in the distribution they assume for SNP effects. The linear BLUP models (at either the SNP level or the individual animal level) assume that the effects of all SNPs are normally distributed with the same variance (Meuwissen et al., 2001; VanRaden, 2008). On the other hand, the Bayesian Alphabet methods (e.g., BayesA, BayesB, and BayesCpi) (Meuwissen et al., 2001; Habier et al., 2011) allow each SNP effect to have its own variance. Many studies have reported that Bayesian methods performed similarly to the genomic BLUP (GBLUP) model in real data (Hayes et al., 2009a), and GBLUP is also simpler and less computationally demanding than the Bayesian Alphabet methods.
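To make the GBLUP assumption concrete, here is a minimal sketch of a genomic relationship matrix in the style of VanRaden (2008), where every SNP gets the same weight. The function name `vanraden_g` and the toy genotypes are illustrative assumptions, and allele frequencies are estimated from the genotyped animals themselves rather than from a base population.

```python
import numpy as np

def vanraden_g(genotypes):
    """Genomic relationship matrix from 0/1/2 allele counts.

    genotypes: array of shape (n_animals, n_snps).
    Assumption: allele frequencies come from the observed genotypes.
    """
    M = np.asarray(genotypes, dtype=float)
    p = M.mean(axis=0) / 2.0             # allele frequency per SNP
    Z = M - 2.0 * p                      # centre each SNP column
    scale = 2.0 * np.sum(p * (1.0 - p))  # puts G on the scale of A
    return Z @ Z.T / scale

# Toy data (made up for illustration): 3 animals, 4 SNPs
geno = np.array([
    [0, 1, 2, 1],
    [1, 1, 0, 2],
    [2, 0, 1, 1],
])
G = vanraden_g(geno)
print(np.round(G, 3))
```

Because every SNP column is centred with the sample frequencies, each row of G sums to zero here, which also means G is singular. That is one reason G is usually blended with a pedigree matrix before it is inverted.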
Generally, genomic prediction utilizes information from genotyped animals. In practice, however, only a subset of individuals can be genotyped. To make use of the phenotypes of non-genotyped individuals as well, single-step GBLUP (ssGBLUP) was developed. It builds an H matrix from marker genotypes and pedigree jointly, instead of using the G matrix or the pedigree-based relationship matrix alone (Legarra et al., 2009; Christensen and Lund, 2010). Field data from cattle, pigs and chickens indicated that the single-step method gives higher accuracy and is much simpler than multi-step genomic selection methods (Aguilar et al., 2011; Chen et al., 2011; Forni et al., 2011; Christensen et al., 2012; Simeone et al., 2012; Li et al., 2014; Song et al., 2017).
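As a rough illustration of how pedigree and genotypes are combined, the sketch below builds the inverse of H in its commonly used form: the inverse of A, plus the difference between the inverses of G and A22 added to the block for genotyped animals (A22 is the pedigree relationship among genotyped animals). The function name `h_inverse`, the three-animal pedigree and the G values are assumptions made up for the example, and G is assumed to be invertible already.

```python
import numpy as np

def h_inverse(A, G, genotyped):
    """Inverse of the single-step H matrix.

    A: pedigree relationship matrix for all animals.
    G: genomic relationship matrix, rows/columns ordered as `genotyped`.
    genotyped: positions of the genotyped animals within A.
    """
    A = np.asarray(A, dtype=float)
    G = np.asarray(G, dtype=float)
    idx = np.asarray(genotyped)
    block = np.ix_(idx, idx)
    A22 = A[block]                       # pedigree block of genotyped animals
    H_inv = np.linalg.inv(A)
    H_inv[block] += np.linalg.inv(G) - np.linalg.inv(A22)
    return H_inv

# Toy pedigree: animals 0 and 1 are unrelated parents of animal 2
A = np.array([
    [1.0, 0.0, 0.5],
    [0.0, 1.0, 0.5],
    [0.5, 0.5, 1.0],
])
idx = [1, 2]                             # animals 1 and 2 are genotyped
G = np.array([
    [1.00, 0.45],
    [0.45, 1.05],
])                                       # made-up genomic relationships
H_inv = h_inverse(A, G, idx)
H = np.linalg.inv(H_inv)
print(np.round(H, 3))
```

In the resulting H, the block for the genotyped animals equals G, while relationships involving the non-genotyped animal are adjusted through the pedigree. If G equals A22, H reduces to A and ssGBLUP gives the same result as pedigree BLUP.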
Genomic selection usually handles a single trait only. However, many traits are genetically correlated. As in traditional genetic evaluation, a multi-trait model is expected to increase the accuracy of the GEBV by using information from genetically correlated traits, and the benefit should be more pronounced for traits with low heritability or with few phenotypic records (Jia and Jannink, 2012; Guo et al., 2014). Many studies report that a multi-trait model for genetically correlated traits can give more accurate predictions than single-trait genomic prediction (Calus and Veerkamp, 2011; Jia and Jannink, 2012; Guo et al., 2014; Wang et al., 2017).
Adjusting the H matrix
The model is similar to the conventional multi-trait ABLUP model.
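For reference, a generic two-trait animal model can be written as below. This is the textbook form and not necessarily the paper's exact notation; the symbols $G_0$ (the 2 × 2 genetic covariance matrix between traits) and $R_0$ (the 2 × 2 residual covariance matrix) are assumptions introduced here, and the residual structure assumes every animal is recorded for both traits.

```latex
\begin{bmatrix} y_1 \\ y_2 \end{bmatrix}
=
\begin{bmatrix} X_1 & 0 \\ 0 & X_2 \end{bmatrix}
\begin{bmatrix} b_1 \\ b_2 \end{bmatrix}
+
\begin{bmatrix} Z_1 & 0 \\ 0 & Z_2 \end{bmatrix}
\begin{bmatrix} a_1 \\ a_2 \end{bmatrix}
+
\begin{bmatrix} e_1 \\ e_2 \end{bmatrix},
\qquad
\operatorname{Var}\begin{bmatrix} a_1 \\ a_2 \end{bmatrix} = G_0 \otimes H,
\qquad
\operatorname{Var}\begin{bmatrix} e_1 \\ e_2 \end{bmatrix} = R_0 \otimes I
```

Replacing $H$ with the pedigree matrix $A$ gives the multi-trait ABLUP model, so the only change in two-trait ssGBLUP is the relationship matrix.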
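Comparison of accuracy and reliability across methods and traits. GBLUP is less accurate than ABLUP, and the gain from ssGBLUP is also very limited. The main reason is that the reference population is too small, so accuracy improves only slightly.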
As shown in Table 4, compared with the single-trait model, the accuracy of genomic prediction for CW from two-trait ssGBLUP increased from 0.684 (Table 3) to 0.703 and 0.697 in the scenarios of high and medium genetic correlation, but decreased slightly to 0.676 with low genetic correlation. The gain in accuracy was 2% with high genetic correlation and 1.3% with medium genetic correlation.
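The reported gains are absolute differences in accuracy (percentage points), not relative improvements. A short check using only the accuracies quoted above makes this explicit; the function name `accuracy_gains` is an illustrative assumption.

```python
def accuracy_gains(single_trait, two_trait):
    """Return (absolute gain in points, relative gain in %) per scenario."""
    gains = {}
    for scenario, acc in two_trait.items():
        absolute = (acc - single_trait) * 100      # percentage points
        relative = (acc / single_trait - 1) * 100  # % of single-trait accuracy
        gains[scenario] = (absolute, relative)
    return gains

single_trait = 0.684                               # CW, single-trait ssGBLUP
two_trait = {"high": 0.703, "medium": 0.697, "low": 0.676}

gains = accuracy_gains(single_trait, two_trait)
for scenario, (absolute, relative) in gains.items():
    print(f"{scenario:>6}: {absolute:+.1f} points, {relative:+.1f}% relative")
```

The absolute differences are 1.9 and 1.3 points for high and medium correlation, which matches the reported 2% and 1.3% after rounding. With low correlation the accuracy drops by 0.8 points.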
As shown in Figure 2, the same validation population was predicted with different genotyped reference population sizes using BLUP, GBLUP, ssGBLUP, and two-trait ssGBLUP. GBLUP gave lower prediction accuracies than the other three methods in all scenarios, although its accuracy rose rapidly as the reference population grew, especially once the reference population exceeded 500. Figure 2 also shows the accuracy of traditional BLUP with the same reference and validation populations as ssGBLUP. In general, ssGBLUP was more accurate than traditional BLUP at every genotyped reference population size for Trait A and Trait B. However, the gain was tiny when the genotyped reference population was below 500, especially for Trait A with its low heritability of 0.1, which is consistent with the results from the real pig data in this study. In all scenarios, two-trait ssGBLUP produced the highest accuracy for Trait A and Trait B at the high genetic correlation of 0.7, but the improvement was small for Trait B with heritability of 0.3.