基因组选择中,无论是GBLUP还是HBLUP,asreml都是一个很好的工具,功能强大,速度快,支持多性状模型。asremlw和asremlr都不能构建G逆矩阵或者H逆矩阵,幸运的是外界有很多软件可以构建,比如synbreed
,blupf90
,sommer
等,我也写了几个可以构建H矩阵和H逆矩阵的函数(链接),这样就可以引入外界构建好的逆矩阵,使用asreml进行基因组选择和育种值计算。
asreml4r上线后,增加了好几个功能,比如支持基因组大数据的分析,内存管理更优,多性状模型进行了进一步的优化。但是语法也变化了不少,让人很不习惯,这里记录一下其调用外部函数的异同点。希望对后来者有所帮助。如果有什么问题,邮件联系:dengfei_2013@163.com
要点
attr(hinv,"rowNames")
, 添加rowNames属性c("Row" ,"Column","Ainverse")
示例数据
names(hinv) = c("Row" ,"Column","Ainverse")
head(hinv)
attr(hinv,"rowNames") = as.character(idd$V1)
m2 = asreml(y1 ~ Sex + G + Pici + Dongshe + Year, random = ~ giv(ID), ginverse = list(ID=hinv), data=dat,workspace = 1e9)
summary(m2)$varcomp
示例结果
> head(hinv)
Row Column Ainverse
1: 1 1 8.333334
2: 2 2 15.333334
3: 3 3 9.666667
4: 4 4 13.333334
5: 5 5 14.666667
6: 6 6 13.333334
> m2 = asreml(y1 ~ Sex + G + Pici + Dongshe + Year, random = ~ giv(ID), ginverse = list(ID=hinv), data=dat,workspace = 1e9)
ASReml: Fri Jul 12 13:25:09 2019
LogLik S2 DF wall cpu
-30563.7230 192.7789 9857 13:26:11 58.9
-30561.8618 190.6048 9857 13:27:00 49.4
-30560.1468 187.5336 9857 13:27:52 51.8
-30559.3369 184.1737 9857 13:28:43 50.8
-30559.2888 183.2905 9857 13:29:33 50.8
-30559.2869 183.1108 9857 13:30:23 49.7
-30559.2869 183.0768 9857 13:31:14 51.2
-30559.2869 183.0705 9857 13:32:05 51.1
LogLikelihood Converged
> summary(m2)$varcomp
gamma component std.error z.ratio constraint
giv(ID).giv 0.1753523 32.10183 4.414356 7.272144 Positive
R!variance 1.0000000 183.07051 4.111634 44.525004 Positive
现在asreml的lic都是4版了,语法有了变化。
要点
attr(hinv,"rowNames")
, 添加rowNames属性attr(hinv,"INVERSE") = TRUE
hinv = as.matrix(hinv)
和asreml3r比较:
示例代码
hinv = as.matrix(hinv)
attr(hinv,"rowNames") = as.character(idd$V1)
attr(hinv,"INVERSE") = TRUE
head(hinv)
m2 = asreml(y1 ~ Sex + G + Pici + Dongshe + Year, random = ~ vm(ID,hinv),workspace=2e9, data=dat)
summary(m2)$varcomp
示例结果
> hinv = as.matrix(hinv)
> attr(hinv,"rowNames") = as.character(idd$V1)
> attr(hinv,"INVERSE") = TRUE
> head(hinv)
V1 V2 V3
[1,] 1 1 8.333334
[2,] 2 2 15.333334
[3,] 3 3 9.666667
[4,] 4 4 13.333334
[5,] 5 5 14.666667
[6,] 6 6 13.333334
> m2 = asreml(y1 ~ Sex + G + Pici + Dongshe + Year, random = ~ vm(ID,hinv),workspace=2e9, data=dat)
Model fitted using the gamma parameterization.
Allocating main workspace...done!
LogLik Sigma2 DF wall cpu
1 -31276.55 192.779 9857 13:46:16 130.2
2 -31274.69 190.606 9857 13:46:59 43.0
3 -31272.98 187.536 9857 13:47:40 41.2
4 -31272.27 184.996 9857 13:48:22 41.3
5 -31272.12 183.477 9857 13:49:04 41.6
6 -31272.12 183.147 9857 13:49:35 31.9
7 -31272.12 183.084 9857 13:50:07 31.5
Terms with zero df listed in attribute 'zerodf' of the wald table.
> summary(m2)$varcomp
component std.error z.ratio bound %ch
vm(ID, hinv) 32.10035 4.412698 7.274541 P 0.1
units!R 183.08358 4.111247 44.532370 P 0.0
相比较而言,blupf90
花费了9分钟,方差组分估算结果一致,结果如下:
Final Estimates
Genetic variance(s) for effect 5
32.102
Residual variance(s)
183.07
inverse of AI matrix (Sampling Variance)
19.488 -12.766
-12.766 16.906
Correlations from inverse of AI matrix
1.0000 -0.70330
-0.70330 1.0000
SE for G
4.4145
SE for R
4.1117
* FINISHED (AIREMLF90): 07-14-2019 14h 08m 11s 456
real 9m19.279s
user 73m40.613s
sys 2m13.071s
愉快的拥抱新版本吧。