I am using the data.table package with the following data frame, which was generated by reproduce(df):
    outRes vars ts_length   BIAS
1       1t   sd         0 -0.046
2       1t   sd         3 -0.105
3       1t   sd         6 -0.249
4       1t   sd         1 -0.024
5       1t   sd         1  1.246
6       1t   sd         6  0.885
7       1t   sd         1  0.280
46     day   sd         0 -0.061
47     day   sd         3 -0.119
48     day   sd         6 -0.256
49     day   sd         1 -0.039
50     day   sd         1  1.239
51     day   sd         6  0.888
52     day   sd         1  0.253
268  month   LE         1 -0.085
269  month   LE         3 -0.147
270  month   LE         6 -0.305
df <- structure(list(outRes = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L,3L, 3L, 3L),
.Label = c("1t", "day", "month"), class = "factor"),
vars = structure(c(3L, 3L, 3L, 3L, 3L, 3L, 3L, 2L, 2L, 2L), .Label = c("H","LE", "sd", "sm2", "Ts2"), class = "factor"),
ts_length = structure(c(1L, 3L, 4L, 2L, 2L, 4L, 2L, 2L, 3L,4L), .Label = c("0", "1", "3", "6"), class = "factor"),
BIAS = c(-0.046,-0.105, -0.249, -0.024, 1.246, 0.885, 0.28, -0.085, -0.147,-0.305)),
.Names = c("outRes", "vars", "ts_length", "BIAS"), class = "data.frame",
row.names = c(1L, 2L, 3L, 4L, 5L, 6L,7L, 268L, 269L, 270L))
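First, I need to find the lowest value of df$BIAS (in absolute terms) within each group of df$vars and df$outRes. Using the example above, for outRes = 1t and vars = sd the smallest bias is -0.024, so I need to print ts_length = "1"; for outRes = day, the smallest BIAS = -0.039 also requires ts_length = "1". Using the data.table package, I can output the value of BIAS: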
dt = as.data.table(df)
dt[,min(abs(BIAS)),by="vars,outRes"]
This gives me the output
   vars outRes    V1
1:   sd     1t 0.024
2:  sm2     1t 2.615
3:  Ts2     1t 0.000
4:    H     1t 0.735
5:   LE     1t 0.018
6:   sd    day 0.039
7:  sm2    day 2.661 etc...
What I would also like to do is get the ts_length corresponding to the V1 column. I tried
setkey(dt,outRes,vars,BIAS)
dt[J(dt[,min(abs(BIAS)),by="outRes,vars"])][
  V1 == BIAS, list(ID,ts_length,BIAS,outRes,vars)]
but 2 of the 5 levels of $vars disappear, giving the following result:
   ts_length  BIAS outRes vars
1:         3 0.018     1t   LE
2:         0 2.615     1t  sm2
3:         6 0.000     1t  Ts2
4:         0 0.005    day   LE
5:         0 2.661    day  sm2
I am new to data.table and I admit I do not understand the code itself very well, so I also tried
setkey(dt,vars,outRes,BIAS)
dt[J(dt[,min(abs(BIAS)),by="vars,outRes"])][
  V1 == BIAS, list(ts_length,BIAS,vars,outRes)]
but again I only get three levels. What is going on? How can I get all 5 levels of the factor vars instead of just 3?
Posted on 2014-01-17 15:44:25
Thanks for the reproducible example. Try the following:
setkey(dt, vars, outRes)
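dt[ CJ(levels(vars), levels(outRes))       # every vars/outRes combination
  , .SD[abs(BIAS) == min(abs(BIAS))]       # row(s) with the smallest |BIAS|
  , by = .EACHI                            # data.table >= 1.9.4: run j once per row of i
  , .SDcols=c("BIAS", "ts_length")
  ]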
    vars outRes   BIAS ts_length
 1:    H     1t     NA        NA
 2:    H    day     NA        NA
 3:    H  month     NA        NA
 4:   LE     1t     NA        NA
 5:   LE    day     NA        NA
 6:   LE  month -0.085         1
 7:   sd     1t -0.024         1
 8:   sd    day     NA        NA
 9:   sd  month     NA        NA
10:  sm2     1t     NA        NA
11:  sm2    day     NA        NA
12:  sm2  month     NA        NA
13:  Ts2     1t     NA        NA
14:  Ts2    day     NA        NA
15:  Ts2  month     NA        NA
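A likely reason some levels vanish in the question's attempts is that V1 holds min(abs(BIAS)), which is never negative, while BIAS keeps its sign. The filter V1 == BIAS can therefore only keep groups whose smallest-magnitude BIAS happens to be positive (or zero). The sketch below illustrates this on a small made-up subset of the question's values (not the full data set) and shows one common alternative, picking the row index of the smallest |BIAS| per group with .I and which.min. The names mins, wrong, idx and res are just illustrative.

```r
library(data.table)

# Illustrative subset only; values borrowed from the question's dput output
dt <- data.table(
  outRes    = c("1t", "1t", "1t", "month", "month", "month"),
  vars      = c("sd", "sd", "sd", "LE", "LE", "LE"),
  ts_length = c("0", "3", "1", "1", "3", "6"),
  BIAS      = c(-0.046, -0.105, -0.024, -0.085, -0.147, -0.305)
)

# Per-group minimum of |BIAS|; V1 is always >= 0
mins <- dt[, .(V1 = min(abs(BIAS))), by = .(vars, outRes)]

# Comparing the non-negative V1 with the signed BIAS drops every group
# whose smallest-magnitude BIAS is negative (here: all of them)
wrong <- dt[mins, on = .(vars, outRes)][V1 == BIAS]

# Alternative: take the row number of the smallest |BIAS| in each group,
# then subset the original table with those row numbers
idx <- dt[, .I[which.min(abs(BIAS))], by = .(vars, outRes)]$V1
res <- dt[idx]
print(res)
```

Note that which.min returns only the first minimum, so ties within a group are not reported; the .SD[abs(BIAS) == min(abs(BIAS))] form above keeps all tied rows. Unlike the CJ() approach, this variant returns only the vars/outRes combinations that actually occur in the data.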
https://stackoverflow.com/questions/21168693