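I have a matrix with n columns and m rows, and a list of f functions. Each function takes one row of the matrix and returns a single value p.
What is the best way to generate a matrix with f columns and m rows?
Currently I am doing this: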
# create a random 5x5 matrix
m <- matrix(rexp(25, rate=.1), ncol=5)
# example functions, in reality more complex but with the same signature
fs <- list(function(xs) { return(mean(xs)) }, function(xs) { return(min(xs)) } )
# create a function which takes a function and applies it to each row of m
g <- function(f) { return(apply(m, 1, f)) }
# use lapply to make a call for each function in fs
# use do.call and cbind to bind the resulting list of vectors into a matrix
do.call("cbind", lapply(fs, g))
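Clarifying edit: the code above does work, but I am wondering whether there is a more elegant way to do it.
Posted on 2018-06-27 09:00:02
This is how I adapted @patL's answer to take a list of functions: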
# create a random 5x5 matrix
m <- matrix(rexp(25, rate=.1), ncol=5)
# example functions, in reality more complex but with the same signature
fs <- list(function(xs) { return(mean(xs)) }, function(xs) { return(min(xs)) } )
# create a function which takes a function and applies it to each row of m
g <- function(f) { return(apply(m, 1, f)) }
# use sapply to make a call for each function in fs
# sapply simplifies the equal-length results to a matrix with one column per
# function; cbind then returns that matrix unchanged
cbind(sapply(fs, g))
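A variation on the same idea, as a minimal sketch: naming the functions in the list gives the result labelled columns, and `vapply` can stand in for `sapply` to check the shape of each result. The seed, the list names `mean` and `min`, and the variable `res` are assumptions made for illustration, not part of the original answer.

```r
# Hypothetical example data; the seed is only there for reproducibility
set.seed(1)
m <- matrix(rexp(25, rate = 0.1), ncol = 5)

# Naming the list elements gives the result matching column names
fs <- list(mean = mean, min = min)

# vapply fails loudly unless each function yields exactly one numeric
# value per row of m (a numeric vector of length nrow(m))
res <- vapply(fs, function(f) apply(m, 1, f), numeric(nrow(m)))
res
```

Here `res` has one row per row of `m` and one column per function, with the columns named after the list elements.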
I use the same sapply pattern to score a set of models, for example:
# models is a list of trained models and m is a matrix of input data
g <- function(model) { return(predict(model, m)) }
# produce a matrix of model scores
cbind(sapply(models, g))
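To make the model-scoring case concrete, here is a self-contained sketch. It assumes two linear models fitted on the built-in `mtcars` data; the model formulas, the list names, and the `newdata` and `scores` variables are all hypothetical. Note that `predict` for `lm` objects expects a data frame as `newdata`, so what you pass in place of `m` depends on the model class.

```r
# Hypothetical models fitted on the built-in mtcars data set
models <- list(
  wt_only = lm(mpg ~ wt, data = mtcars),
  wt_hp   = lm(mpg ~ wt + hp, data = mtcars)
)

# Hypothetical input data to score: the first rows of mtcars
newdata <- head(mtcars)

# One column of predictions per model, one row per observation
scores <- sapply(models, function(model) predict(model, newdata = newdata))
scores
```

The column names of `scores` come from the names of the `models` list, which makes it easier to tell the models apart later.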
Posted on 2018-06-26 15:18:42
Using base R, you can do this in one line:
cbind(apply(m, 1, mean), apply(m, 1, min))
# [,1] [,2]
#[1,] 13.287748 5.2172657
#[2,] 5.855862 1.8346868
#[3,] 8.077236 0.4162899
#[4,] 10.422803 1.5899831
#[5,] 10.283001 2.0444687
This is faster than the do.call approach:
microbenchmark::microbenchmark(
do.call("cbind", lapply(fs, g)),
cbind(apply(m, 1, mean), apply(m, 1, min))
)
This gives:
#Unit: microseconds
# expr min lq mean
# do.call("cbind", lapply(fs, g)) 66.077 67.210 88.75483
# cbind(apply(m, 1, mean), apply(m, 1, min)) 57.771 58.903 67.70094
# median uq max neval
# 67.965 71.741 851.446 100
# 59.658 60.036 125.735 100
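Alongside the timings, it is worth confirming that the two expressions actually build the same matrix. A minimal sketch, assuming the `m`, `fs`, and `g` definitions from the question (the seed and the names `a` and `b` are added here for illustration):

```r
# Same setup as in the question; the seed is an assumption for reproducibility
set.seed(42)
m <- matrix(rexp(25, rate = 0.1), ncol = 5)
fs <- list(mean, min)
g <- function(f) apply(m, 1, f)

# The two approaches being benchmarked
a <- do.call("cbind", lapply(fs, g))
b <- cbind(apply(m, 1, mean), apply(m, 1, min))

# Compare the two results numerically
all.equal(a, b)
```

Keep in mind that the benchmark above was run on a 5x5 matrix, so the gap may look different on larger inputs or with more expensive functions.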
Posted on 2018-06-26 17:17:04
Using the data:
set.seed(11235813)
m <- matrix(rexp(25, rate=.1), ncol=5)
fs <- c("mean", "median", "sd", "max", "min", "sum")
You can do the following:
sapply(fs, mapply, split(m, row(m)), USE.NAMES = TRUE)
It returns:
mean median sd max min sum
[1,] 9.299471 3.531394 10.436391 26.37984 1.7293010 46.49735
[2,] 8.583419 2.904223 11.714482 28.75344 0.7925614 42.91709
[3,] 6.292835 4.578894 6.058633 16.92280 1.8387221 31.46418
[4,] 10.699276 5.688477 15.161685 36.91369 0.1049507 53.49638
[5,] 9.767307 2.748114 10.767438 24.66143 1.5677153 48.83653
Note:
Compared with the two approaches proposed above, this is the slowest one.
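If you want to keep the character vector of function names but avoid the `mapply`/`split` combination, one possible alternative is a single `apply` over the rows that calls every function on each row. This is only a sketch: the helper names `funs` and `res` are made up here, and it has not been benchmarked against the approaches above.

```r
set.seed(11235813)
m <- matrix(rexp(25, rate = 0.1), ncol = 5)
fs <- c("mean", "median", "sd", "max", "min", "sum")

# Look up each function by name once and keep the names for the columns
funs <- lapply(fs, match.fun)
names(funs) <- fs

# For each row, call every function; apply returns one column per row,
# so transpose to get one row per matrix row and one column per function
res <- t(apply(m, 1, function(row) vapply(funs, function(f) f(row), numeric(1))))
res
```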
https://stackoverflow.com/questions/51036666