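I want to create a multi-dimensional rolling window in R. Below is an example of what I did in Python using the xarray library and its rolling function, which is intuitive and simple: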
import xarray as xr
import numpy as np
data = xr.DataArray(np.arange(0,18).reshape(3,3,2))
print(data)
<xarray.DataArray (dim_0: 3, dim_1: 3, dim_2: 2)>
array([[[ 0,  1],
        [ 2,  3],
        [ 4,  5]],

       [[ 6,  7],
        [ 8,  9],
        [10, 11]],

       [[12, 13],
        [14, 15],
        [16, 17]]])
# Now construct the rolling window
win_size = 3  # Window size
data_roll = data.rolling(
    dim_0=win_size, dim_1=win_size, center=True
).construct(dim_0="new_dim0", dim_1="newdim1")
print(data_roll)
<xarray.DataArray (dim_0: 3, dim_1: 3, dim_2: 2, new_dim0: 3, newdim1: 3)>
array([[[[[nan, nan, nan],
          [nan,  0.,  2.],
          [nan,  6.,  8.]],

         [[nan, nan, nan],
          [nan,  1.,  3.],
          [nan,  7.,  9.]]],


        [[[nan, nan, nan],
          [ 0.,  2.,  4.],
          [ 6.,  8., 10.]],

         [[nan, nan, nan],
          [ 1.,  3.,  5.],
          [ 7.,  9., 11.]]],


        [[[nan, nan, nan],
          [ 2.,  4., nan],
...
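Note that this function pads the array with NaN along each rolled dimension before rolling. I looked at several functions in R packages, such as rollapply. Most of these functions and the suggested approaches either apply a function over 1D data or work on 2D arrays at most. However, all I need is the actual windows. How can I get the same result in R?
Thanks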
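Answered on 2022-01-07 17:15:44
Have you tried the reticulate package? If you want something specific from Python, it lets you use anything from Python in R. You replace the periods with dollar signs; otherwise the code is nearly identical (apart from the package imports).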
library(reticulate)
np <- import("numpy", convert = FALSE)
xr <- import("xarray", convert = FALSE)
data <- xr$DataArray(np$arange(0, 18)$reshape(3L, 3L, 2L))
data
# <xarray.DataArray (dim_0: 3, dim_1: 3, dim_2: 2)>
# array([[[ 0., 1.],
# [ 2., 3.],
# [ 4., 5.]],
#
# [[ 6., 7.],
# [ 8., 9.],
# [10., 11.]],
#
# [[12., 13.],
# [14., 15.],
# [16., 17.]]])
# Dimensions without coordinates: dim_0, dim_1, dim_2
Answered on 2022-01-07 20:09:51
Okay, so I just realized how simple this is with base R (I feel a bit embarrassed!)
(data2 <- array(1:18, dim = c(3, 2, 3)))
# , , 1
#
# [,1] [,2]
# [1,] 1 4
# [2,] 2 5
# [3,] 3 6
#
# , , 2
#
# [,1] [,2]
# [1,] 7 10
# [2,] 8 11
# [3,] 9 12
#
# , , 3
#
# [,1] [,2]
# [1,] 13 16
# [2,] 14 17
# [3,] 15 18
#
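In case you didn't know, wrapping the whole expression in () makes R both create the object and print it.
Answered on 2022-01-07 22:02:44
Okay, after spending a while on this, I came up with an answer to my specific problem. The function I wrote works well for 3D arrays, and I think it can still be improved: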
rolling_win <- function(v, win) {
  # Pads a 3D array with NA along its first two dimensions and returns the
  # overlapping windows as a list.
  # Inputs:
  #   v:   the original 3D array
  #   win: the window half-width; each window spans (2 * win + 1) cells per side
  d <- dim(v)
  # Array filled with NA, with `win` extra cells on each side of dims 1 and 2
  padded_data <- array(NA_real_, c(d[1] + 2 * win, d[2] + 2 * win, d[3]))
  padded_data[(win + 1):(win + d[1]), (win + 1):(win + d[2]), ] <- v
  mylist <- vector("list", d[1] * d[2])  # one window per (i, j) position
  counter <- 1
  for (i in (win + 1):(win + d[1])) {
    for (j in (win + 1):(win + d[2])) {
      mylist[[counter]] <- padded_data[(i - win):(i + win), (j - win):(j + win), ]
      counter <- counter + 1
    }
  }
  return(mylist)
}
# Let's test it on a simple 3 x 3 x 2 array
v <- 1:18
dim(v) <- c(3, 3, 2)
win <- 1
print(v)
, , 1
[,1] [,2] [,3]
[1,] 1 4 7
[2,] 2 5 8
[3,] 3 6 9
, , 2
[,1] [,2] [,3]
[1,] 10 13 16
[2,] 11 14 17
[3,] 12 15 18
# Apply the function
windows <- rolling_win(v, win)
# Let's take a look at the first element
windows[[1]]
, , 1
[,1] [,2] [,3]
[1,] NA NA NA
[2,] NA 1 4
[3,] NA 2 5
, , 2
[,1] [,2] [,3]
[1,] NA NA NA
[2,] NA 10 13
[3,] NA 11 14
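Although this function works for my case, it is clearly inefficient. If the array is large (millions of elements), it won't work at all. NumPy in Python uses the concept of strides, which is much more efficient. I'm not familiar with R and I'm not sure whether R can use the same concept. Generalization is another issue here as well.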
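For comparison, here is a rough sketch of the stride-based approach on the NumPy side, which should reproduce the xarray windows from the question. It assumes NumPy 1.20 or later (for `sliding_window_view`), an odd window size, and non-negative axis indices. The helper name `rolling_windows` is made up for illustration and is not part of NumPy or xarray.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def rolling_windows(a, win, axes=(0, 1)):
    """Centered windows of odd size `win` along `axes`, NaN-padded at the edges.

    Hypothetical helper for illustration; assumes non-negative axis indices.
    """
    if win % 2 != 1:
        raise ValueError("this sketch assumes an odd window size")
    half = win // 2
    # Pad only the rolled axes; the remaining axes are left untouched.
    pad = [(half, half) if ax in axes else (0, 0) for ax in range(a.ndim)]
    padded = np.pad(a.astype(float), pad, constant_values=np.nan)
    # The result is a read-only strided view of `padded`: the window
    # dimensions are appended at the end and no window data is copied.
    return sliding_window_view(padded, (win,) * len(axes), axis=axes)


data = np.arange(18).reshape(3, 3, 2)
windows = rolling_windows(data, 3)
print(windows.shape)     # (3, 3, 2, 3, 3)
print(windows[0, 0, 0])  # the window centered on position (0, 0) of layer 0
```

Only the padded copy is allocated here. The five-dimensional result is a view that reuses that memory through strides, which is why this scales better than building a list of copied windows. As far as I know, base R arrays have no user-facing equivalent of strided views, so treat this as a point of comparison, not something that ports directly.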
https://stackoverflow.com/questions/70624696