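This may be a silly question, but I have a data.frame and built a filter on it, and I do not get the same result from dplyr::filter and from base subsetting when the filter value comes from a variable (a constant). First, an example: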
library(dplyr)
tt <- data.frame(t = runif(100, max = 100)) %>% mutate(period = trunc((t + 3) / 12))
i <- 0
tt %>% filter(period==0)
tt %>% filter(period==i)
tt[tt$period == i,]
The results are equivalent:
> tt %>% filter(period==0)
t period
1 4.047352 0
2 2.391890 0
3 6.050928 0
4 1.646503 0
5 2.335137 0
> tt %>% filter(period==i)
t period
1 4.047352 0
2 2.391890 0
3 6.050928 0
4 1.646503 0
5 2.335137 0
> tt[tt$period == i,]
t period
23 4.047352 0
47 2.391890 0
75 6.050928 0
93 1.646503 0
95 2.335137 0
Then I did the same operation on the real (large) data.frame and did not get the same results.
patch_sparse <- patch_sparse %>% mutate(period = trunc( (t+3) / 12))
str(patch_sparse)
'data.frame': 768307 obs. of 7 variables:
$ t : num 1 1 1 1 1 1 1 1 1 1 ...
$ i : int 2864 2864 2864 2864 2876 2876 2875 2876 2875 2857 ...
$ j : int 3109 3110 3111 3112 3112 3113 3114 3114 3115 3116 ...
$ data : logi TRUE TRUE TRUE TRUE TRUE TRUE ...
$ date : chr "2000-11-01" "2000-11-01" "2000-11-01" "2000-11-01" ...
$ region: chr "Australia" "Australia" "Australia" "Australia" ...
$ period: num 0 0 0 0 0 0 0 0 0 0 ...
#
i <- 0
patch_sparse %>% filter(period==0)
patch_sparse %>% filter(period==i)
patch_sparse[patch_sparse$period == i,]
The result is:
> patch_sparse %>% filter(period==0)
t i j data date region period
1 1 2864 3109 TRUE 2000-11-01 Australia 0
2 1 2864 3110 TRUE 2000-11-01 Australia 0
3 1 2864 3111 TRUE 2000-11-01 Australia 0
...
142 2 3457 1524 TRUE 2000-12-01 Australia 0
[ reached 'max' / getOption("max.print") -- omitted 2346 rows ]
> patch_sparse %>% filter(period==i)
[1] t i j data date region period
<0 rows> (or 0-length row.names)
> patch_sparse[patch_sparse$period == i,]
t i j data date region period
1 1 2864 3109 TRUE 2000-11-01 Australia 0
2 1 2864 3110 TRUE 2000-11-01 Australia 0
3 1 2864 3111 TRUE 2000-11-01 Australia 0
..
142 2 3457 1524 TRUE 2000-12-01 Australia 0
[ reached 'max' / getOption("max.print") -- omitted 2346 rows ]
I tried changing the data.frame to a tibble, and changing trunc() to as.integer(), with similar results, but I could not produce a reproducible example. Any ideas?
Posted on 2020-12-01 14:51:43
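The problem is that your data contains a column named i. Inside a tidyverse pipeline, names are looked up in the data first and only then in the calling environment, so patch_sparse %>% filter(period==i) actually filters for rows where period equals the data's own column i, not your variable i. Since no row has period equal to its i value, you get zero rows. Base subsetting with patch_sparse$period == i does not look inside the data, so it uses your variable.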
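A small reproduction may help make this concrete. The sketch below uses a made-up data frame named df (not the real patch_sparse) whose only relevant feature is that it has a column called i; the values are invented for illustration.

```r
library(dplyr)

# Hypothetical toy data: it has a column named `i`, like patch_sparse does
df <- data.frame(
  i      = c(5L, 6L, 7L),
  period = c(0, 0, 1)
)

i <- 0  # external scalar that shares its name with the column

# Inside filter(), `i` resolves to the column df$i,
# so this compares period with c(5, 6, 7) row by row
masked <- df %>% filter(period == i)

# Base subsetting evaluates `i` in the calling environment,
# so this compares period with the scalar 0
base <- df[df$period == i, ]

nrow(masked)  # 0
nrow(base)    # 2
```

The first example in the question worked only because tt has no column named i, so the lookup fell through to the variable.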
So if you want to filter by an external scalar, make sure the scalar's name differs from every column name in the data, for example:
filter_i <- 0
patch_sparse %>% filter(period==filter_i)
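If renaming the variable is inconvenient, two alternatives are worth knowing, assuming a reasonably recent dplyr (1.0.0 or later): the .env pronoun and the !! injection operator. The sketch below is not part of the original answer and again uses a made-up data frame df in place of patch_sparse.

```r
library(dplyr)

# Hypothetical toy data with a column named `i`
df <- data.frame(i = c(5L, 6L, 7L), period = c(0, 0, 1))
i <- 0

# `.env$i` explicitly asks for the variable in the calling environment,
# skipping the column of the same name
via_env <- df %>% filter(period == .env$i)

# `!!i` injects the current value of the variable `i` (here 0)
# into the expression before filter() evaluates it against the data
via_inject <- df %>% filter(period == !!i)

nrow(via_env)     # 2
nrow(via_inject)  # 2
```

The .env form reads more explicitly and pairs with .data$period if you want to be unambiguous on both sides of the comparison.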
https://stackoverflow.com/questions/65099593