Suppose you have something like this:
Col1 Col2
a odd from 1 to 9
b even from 2 to 14
c even from 30 to 50
...
I want to expand each row by splitting its interval into separate rows, like this:
Col1 Col2
a 1
a 3
a 5
...
b 2
b 4
b 6
...
c 30
c 32
c 34
...
Note that when a row says "even from", both the lower and upper bounds are even as well; the same holds for "odd from".
Answered 2018-02-09 21:58:42
Separate Col2 into individual columns, then create the sequence for each row:
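library(dplyr)
library(tidyr)
DF %>%
  # convert = TRUE turns the "from" and "to" fields into integers
  separate(Col2, into = c("parity", "X1", "from", "X2", "to"), convert = TRUE) %>%
  group_by(Col1) %>%
  # one group per Col1 value, so .$from and .$to are single numbers here
  do(data.frame(Col2 = seq(.$from, .$to, 2))) %>%
  ungroup()
Note 1
The input DF, in reproducible form, is assumed to be: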
DF <- structure(list(Col1 = c("a", "b", "c"), Col2 = c("odd from 1 to 9",
  "even from 2 to 14", "even from 30 to 50")), .Names = c("Col1",
  "Col2"), row.names = c(NA, -3L), class = "data.frame")
Note 2
The next version of tidyr (as of this answer) supports NA in the into vector to mark fields that should be ignored, so the separate statement above can be written as:
separate(Col2, into = c("parity", NA, "from", NA, "to")) %>%
Answered 2018-02-09 22:00:53
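Using the tidyverse (df here is the same input data frame that the first answer calls DF):
library(tidyverse)
df %>%
  # split each string on spaces; fields 3 and 5 hold the lower and upper bounds
  mutate(Col2 = map(str_split(Col2, " "),
                    ~ seq(as.numeric(.x[3]), as.numeric(.x[5]), 2))) %>%
  unnest(Col2)
Or, borrowing separate from @g-grothendieck's solution, which is a bit more readable: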
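df %>%
  # name the five fields "1" to "5"; convert = TRUE makes fields 3 and 5 integers
  separate(Col2, as.character(1:5), convert = TRUE) %>%
  transmute(Col1, Col2 = map2(`3`, `5`, seq, 2)) %>%
  unnest(Col2)
Answered 2018-02-09 22:07:47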
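Here is an option using base R. We use gregexpr/regmatches to extract the numeric elements of 'Col2' into a list, then use seq to build each sequence in steps of 2, and stack the result into a data.frame.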
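Broken into separate steps, the same idea can be sketched as follows. The intermediate names nums, seqs and long are illustrative only and are not part of the original answer, and the input is assumed to be the DF given in the first answer; the compact one-expression version follows.

```r
# Input assumed to match the DF from the first answer
DF <- data.frame(Col1 = c("a", "b", "c"),
                 Col2 = c("odd from 1 to 9", "even from 2 to 14",
                          "even from 30 to 50"),
                 stringsAsFactors = FALSE)

# Step 1: extract the runs of digits from each string (one character vector per row)
nums <- regmatches(DF$Col2, gregexpr("\\d+", DF$Col2))

# Step 2: turn each pair of bounds into a sequence with step 2
seqs <- lapply(nums, function(x) seq(as.numeric(x[1]), as.numeric(x[2]), by = 2))

# Step 3: name the list by Col1 so stack() can use the names as the group column
names(seqs) <- DF$Col1
long <- stack(seqs)  # columns are named "values" and "ind"
```

stack() returns the values first and the group indicator second, which is why the compact version reorders the columns with [2:1] before renaming them.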
res <- stack(setNames(lapply(regmatches(DF$Col2, gregexpr("\\d+", DF$Col2)), function(x)
seq(as.numeric(x[1]), as.numeric(x[2]), by = 2)), DF$Col1))[2:1]
colnames(res) <- colnames(DF)
head(res)
# Col1 Col2
#1 a 1
#2 a 3
#3 a 5
#4 a 7
#5 a 9
#6 b 2
https://stackoverflow.com/questions/48707294