Using df, I am creating a new data frame (final.df) that has one row for every date between startdate and enddate (inclusive) for each row of df.
df <- data.frame(claimid = c("123A",
                             "125B",
                             "151C",
                             "124A",
                             "325C"),
                 startdate = as.Date(c("2018-01-01",
                                       "2017-05-20",
                                       "2017-12-15",
                                       "2017-11-05",
                                       "2018-02-06")),
                 enddate = as.Date(c("2018-01-06",
                                     "2017-06-21",
                                     "2018-01-02",
                                     "2017-11-15",
                                     "2018-02-18")))

The nested function below is what I currently use to create final.df, but when looping over hundreds of thousands of claims, this approach takes hours to run. I am looking for alternatives that create final.df more efficiently.

claim_level <- function(a) {
  specific_row <- df[a, ]
  dates <- seq(specific_row$startdate, specific_row$enddate, by="days")
  day_level <- function(b) {
    day <- dates[b]
    data.frame(claimid = specific_row$claimid, date = day)
  }
  do.call("rbind", lapply(c(1:length(dates)), function(b) day_level(b)))
}
final.df <- do.call("rbind", lapply(c(1:nrow(df)), function(a) claim_level(a)))
print(subset(final.df, claimid == "123A"))
#claimid date
#123A 2018-01-01
#123A 2018-01-02
#123A 2018-01-03
#123A 2018-01-04
#123A 2018-01-05
#123A 2018-01-06

https://stackoverflow.com/questions/50648084
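
For reference, one possible direction is to avoid building a one-row data frame per day and instead compute all rows at once with vectorized base R functions. The following is only a minimal sketch, not a benchmarked solution: it assumes every enddate is on or after its startdate and that neither column contains NA, and the helper name n_days is introduced here purely for illustration.

```r
# Same example data as in the question.
df <- data.frame(
  claimid   = c("123A", "125B", "151C", "124A", "325C"),
  startdate = as.Date(c("2018-01-01", "2017-05-20", "2017-12-15",
                        "2017-11-05", "2018-02-06")),
  enddate   = as.Date(c("2018-01-06", "2017-06-21", "2018-01-02",
                        "2017-11-15", "2018-02-18"))
)

# Number of days covered by each claim, counting both endpoints.
# Assumption: enddate >= startdate for every row and no NA values.
n_days <- as.integer(df$enddate - df$startdate) + 1L

# Repeat each claim id and start date once per covered day, then add a
# 0-based day offset that restarts for every claim.
final.df <- data.frame(
  claimid = rep(df$claimid, times = n_days),
  date    = rep(df$startdate, times = n_days) + (sequence(n_days) - 1L)
)

print(subset(final.df, claimid == "123A"))
```

This replaces the two nested lapply/rbind loops with a few vector operations, so the number of data frames created no longer grows with the number of claims or days. Whether it is fast enough for hundreds of thousands of claims would need to be measured on the real data.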