I have a large data frame with the following general format:
week1 <- c("2.30", "14.10", "5.60")
week2 <- c("NA", "13.95", "NA")
week3 <- c("NA", "14.15", "5.30")
week4 <- c("2.30", "NA", "5.60")
week5 <- c("2.25", "14.10", "5.55")
week6 <- c("2.00", "14.00", "NA")
week7 <- c("1.95", "14.15", "5.60")
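df <- data.frame(week1, week2, week3, week4, week5, week6, week7)

Now I am trying to fill in the NAs using a row-wise moving average, where each average should be based on 4 observations at a time, without using loops. Ideally, I would be able to work from left to right and also in the reverse direction (in a second operation).
I am new to programming and would really appreciate any help!
Posted on 2021-02-12 04:25:59
This is a somewhat hacky way to do it. I'm not sure whether your data is numeric or character, as in the weeks you entered, but this works either way. There will be a warning from mutate(value = as.numeric(value)), but you can ignore it, and you shouldn't see it at all if your data is actually numeric.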
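# Packages used below: dplyr, tidyr, tibble (zoo must also be installed)
library(dplyr)
library(tidyr)
library(tibble)

df %>%
  rownames_to_column("id_col") %>%        # keep an identifier for each original row
  gather(week, value, -1) %>%             # reshape to long format: one row per (id, week)
  mutate(value = as.numeric(value)) %>%   # convert character values to numeric
  group_by(id_col) %>%
  # rolling mean over 4 observations per original row, ignoring NAs inside each window
  mutate(value_no_na = zoo::rollapply(value, na.rm=TRUE, FUN="mean", width=4, fill=NA, align = "center")) %>%
  # the rolling mean is NA at the edges; fill those from the nearest computed value
  tidyr::fill(value_no_na, .direction = "up") %>%
  tidyr::fill(value_no_na, .direction = "down") %>%
  ungroup() %>%
  mutate(value = ifelse(is.na(value), value_no_na, value)) %>%  # replace only the missing values
  select(-value_no_na) %>%
  spread(week, value) %>%                 # reshape back to wide format
  select(-id_col)

For the reverse order, you can do this: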
df %>%
  select(ncol(df):1) %>%                  # reverse the column order first
  rownames_to_column("id_col") %>%
  gather(week, value, -1) %>%
  mutate(value = as.numeric(value)) %>%
  group_by(id_col) %>%
  mutate(value_no_na = zoo::rollapply(value, na.rm=TRUE, FUN="mean", width=4, fill=NA, align = "center")) %>%
  tidyr::fill(value_no_na, .direction = "up") %>%
  tidyr::fill(value_no_na, .direction = "down") %>%
  ungroup() %>%
  mutate(value = ifelse(is.na(value), value_no_na, value)) %>%
  select(-value_no_na) %>%
  spread(week, value) %>%
  select(-id_col)

Finally, you can adjust the align argument of rollapply to decide whether the moving average is left-aligned, right-aligned, or centered.
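
To see what the alignment setting of rollapply changes, here is a minimal sketch on a made-up vector (x, left_avg, right_avg and center_avg are illustrative names, not part of the question's data). It assumes the zoo package is installed and uses the same width of 4 as above.

```r
library(zoo)

# Toy input, chosen only for illustration
x <- c(1, 2, 3, 4, 5, 6)

# Left-aligned: each result is the mean of the current value and the 3 after it
left_avg <- rollapply(x, width = 4, FUN = mean, fill = NA, align = "left")

# Right-aligned: each result is the mean of the current value and the 3 before it
right_avg <- rollapply(x, width = 4, FUN = mean, fill = NA, align = "right")

# Centered: the window is placed around the current value
center_avg <- rollapply(x, width = 4, FUN = mean, fill = NA, align = "center")

print(left_avg)    # 2.5 3.5 4.5  NA  NA  NA
print(right_avg)   #  NA  NA  NA 2.5 3.5 4.5
print(center_avg)
```

The positions where a full window of 4 does not fit come back as NA (because of fill = NA), which is why the answer follows the rolling mean with tidyr::fill in both directions.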
https://stackoverflow.com/questions/66159732