如果我的数据帧看起来像这样
employee <- c('John','Peter','Jolie', 'Susan', 'Mark', 'David')
a <- c(21000, NA, 26800, NA, NA ,NA)
b <- c(NA, 35000, NA, NA, NA, NA)
c <- c(NA,NA,NA,40000, 20000, 45000)
d <- as.Date(c('2010-11-1','2008-3-25',NA, NA,'2007-3-14', NA))
e <- as.Date(c(NA, NA, NA,'2014-6-1', NA,'2017-10-15'))
f <- as.Date(c( NA, NA,'2010-3-20',NA, NA, NA))
employ.data <- data.frame(employee, a,b,c,d,e,f)有没有办法强制数据删除NAs并将其整合到三列?所以它看起来像这样:
John 21000 2010-11-1
Peter 35000 2008-3-25
Jolie 26800 2010-3-20
Susan 40000 2014-6-1
Mark 20000 2007-3-14
David 45000 2017-10-15发布于 2021-11-22 21:33:18
一个选项是分别从dplyr - coalesce列a到c和d到f执行coalesce
library(dplyr)
employ.data %>%
transmute(employee, a = coalesce(a, b, c),
date = coalesce(d, e, f))-output
employee a date
1 John 21000 2010-11-01
2 Peter 35000 2008-03-25
3 Jolie 26800 2010-03-20
4 Susan 40000 2014-06-01
5 Mark 20000 2007-03-14
6 David 45000 2017-10-15或者使用base R
type.convert(as.data.frame(t(apply(employ.data, 1,
FUN = function(x) na.omit(x)))), as.is = TRUE)发布于 2021-11-22 22:01:14
这是一个基础R解决方案,也可以适应数据帧中的上述更改。也许就性能而言,这不是最好的解决方案,但它应该是可行的
data_return <- data.frame(matrix(NA, nrow(employ.data), 3))
i_lauf <- 1
tmp <- ""
for(i in 1:nrow(employ.data)) # i=1
{
if( (ncol(employ.data) - sum(is.na(employ.data[i,]), na.rm = FALSE)) == 3) # chec for 3 entries
{
tmp <- ""
for(j in 1:ncol(employ.data))
{
if( is.na(employ.data[i,j]) == FALSE)
{
if(j == 1)
{
tmp <- paste(employ.data[i,j])
}else{
tmp <- paste(tmp, employ.data[i,j], sep = "_")
}
i_lauf <- i_lauf + 1
}
}
data_return[i, ] <- data.frame(matrix(strsplit(tmp, "_")[[1]], 1, 3))
}
}问候
https://stackoverflow.com/questions/70072603
复制相似问题