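I have a data frame named Resultaat:
Cluster Number
W63 1020 NA NA NA 1100
W50 1020 NA 1240 NA NA
I want to remove all the NA values so that only the numbers remain. The columns are defined as character.
Expected output:
Cluster Number
W63 1020 1100
W50 1020 1240
I tried things like gsub("^NA(?:\\s+NA)*\\b\\s*|\\s*\\bNA(?:\\s+NA)*$", "", Resultaat$Number) and Resultaat <- Resultaat[!is.na(Resultaat)], but neither had the desired effect.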
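A likely reason the is.na() attempt has no effect is that each NA in this column is the two-letter text "NA" inside a longer character string, not a missing value. The sketch below rebuilds the question's data (assumed here to be a plain data.frame with character columns) to illustrate the difference:

```r
# Same data as in the question; "NA" here is text inside each string.
Resultaat <- data.frame(
  Cluster = c("W63", "W50"),
  Number  = c("1020 NA NA NA 1100", "1020 NA 1240 NA NA"),
  stringsAsFactors = FALSE
)

# is.na() tests for missing values; neither string is missing.
is.na(Resultaat$Number)
# [1] FALSE FALSE

# The literal token "NA" only shows up once each string is split on whitespace.
tokens <- strsplit(Resultaat$Number, "\\s+")
lapply(tokens, function(x) x == "NA")
```

So the NA tokens have to be removed at the string level (by regex or by splitting), not with is.na() on the data frame.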
Posted on 2022-09-22 15:03:42
Here is one option: read the 'Number' column with read.table, then unite all the resulting columns, dropping the NA elements with na.rm = TRUE.
library(tidyr)
library(dplyr)
# read.table() treats the string "NA" as missing by default (na.strings = "NA")
read.table(text = Resultaat$Number, header = FALSE, fill = TRUE) %>%
  unite(Number, everything(), na.rm = TRUE, sep = " ") %>%
  bind_cols(Resultaat[1], .)
Output:
Cluster Number
1 W63 1020 1100
2 W50 1020 1240
As for gsub, it could be:
gsub("\\s+NA|NA\\s+|NA$|^NA", "", Resultaat$Number)
[1] "1020 1100" "1020 1240"
A tidyverse approach also works:
library(dplyr)
library(tidyr)
library(stringr)
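Resultaat %>%
  separate_rows(Number) %>%
  # Turn the literal string "NA" into a real missing value in the Number column
  # (calling na_if() on a whole data frame is not accepted by dplyr >= 1.1.0)
  mutate(Number = na_if(Number, "NA")) %>%
  drop_na() %>%
  group_by(Cluster) %>%
  summarise(Number = str_c(Number, collapse = " "))
Output: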
# A tibble: 2 × 2
Cluster Number
<chr> <chr>
1 W50 1020 1240
2 W63 1020 1100
Data:
Resultaat <- structure(list(Cluster = c("W63", "W50"),
    Number = c("1020 NA NA NA 1100", "1020 NA 1240 NA NA")),
    class = "data.frame", row.names = c(NA, -2L))

Posted on 2022-09-22 15:17:16
Assuming all the numbers and NAs are separated by spaces:
library("tidyverse")
Resultaat$Number <- Resultaat$Number %>%
  str_split(pattern = " ") %>%
  map_chr(~ paste(.x[.x != "NA"], collapse = " "))

Posted on 2022-09-22 20:25:40
Here is a base R option using regmatches with a pattern:
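transform(
  Resultaat,
  Number = sapply(
    regmatches(
      Number,
      # "[^(NA) ]+" is a character class: runs of characters other than
      # "(", "N", "A", ")" and space, which here are the digit tokens
      gregexpr("[^(NA) ]+", Number)
    ),
    paste0,
    collapse = " "
  )
)
This gives: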
Cluster Number
1 W63 1020 1100
2 W50 1020 1240
https://stackoverflow.com/questions/73816863
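The pattern "[^(NA) ]+" in the last answer is a character class, so it excludes the individual characters N and A rather than the token "NA"; that is fine for digit-only tokens but would also cut up a value such as "NAME". As a hedged alternative, here is a minimal base R sketch that compares whole tokens instead. The helper name drop_na_tokens is hypothetical and not part of any answer above, and the sketch assumes tokens are separated by whitespace:

```r
# Data as given in the first answer.
Resultaat <- data.frame(
  Cluster = c("W63", "W50"),
  Number  = c("1020 NA NA NA 1100", "1020 NA 1240 NA NA"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: split on whitespace, drop tokens that are exactly "NA",
# and paste the remaining tokens back together with single spaces.
drop_na_tokens <- function(x) {
  vapply(
    strsplit(x, "\\s+"),
    function(tok) paste(tok[tok != "NA"], collapse = " "),
    character(1)
  )
}

Resultaat$Number <- drop_na_tokens(Resultaat$Number)
Resultaat
```

Because only exact "NA" tokens are dropped, drop_na_tokens("NAME 10 NA") keeps "NAME" intact and returns "NAME 10".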