I'm fairly new to R and am having trouble merging rows based on matching values in several columns. I have the following dataset:
LAST_NAME FIRST_NAME INTERVAL VISIT_DATE MFQ_1 MFQ_2 MFQ_3 Handedness ARI_1 ARI_2 ARI_4 ARI_COMPLETED_BY
Doe Jane Interval 1 1/1/99 4 6 2 Na Na Na Na Na
Doe Jane Interval 1 1/1/99 Na Na Na Right-Handed Na Na Na Na
Doe Jane Interval 1 1/1/99 Na Na Na Na 4 2 2 Dad
Doe Jane Interval 2 2/4/04 Na Na Na Right-Handed Na Na Na Na
Doe Jane Interval 2 2/4/04 5 6 3 Na Na Na Na Na
Doe Jane Interval 2 2/4/04 Na Na Na Na 4 5 5 Mom
Smith Joe Interval 1 3/1/01 5 1 7 Na Na Na Na Na
Smith Joe Interval 1 3/1/01 Na Na Na Left-Handed Na Na Na Na
Smith Joe Interval 1 3/1/01 Na Na Na Na 8 8 2 Dad
Smith Joe Interval 2 5/4/09 Na Na Na Na 8 5 4 Dad
Smith Joe Interval 2 5/4/09 7 2 8 Na Na Na Na Na
Smith Joe Interval 2 5/4/09 Na Na Na Left-Handed Na Na Na Na

I'd like to merge the rows by name/interval/date so that the result looks like this:
LAST_NAME FIRST_NAME INTERVAL VISIT_DATE MFQ_1 MFQ_2 MFQ_3 Handedness ARI_1 ARI_2 ARI_4 ARI_COMPLETED_BY
Doe Jane Interval 1 1/1/99 4 6 2 Right-Handed 4 2 2 Dad
Doe Jane Interval 2 2/4/04 5 6 3 Right-Handed 4 5 5 Mom
Smith Joe Interval 1 3/1/01 5 1 7 Left-Handed 8 8 2 Dad
Smith Joe Interval 2 5/4/09 7 2 8 Left-Handed 8 5 4 Dad

I have tried the following code:
CTDB %>% group_by(LAST_NAME:VISIT_DATE) %>% summarise_all(funs(na.omit(.)))

but I get the following error:
Error in mutate_impl(.data, dots) : Evaluation error: NA/NaN argument.
In addition: Warning messages:
1: In LAST_NAME:VISIT_DATE :
numerical expression has 3326 elements: only the first used
2: In LAST_NAME:VISIT_DATE :
numerical expression has 3326 elements: only the first used
3: In evalq(LAST_NAME:VISIT_DATE, <environment>) :
NAs introduced by coercion
4: In evalq(LAST_NAME:VISIT_DATE, <environment>) :
NAs introduced by coercion

I'm not sure how to fix this to get the result I want. Any help would be greatly appreciated!
Posted on 2018-02-06 05:35:59
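First, you need to replace the "Na" strings with explicit NA values:
CTDB[CTDB == "Na"] <- NA

You also can't use : inside group_by(), so list the columns you want to group by explicitly. Then wrap na.omit() in first(): na.omit() is not an aggregate function on its own, so it does not tell dplyr how to reduce each group to a single value.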
CTDB %>% group_by(LAST_NAME, FIRST_NAME, INTERVAL, VISIT_DATE) %>%
  summarize_all(funs(first(na.omit(.))))

https://stackoverflow.com/questions/48629735
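For anyone on a newer dplyr: funs() was deprecated in dplyr 0.8.0, and across() (dplyr 1.0.0 and later) is the usual replacement for the summarise_all() family. Below is a minimal, self-contained sketch of the same idea under those assumptions. The toy data frame is made up for illustration and keeps only a subset of the question's columns, and the first non-missing value is picked with plain indexing (.x[!is.na(.x)][1]) rather than first(na.omit(.)).

```r
library(dplyr)

# Toy data mirroring the question's layout (only some of its columns).
# Missing values are stored as the string "Na", as in the question.
CTDB <- data.frame(
  LAST_NAME  = c("Doe", "Doe", "Doe", "Smith", "Smith", "Smith"),
  FIRST_NAME = c("Jane", "Jane", "Jane", "Joe", "Joe", "Joe"),
  INTERVAL   = "Interval 1",
  VISIT_DATE = c("1/1/99", "1/1/99", "1/1/99", "3/1/01", "3/1/01", "3/1/01"),
  MFQ_1      = c("4", "Na", "Na", "5", "Na", "Na"),
  Handedness = c("Na", "Right-Handed", "Na", "Na", "Left-Handed", "Na"),
  ARI_1      = c("Na", "Na", "4", "Na", "Na", "8"),
  stringsAsFactors = FALSE
)

# Step 1: turn the "Na" strings into real missing values.
CTDB[CTDB == "Na"] <- NA

# Step 2: group by the identifying columns (listed explicitly, no ":"),
# then keep the first non-missing value of every other column per group.
# If a group has no non-missing value in a column, the result is NA.
merged <- CTDB %>%
  group_by(LAST_NAME, FIRST_NAME, INTERVAL, VISIT_DATE) %>%
  summarise(across(everything(), ~ .x[!is.na(.x)][1]), .groups = "drop")

print(merged)
```

Note that columns such as MFQ_1 stay character here because they were read in alongside the "Na" strings; convert them afterwards if you need numbers. This also assumes each group has at most one non-missing value per column. If there can be several, only the first one is kept.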