
Create a new column in R when two columns meet a certain condition

Stack Overflow user
Asked 2022-01-02 08:24:27
Answers 3, Views 58, Followers 0, Votes 1

My data looks like this:

country    supporter1   supporter2   supporter3  supporter4    supporter5    
USA           Albania     Germany        USA           NA           NA
France        USA         France         NA            NA           NA
UK            UK          Chile          Peru          NA           NA
Germany       USA         Iran           Mexico        India        Pakistan
USA           China       Spain          NA            NA           NA
Cuba          Cuba        UK             Germany       South Korea  NA
China         Russia      NA             NA            NA           NA

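I want to create a new variable that flags rows where the country column matches any of the supporter columns (supporter1, supporter2, supporter3, supporter4, supporter5). For example, in the second row country is France and supporter2 is also France. In that case the new variable should be 1, and 0 otherwise.

I would like to end up with this: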

country    supporter1   supporter2   supporter3  supporter4    supporter5      new variable  
USA          Albania     Germany        USA           NA           NA               1
France       USA         France         NA            NA           NA               1
UK           UK          Chile          Peru          NA           NA               1
Germany      USA         Iran           Mexico        India        Pakistan         0
USA          China       Spain          NA            NA           NA               0         
Cuba         Cuba        UK             Germany       South Korea  NA               1
China        Russia      NA             NA            NA           NA               0
3 Answers

Stack Overflow user

Accepted answer

Posted 2022-01-02 09:03:41

Update: a dplyr-only solution using if_any

library(dplyr)
df %>% 
  rowwise() %>% 
  mutate(new_var = as.integer(as.logical(if_any(starts_with("supporter"), ~ . %in% country))))

Output:

  country supporter1 supporter2 supporter3 supporter4  supporter5 new_var
  <chr>   <chr>      <chr>      <chr>      <chr>       <chr>        <int>
1 USA     Albania    Germany    USA        NA          NA               1
2 France  USA        France     NA         NA          NA               1
3 UK      UK         Chile      Peru       NA          NA               1
4 Germany USA        Iran       Mexico     India       Pakistan         0
5 USA     China      Spain      NA         NA          NA               0
6 Cuba    Cuba       UK         Germany    South Korea NA               1
7 China   Russia     NA         NA         NA          NA               0
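
The if_any call above relies on a data frame df that the answer does not define. As a minimal sketch, assuming the question's table is stored with real NA values, the same check can also be written without rowwise() by comparing each supporter column to country element by element. The data construction and the name result are assumptions added for illustration.

```r
library(dplyr)

# Example data rebuilt from the question (assumption: missing cells are real NA)
df <- data.frame(
  country    = c("USA", "France", "UK", "Germany", "USA", "Cuba", "China"),
  supporter1 = c("Albania", "USA", "UK", "USA", "China", "Cuba", "Russia"),
  supporter2 = c("Germany", "France", "Chile", "Iran", "Spain", "UK", NA),
  supporter3 = c("USA", NA, "Peru", "Mexico", NA, "Germany", NA),
  supporter4 = c(NA, NA, NA, "India", NA, "South Korea", NA),
  supporter5 = c(NA, NA, NA, "Pakistan", NA, NA, NA),
  stringsAsFactors = FALSE
)

# Column-wise comparison: each supporter column is compared to country as a
# whole vector; !is.na(.x) turns NA comparisons into FALSE instead of NA
result <- df %>%
  mutate(new_var = as.integer(
    if_any(starts_with("supporter"), ~ !is.na(.x) & .x == country)
  ))

result$new_var
# [1] 1 1 1 0 0 1 0
```

Skipping rowwise() keeps the operation vectorized, which may matter on larger tables.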

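First answer (also correct). Here is one possible solution:

  1. Process the data rowwise.
  2. Across the columns supporter1 to supporter5, check whether each value matches country and store the result (1 or 0) in new columns.
  3. Unite the new columns into one, then use an ifelse statement to set it to 1 if any of them is 1, and 0 otherwise.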

library(dplyr)
library(stringr)
library(tidyr)

df %>% 
  rowwise() %>% 
  mutate(across(supporter1:supporter5, ~ifelse(. %in% country, 1,0), .names = "new_{col}")) %>% 
  unite(New_Col, starts_with('new'), na.rm = TRUE, sep = ' ') %>% 
  mutate(New_Col = ifelse(str_detect(New_Col,  "1"), 1,0))

Output:

  country supporter1 supporter2 supporter3 supporter4  supporter5 New_Col
  <chr>   <chr>      <chr>      <chr>      <chr>       <chr>        <dbl>
1 USA     Albania    Germany    USA        NA          NA               1
2 France  USA        France     NA         NA          NA               1
3 UK      UK         Chile      Peru       NA          NA               1
4 Germany USA        Iran       Mexico     India       Pakistan         0
5 USA     China      Spain      NA         NA          NA               0
6 Cuba    Cuba       UK         Germany    South Korea NA               1
7 China   Russia     NA         NA         NA          NA               0
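
The rowwise steps above can probably be shortened. A minimal sketch, assuming the same data with real NA values: c_across() collects the supporter values of the current row into one vector, so a single %in% test replaces the intermediate columns, unite() and str_detect(). This is an alternative written for illustration, not the answerer's code; the data construction and the name result are assumptions.

```r
library(dplyr)

# Example data rebuilt from the question (assumption: missing cells are real NA)
df <- data.frame(
  country    = c("USA", "France", "UK", "Germany", "USA", "Cuba", "China"),
  supporter1 = c("Albania", "USA", "UK", "USA", "China", "Cuba", "Russia"),
  supporter2 = c("Germany", "France", "Chile", "Iran", "Spain", "UK", NA),
  supporter3 = c("USA", NA, "Peru", "Mexico", NA, "Germany", NA),
  supporter4 = c(NA, NA, NA, "India", NA, "South Korea", NA),
  supporter5 = c(NA, NA, NA, "Pakistan", NA, NA, NA),
  stringsAsFactors = FALSE
)

result <- df %>%
  rowwise() %>%
  # c_across() returns the row's supporter values as a character vector
  mutate(New_Col = as.integer(country %in% c_across(starts_with("supporter")))) %>%
  ungroup()  # drop the rowwise grouping so later verbs act on whole columns

result$New_Col
# [1] 1 1 1 0 0 1 0
```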
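Votes 3

Stack Overflow user

Posted 2022-01-02 09:16:36

Here is a base R solution.

First, mapply checks each supporter* column for equality with country. Comparisons involving NA are made to return FALSE. Then rowSums and as.integer convert rows with at least one TRUE to 1, and all other rows to 0.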

eq <- mapply(\(x, y){x == y & !is.na(x)}, df1[-1], df1[1])
as.integer(rowSums(eq) != 0)
#[1] 1 1 1 0 0 1 0

df1$new_variable <- as.integer(rowSums(eq) != 0)
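
The same comparison can likely be done without mapply. A minimal sketch, assuming real NA values in the supporter columns: comparing a data frame with a vector of length nrow() applies the comparison column by column, and na.rm = TRUE in rowSums ignores the NA results. The data is rebuilt here so the block runs on its own.

```r
# Example data rebuilt from the question (assumption: missing cells are real NA)
df1 <- data.frame(
  country    = c("USA", "France", "UK", "Germany", "USA", "Cuba", "China"),
  supporter1 = c("Albania", "USA", "UK", "USA", "China", "Cuba", "Russia"),
  supporter2 = c("Germany", "France", "Chile", "Iran", "Spain", "UK", NA),
  supporter3 = c("USA", NA, "Peru", "Mexico", NA, "Germany", NA),
  supporter4 = c(NA, NA, NA, "India", NA, "South Korea", NA),
  supporter5 = c(NA, NA, NA, "Pakistan", NA, NA, NA),
  stringsAsFactors = FALSE
)

# Logical matrix: TRUE where a supporter equals the row's country, NA where
# the supporter is missing
eq <- df1[-1] == df1$country

# A row is flagged when at least one comparison is TRUE; NA entries are skipped
df1$new_variable <- as.integer(rowSums(eq, na.rm = TRUE) > 0)

df1$new_variable
# [1] 1 1 1 0 0 1 0
```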

Data

df1 <- read.table(text = "
country    supporter1   supporter2   supporter3  supporter4    supporter5    
USA           Albania     Germany        USA           NA           NA
France        USA         France         NA            NA           NA
UK            UK          Chile          Peru          NA           NA
Germany       USA         Iran           Mexico        India        Pakistan
USA           China       Spain          NA            NA           NA
Cuba          Cuba        UK             Germany       'South Korea'  NA
China         Russia      NA             NA            NA           NA
", header = TRUE)
Votes 2

Stack Overflow user

Posted 2022-01-02 09:12:40

Another solution is to check, for each row, whether country appears among that row's supporter values.

df <- data.frame(country=c("USA","France","UK","Germany","USA","Cuba","China"),
supporter1=c("Albania","USA","UK","USA","China","Cuba","Russia"),
supporter2=c("Germany","France","Chile","Iran","Spain","UK","NA"),  
supporter3=c("USA","NA","Peru","Mexico","NA","Germany","NA"),
supporter4=c("NA","NA","NA","India","NA","South Korea","NA"),   
supporter5=c("NA","NA","NA","Pakistan","NA","NA","NA"))

This gives:

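# For each row, test whether its country appears among columns 2 to 6 (the supporters)
df$new <- sapply(seq_len(nrow(df)), function(x) ifelse(df$country[x] %in% unlist(df[x, 2:6]), 1, 0))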
> df$new
[1] 1 1 1 0 0 1 0
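
Note that the data frame in this answer stores missing entries as the literal string "NA" rather than as real NA values. A minimal sketch, assuming you would rather work with real missing values: convert the strings first, then run the same row-by-row membership test. The name flag and the use of vapply are choices made for illustration.

```r
# Data as in the answer above, with missing entries stored as the string "NA"
df <- data.frame(
  country    = c("USA", "France", "UK", "Germany", "USA", "Cuba", "China"),
  supporter1 = c("Albania", "USA", "UK", "USA", "China", "Cuba", "Russia"),
  supporter2 = c("Germany", "France", "Chile", "Iran", "Spain", "UK", "NA"),
  supporter3 = c("USA", "NA", "Peru", "Mexico", "NA", "Germany", "NA"),
  supporter4 = c("NA", "NA", "NA", "India", "NA", "South Korea", "NA"),
  supporter5 = c("NA", "NA", "NA", "Pakistan", "NA", "NA", "NA"),
  stringsAsFactors = FALSE
)

# Replace the placeholder strings with real missing values
df[df == "NA"] <- NA

# Row-by-row membership test; vapply guarantees an integer vector result
flag <- vapply(
  seq_len(nrow(df)),
  function(i) as.integer(df$country[i] %in% unlist(df[i, -1])),
  integer(1)
)

flag
# [1] 1 1 1 0 0 1 0
```

Because country itself is never missing here, %in% returns FALSE rather than NA when a supporter cell is NA, so the flags are unchanged by the conversion.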
Votes 1

Original page content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/70554560
