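When working with a data frame, I want to modify the values in one column based on the values in another column. Here is my reproducible code: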
# four items
items <- c("coke", "tea", "shampoo","aspirin")
# scores for each item
score <- as.numeric(c(65,30,45,20))
# making a data frame of the two vectors created
df <- as.data.frame(cbind(items,score))
# score for coke is 65 and for tea it is 30. I want to
# double score for tea OR coke if the score is below 50
ifelse(df$score[df$items %in% c("coke", "tea")] < 50, df$score*2, df$score)
# the above returns NULL values with a warning
# the statement df$score[df$items %in% c("coke", "tea")] does pull the coke and tea scores
df$score[df$items %in% c("coke", "tea")]
Thanks very much for your help.
Posted on 2015-12-16 18:45:05
This should do it:
items <- c("coke", "tea", "shampoo","aspirin")
# scores for each item
score <- as.numeric(c(65,30,45,20))
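Try using data.frame instead of as.data.frame(cbind(...)). cbind() first builds a character matrix, so with the latter approach the score values are converted to factors (or, from R 4.0.0 on, to character strings) instead of staying numeric.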
# making a data frame of the two vectors created
df <- data.frame(items, score)
df
items score
1 coke 65
2 tea 30
3 shampoo 45
4 aspirin 20
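To make the type conversion concrete, here is a small sketch that reuses the vectors from the question. The names m, bad and good are only illustrative.

```r
items <- c("coke", "tea", "shampoo", "aspirin")
score <- c(65, 30, 45, 20)

# cbind() builds a matrix, and a matrix holds a single type,
# so the numbers are coerced to character strings here.
m <- cbind(items, score)
is.character(m)          # TRUE

# as.data.frame() therefore never sees a numeric column.
bad <- as.data.frame(m)
is.numeric(bad$score)    # FALSE

# data.frame() keeps each vector's own type.
good <- data.frame(items, score)
is.numeric(good$score)   # TRUE
```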
# score for coke is 65 and for tea it is 30. I want to
# double score for tea OR coke if the score is below 50
df$score[df$items %in% c("coke", "tea")] = ifelse(df$score[df$items %in% c("coke", "tea")] < 50, df$score*2, df$score)
df
items score
1 coke 65
2 tea 60
3 shampoo 45
4 aspirin 20
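However, this method only works while the matching rows happen to be the first rows of the data frame. It breaks, for example, as soon as a duplicate entry is added further down.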
# New data with an added entry for item = coke and score = 15:
items <- c("coke", "tea", "shampoo","aspirin","coke")
# scores for each item
score <- c(65,30,45,20,15)
# making a data frame of the two vectors created
df <- data.frame(items, score)
# using the method from above, the last entry gets converted to 90
# instead of 30
df$score[df$items %in% c("coke", "tea")] = ifelse(df$score[df$items %in% c("coke", "tea")] < 50, df$score*2, df$score)
df
items score
1 coke 65
2 tea 60
3 shampoo 45
4 aspirin 20
5 coke 90
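The 90 appears to come from a length mismatch: the test passed to ifelse() is subsetted, but the yes and no arguments are the full columns. The sketch below illustrates this reading and one possible repair that keeps ifelse(). The names sel and fixed are illustrative.

```r
items <- c("coke", "tea", "shampoo", "aspirin", "coke")
score <- c(65, 30, 45, 20, 15)
df <- data.frame(items, score)

sel <- df$items %in% c("coke", "tea")   # rows 1, 2 and 5

# The test has 3 elements, but yes/no are the full 5-element columns,
# so ifelse() pairs the third test value (row 5) with df$score[3] (shampoo).
length(df$score[sel] < 50)   # 3
length(df$score * 2)         # 5

# Subsetting all three arguments with the same index keeps them aligned.
fixed <- ifelse(df$score[sel] < 50, df$score[sel] * 2, df$score[sel])
df$score[sel] <- fixed
df$score   # 65 60 45 20 30
```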
So if there is any chance of duplicate entries, use this method instead:
df <- data.frame(items, score)
df$score[df$items %in% c("coke", "tea") & df$score < 50] <- 2* df$score[df$items %in% c("coke", "tea") & df$score < 50]
df
items score
1 coke 65
2 tea 60
3 shampoo 45
4 aspirin 20
5 coke 30
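Because the same logical index appears on both sides of the assignment, it can be stored once and reused. This is a minimal sketch of that variation. The name hit is illustrative.

```r
items <- c("coke", "tea", "shampoo", "aspirin", "coke")
score <- c(65, 30, 45, 20, 15)
df <- data.frame(items, score)

# Build the row selector once: item is coke or tea AND score is below 50.
hit <- df$items %in% c("coke", "tea") & df$score < 50

# Double only the selected scores; every other row is left untouched.
df$score[hit] <- 2 * df$score[hit]
df$score   # 65 60 45 20 30
```

Computing the index once also guarantees that the left-hand and right-hand sides select exactly the same rows.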
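Posted on 2015-12-16 18:43:34
Your problem doesn't need an if statement. You can simply combine two logical statements.
Logical 1: df$items %in% c("coke", "tea")
Logical 2: df$score < 50
By filtering the data frame on both logical statements, you can multiply the matching scores. Use & for and, | for or.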
# both conditions must hold (coke or tea, and score below 50), so combine them with &
df$score[df$items %in% c("coke", "tea") & df$score < 50] <- 2* df$score[df$items %in% c("coke", "tea") & df$score < 50]
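To see how the two operators differ on the question's data, the following sketch evaluates each logical statement separately and then combines them both ways. The names in_items and low are illustrative.

```r
items <- c("coke", "tea", "shampoo", "aspirin")
score <- c(65, 30, 45, 20)
df <- data.frame(items, score)

in_items <- df$items %in% c("coke", "tea")   # TRUE  TRUE FALSE FALSE
low      <- df$score < 50                    # FALSE TRUE  TRUE  TRUE

# & keeps only rows where both statements are TRUE: just tea.
in_items & low   # FALSE TRUE FALSE FALSE

# | keeps rows where at least one statement is TRUE: every row here.
in_items | low   # TRUE TRUE TRUE TRUE
```

For the stated goal (double coke or tea only when the score is below 50), & is the combination that selects the intended rows.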
Posted on 2015-12-16 18:50:19
items <- c("coke", "tea", "shampoo","aspirin")
score <- as.numeric(c(65,30,45,20))
If you call data.frame() as shown below, you avoid converting the score column to a factor.
df <- data.frame(items=items,score=score)
You don't need an if statement. You can simply extract the values you are interested in based on two logical statements:
df[df$score<50 & df$items %in% c("coke", "tea"), "score"] <- 2 * df[df$score<50 & df$items %in% c("coke", "tea"), "score"]
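The expression
df$score<50 & df$items %in% c("coke", "tea")
selects the rows that match both conditions, i.e. the item is coke or tea and the score is less than 50. "score" selects only the score column. The statement on the right-hand side of <- extracts the same values and multiplies them by 2.
https://stackoverflow.com/questions/34319562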