Using any() in an if statement inside a loop over a data frame

Stack Overflow user
Asked 2018-03-19 18:15:37
Answers: 3 · Views: 421 · Followers: 0 · Votes: 0

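I have a problem using the any function on strings in an if/else statement. Note that print("A") in the code is only an example; if a column contains certain values, I need to perform a series of operations.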

Randomly generated data:

level=c("Strongly Agree", "Agree", "Neither agree or disagree","Disagree", "Strongly disagree",NA)
df <- data.frame(pre_1=as.character(sample(c("Yes","No", NA), 30, replace = T)), 
                 pre_2=as.character(sample(level, 30, replace = T)),
                 post_1=as.character(sample(level, 30, replace = T)),
                 post_2=as.character(sample(c("<90%", "0-80%", ">90", NA), 30, replace = T)),
                 stringsAsFactors=T)

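Select the required part of the data ("post_") and print a statement depending on the values in each column. In this example, I need to print "A" for every column that contains any of these values: "Strongly Agree", "Agree", "Neither agree or disagree", "Disagree", "Strongly disagree".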

select(df, starts_with("post_")) %>% 
  length() %>% 
  seq(1,.,1)  %>% 
  for (i in .){
      if (any(c("Neither agree or disagree") == (select(df, starts_with("post_"))[i]))){
        print ("A")
      } else {print ("B")}
    } 

This gives the error:

Error in if (any(c("Neither agree or disagree") == (select(df, starts_with("post_"))[i]))) { : 
  missing value where TRUE/FALSE needed

Note that the code works correctly if I run it on a single column like this:

if (any(c("Neither agree or disagree","Agree") == df[3])){print ("A")} else {
  print ("B")}

Thanks for any help.
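For readers reproducing this, a likely explanation is that `==` returns NA wherever the column holds NA, and `any()` returns NA (not FALSE) when it sees no TRUE but at least one NA. That would also explain why the single-column call on `df[3]` succeeds: that column happens to contain a match. The sketch below uses small hypothetical vectors (`with_match`, `without_match`) and a hypothetical helper `label_column()` rather than the random data above.

```r
# Hypothetical stand-ins for a column with a match and one without
with_match    <- c("Agree", NA, "Neither agree or disagree")
without_match <- c("<90%", NA, ">90")
target        <- "Neither agree or disagree"

any(target == with_match)                   # TRUE: one TRUE is enough
any(target == without_match)                # NA: no TRUE, and the NA is unknown
any(target == without_match, na.rm = TRUE)  # FALSE: NA values are dropped first

# Hypothetical helper: safe to use inside if() because it never yields NA
label_column <- function(x, target) {
  if (any(x == target, na.rm = TRUE)) "A" else "B"
}

label_column(with_match, target)     # "A"
label_column(without_match, target)  # "B"
```

Passing `na.rm = TRUE` to `any()` inside the original loop should be enough to avoid the "missing value where TRUE/FALSE needed" error.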

3 Answers

Stack Overflow user

Accepted answer

Posted 2018-03-20 10:10:20

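I used @tobiaspk1's suggestion and a rowSums-style check (colSums in the code below) to condition on certain values in each column. The remaining issue is that I would like to include more conditions (all factor levels for each column) so that the function also works in other cases (for example, when the middle category is missing).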

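library(dplyr)    # select(), starts_with(), %>%
library(ggplot2)  # ggplot(), geom_bar(), ...
library(scales)   # percent
# NOTE: dummy() is used below but is not defined anywhere in this post.
dfplot <- function(df,prefix){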
  select(df, starts_with(prefix)) %>% 
    length() %>% 
    seq(1,.,1)  %>% 
    for (i in .){
      if (dummy(as.character(select(df, starts_with(prefix))[[i]])) == FALSE) {
        if (colSums(select(df, starts_with(prefix))[i] == "Agree", na.rm = TRUE) > 0){
          factor(select(df, starts_with(prefix))[[i]], c("Strongly Agree", "Agree", "Neither agree or disagree","Disagree", "Strongly disagree"),ordered = T ) %>% 
            data.frame() %>%
            na.omit() %>%
            ggplot(.,aes(x=.))  +  
            geom_bar(aes(y = (..count..)/sum(..count..)), stat="count") + 
            geom_text(aes( label =paste(round((..count..)/sum(..count..)*100),"%"), y= (..count..)/sum(..count..)), stat= "count", vjust = -.5)+
            scale_y_continuous(labels=percent,limits = c(-0, 1)) + 
            scale_x_discrete(drop=FALSE) + 
            ylab("Relative Frequencies (%)")+
            ggtitle(names(select(df, starts_with(prefix)))[i]) +
            theme_light(base_size = 12) +
            theme(axis.text.x = element_text(angle = 45, hjust = 1)) + 
            theme(plot.title = element_text(hjust = 0.5,size = 10))-> agreeplot
          print(agreeplot)} 
        else if (colSums(select(df, starts_with(prefix))[i] == "51-75%", na.rm = TRUE) > 0) {
          factor(select(df, starts_with(prefix))[[i]], c("1-25%", "26-50%", "51-75%", "75-90%","91-100%"),ordered = T ) %>% 
            data.frame() %>%
            na.omit() %>%
            ggplot(.,aes(x=.))  +  
            geom_bar(aes(y = (..count..)/sum(..count..)), stat="count") + 
            geom_text(aes( label =paste(round((..count..)/sum(..count..)*100),"%"), y= (..count..)/sum(..count..)), stat= "count", vjust = -.5)+
            scale_y_continuous(labels=percent,limits = c(-0, 1)) + 
            scale_x_discrete(drop=FALSE) + 
            ylab("Relative Frequencies (%)")+
            ggtitle(names(select(df, starts_with(prefix)))[i]) +
            theme_light(base_size = 12) +
            theme(axis.text.x = element_text(angle = 45, hjust = 1)) + 
            theme(plot.title = element_text(hjust = 0.5,size = 10))-> numplot
          print(numplot)}
        else if(colSums(select(df, starts_with(prefix))[i] == "Somewhat too easy", na.rm = TRUE) > 0) {
          factor(select(df, starts_with(prefix))[[i]], c("Very easy", "Somewhat too easy", "About right", "Somewhat challenging","Very challenging"),ordered = T ) %>% 
            data.frame() %>%
            na.omit() %>%
            ggplot(.,aes(x=.))  +  
            geom_bar(aes(y = (..count..)/sum(..count..)), stat="count") + 
            geom_text(aes( label =paste(round((..count..)/sum(..count..)*100),"%"), y= (..count..)/sum(..count..)), stat= "count", vjust = -.5)+
            scale_y_continuous(labels=percent,limits = c(-0, 1)) + 
            scale_x_discrete(drop=FALSE) + 
            ylab("Relative Frequencies (%)")+
            ggtitle(names(select(df, starts_with(prefix)))[i]) +
            theme_light(base_size = 12) +
            theme(axis.text.x = element_text(angle = 45, hjust = 1)) + 
            theme(plot.title = element_text(hjust = 0.5,size = 10))-> aboutplot
          print(aboutplot)}
        else if(colSums(select(df, starts_with(prefix))[i] == "Too slow", na.rm = TRUE) > 0) {
          factor(select(df, starts_with(prefix))[[i]], c("Too slow", "Slow", "About right", "Fast","Too fast"),ordered = T ) %>% 
            data.frame() %>%
            na.omit() %>%
            ggplot(.,aes(x=.))  +  
            geom_bar(aes(y = (..count..)/sum(..count..)), stat="count") + 
            geom_text(aes( label =paste(round((..count..)/sum(..count..)*100),"%"), y= (..count..)/sum(..count..)), stat= "count", vjust = -.5)+
            scale_y_continuous(labels=percent,limits = c(-0, 1)) + 
            scale_x_discrete(drop=FALSE) + 
            ylab("Relative Frequencies (%)")+
            ggtitle(names(select(df, starts_with(prefix)))[i]) +
            theme_light(base_size = 12) +
            theme(axis.text.x = element_text(angle = 45, hjust = 1)) + 
            theme(plot.title = element_text(hjust = 0.5,size = 10))-> rightplot
          print(rightplot)}
        else if(colSums(select(df, starts_with(prefix))[i] == "Between 3 and 4 hours", na.rm = TRUE) > 0) {
          factor(select(df, starts_with(prefix))[[i]], c("Less than 2 hours", "Between 2 and 3 hours", "Between 3 and 4 hours", "Between 4 and 5 hours","More than 5 hours"),ordered = T ) %>% 
            data.frame() %>%
            na.omit() %>%
            ggplot(.,aes(x=.))  +  
            geom_bar(aes(y = (..count..)/sum(..count..)), stat="count") + 
            geom_text(aes( label =paste(round((..count..)/sum(..count..)*100),"%"), y= (..count..)/sum(..count..)), stat= "count", vjust = -.5)+            
            scale_y_continuous(labels=percent,limits = c(-0, 1)) + 
            scale_x_discrete(drop=FALSE) + 
            ylab("Relative Frequencies (%)")+
            ggtitle(names(select(df, starts_with(prefix)))[i]) +
            theme_light(base_size = 12) +
            theme(axis.text.x = element_text(angle = 45, hjust = 1)) + 
            theme(plot.title = element_text(hjust = 0.5,size = 10))-> hoursplot
          print(hoursplot)}
        else {data.frame(select(df, starts_with(prefix))[[i]])  %>%
            na.omit() %>%
            ggplot(.,aes(x=.))  +  
            geom_bar(aes(y = (..count..)/sum(..count..)), stat="count") + 
            geom_text(aes( label =paste(round((..count..)/sum(..count..)*100),"%"), y= (..count..)/sum(..count..)), stat= "count", vjust = -.5)+            
            scale_y_continuous(labels=percent,limits = c(-0, 1)) + 
            scale_x_discrete(drop=FALSE) + 
            ylab("Relative Frequencies (%)")+
            ggtitle(names(select(df, starts_with(prefix)))[i]) +
            theme_light(base_size = 12) +
            theme(axis.text.x = element_text(angle = 45, hjust = 1)) + 
            theme(plot.title = element_text(hjust = 0.5,size = 10))-> elseplot
          print(elseplot)}}
      else {data.frame(select(df, starts_with(prefix))[[i]])  %>%
          na.omit() %>%
          ggplot(.,aes(x=.))  +  
          geom_bar(aes(y = (..count..)/sum(..count..)), stat="count") + 
          geom_text(aes( label =paste(round((..count..)/sum(..count..)*100),"%"), y= (..count..)/sum(..count..)), stat= "count", vjust = -.5)+
          scale_y_continuous(labels=percent,limits = c(-0, 1)) + 
          scale_x_discrete(drop=FALSE) + 
          ylab("Relative Frequencies (%)") + 
          ggtitle(names(select(df, starts_with(prefix)))[i]) +
          theme_light(base_size = 12) +
          theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
          theme(plot.title = element_text(hjust = 0.5,size = 10)) -> dummyplot
        print(dummyplot)}
    }  
}
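One possible way to cover "all factor levels for each column", sketched here as an assumption and not as the answerer's method, is to test whether every non-missing value of a column belongs to a known scale instead of checking for a single sentinel value such as "Agree". The names `scales_list` and `pick_scale()` are hypothetical, and only two of the scales from the function above are included to keep the example short.

```r
# Hypothetical lookup table of candidate response scales
scales_list <- list(
  agree = c("Strongly Agree", "Agree", "Neither agree or disagree",
            "Disagree", "Strongly disagree"),
  pace  = c("Too slow", "Slow", "About right", "Fast", "Too fast")
)

# Return the name of the first scale that contains every non-missing value
# of x, or NA if no scale fits (or the column is entirely missing).
pick_scale <- function(x, candidates) {
  values <- unique(na.omit(as.character(x)))
  if (length(values) == 0) return(NA_character_)
  for (nm in names(candidates)) {
    if (all(values %in% candidates[[nm]])) return(nm)
  }
  NA_character_
}

pick_scale(c("Agree", "Disagree", NA), scales_list)  # "agree", middle category absent
pick_scale(c("Fast", "About right"), scales_list)    # "pace"
pick_scale(c("Yes", "No"), scales_list)              # NA
```

With a helper like this, the plotting code would only need to be written once, with the matched level vector passed to `factor()`. Scales that share a label (for example "About right") can be ambiguous, so the order of the candidates matters.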
Votes: 0

Stack Overflow user

Posted 2018-03-19 18:34:10

Avoid loops whenever possible. R's strength is vectorized computation!

Try the following:

results <- rep("B", nrow(df)) # initialise one result per row, with "B" as the default

at_least_one <- rowSums(df == "Strongly Agree", na.rm = TRUE) > 0 # find the rows that contain the word searched at least once
results[at_least_one] <- "A" # change those that contain the word to "A"

You can loop over your values, such as "Strongly Agree" and "Agree", and overwrite the results vector. Hope this helps!
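A minimal sketch of that loop over values, assuming a small hypothetical data frame `toy` in place of the asker's random `df`; the variable names `targets` and `hit` are illustrative only.

```r
# Hypothetical toy data standing in for the "post_" columns
toy <- data.frame(
  post_1 = c("Agree", "Disagree", NA),
  post_2 = c("<90%", "Strongly Agree", NA),
  stringsAsFactors = FALSE
)

targets <- c("Strongly Agree", "Agree")  # values that should yield "A"
results <- rep("B", nrow(toy))           # one result per row, default "B"

for (value in targets) {
  # TRUE for rows where at least one column equals the current value;
  # na.rm = TRUE makes missing cells count as "no match"
  hit <- rowSums(toy == value, na.rm = TRUE) > 0
  results[hit] <- "A"
}

results  # "A" "A" "B"
```

Note that this labels rows. For one label per column, as in the question, `colSums()` can be used in the same way.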

Votes: 0

Stack Overflow user

Posted 2018-03-19 20:14:04

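A somewhat naive approach that uses TRUE/FALSE to index into LETTERS: 1. use grepl to select the columns whose names match the pattern colptrn; 2. convert df to a list; 3. sapply over the list items and check them against the values in i; 4. if there is any match, FALSE + 1 = 1 gives "A"; if there is none, TRUE + 1 = 2 gives "B".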

fu <- function(df, i, colptrn, na.rm = T){
    sapply(as.list(df[grepl(colptrn, colnames(df))]), 
           function(li) LETTERS[1 + !any(i %in% li, na.rm = na.rm)]
           )
    }

## Test 
fu(df, c("Neither agree or disagree", "Agree"), "post_")
post_1 post_2 
   "A"    "B" 
fu(df, c("Neither agree or disagree", "Agree"), ".*")
 pre_1  pre_2 post_1 post_2 
   "B"    "A"    "A"    "B" 
fu(df, c("Neither agree or disagree", "Agree"), "postpostup")
named list()
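The two building blocks of `fu` can be tried in isolation. The sketch below uses hypothetical vectors (`col_values`, `found`): `%in%` returns FALSE, never NA, for a value that is not present, and a logical vector can be turned into a 1-or-2 index into `LETTERS`.

```r
# Hypothetical column containing a missing value
col_values <- c("Agree", NA, "Disagree")

"Agree" %in% col_values                      # TRUE
"Neither agree or disagree" %in% col_values  # FALSE, even though col_values has an NA

# Logical-to-letter trick: !TRUE is FALSE, 1 + FALSE is 1 -> "A";
#                          !FALSE is TRUE, 1 + TRUE is 2 -> "B"
found  <- c(TRUE, FALSE)
labels <- LETTERS[1 + !found]
labels  # "A" "B"
```

Because `%in%` never produces NA, the result is safe to use in `if()` without any extra NA handling.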
Votes: 0
Original page content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/49369804
