
Return a string from a vector by partially matching a variable's values

Stack Overflow user
Asked on 2018-06-05 02:13:29
Answers: 1 | Views: 48 | Followers: 0 | Votes: 0

I have a vector of strings:

keywords <- c("kw 1", "kw2", "kw3", "kw4", "kw5", "kw6", "kw7", "kw8", 
              "kw 9 kw", "kw10", "kw11", "kw12", "kw13", "kw14", "kw15")

and a data frame with an empty Keyword column:

df <- data.frame("Description" = c("blabla kw10", "blabla kw15","blabla kw 1", 
                                   "blabla kw13", "blabla kw7", "kw2 bla", "kw8 blabla","bla kw11 bla", 
                                   "blabla kw10","blakw 9 kw", "blablakw4", "blakw 1bla"),
                 "Keyword" = NA)

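I need a way to find the string in the keywords vector that partially matches each value of the Description variable, and to return that matching string as the value of the Keyword column in the df data frame.

I need this result: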

df <- data.frame("Description" = c("blabla kw10", "blabla kw15","blabla kw 1", 
                                   "blabla kw13", "blabla kw7", "kw2 bla", "kw8 blabla","bla kw11 bla", 
                                   "blabla kw10","blakw 9 kw", "blablakw4", "blakw 1bla"),
                 "Keyword" = c("kw10", "kw15", "kw 1", "kw13", "kw7", "kw2", "kw8", "kw11", "kw10", "kw 9 kw", "kw4", "kw 1"))

Can you suggest a solution for this?

Edit:

A reproducible example with the keywords2 vector and the df2 data frame:

keywords2 <- c("cartucho", "MOLDE", "FILTRO", "BOMBA", "MOTOR")

df2 <- data.frame("Description" = c("CULATA PARA MOTOR", "BOMBA CENTRIFUGA PARA LIQUIDOS", 
    " CARTUCHO FILTRANTE", "APARATO FILTRO MONITOR", "MOLDES PARA QUESO", 
    "BOMBA PERISTALTICA", "MOLDE CON TAPA Y DESUERADOR", 
    "APARATO FILTRO DE MEMBRANA", "BOMBA DE VACIO"),
              "Keyword" = NA)

Expected result:

df2 <- data.frame("Description" = c("CULATA PARA MOTOR", "BOMBA CENTRIFUGA PARA LIQUIDOS", 
    " CARTUCHO FILTRANTE", "APARATO FILTRO MONITOR", "MOLDES PARA QUESO", 
    "BOMBA PERISTALTICA", "MOLDE CON TAPA Y DESUERADOR", 
    "APARATO FILTRO DE MEMBRANA", "BOMBA DE VACIO"),
              "Keyword" = c("MOTOR", "BOMBA", "cartucho", "FILTRO", "MOLDE", 
                            "BOMBA", "MOLDE", "FILTRO", "BOMBA"))

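1 Answer

Stack Overflow user

Answered on 2018-06-05 02:15:40

We can use str_extract from the stringr package, with the keywords pasted together into a single regex alternation: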

library(stringr)
df$Keyword <- str_extract(df$Description, paste(keywords, collapse='|'))
df$Keyword
#[1] "kw10"    "kw15"    "kw 1"    "kw13"    "kw7"     "kw2"     "kw8"    
#[8] "kw11"    "kw10"    "kw 9 kw" "kw4"     "kw 1"   
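
If stringr is not available, or if the keywords could contain regex metacharacters or overlap (for example a hypothetical "kw1" alongside "kw10"), literal substring matching in base R is one possible alternative. The sketch below is only an illustration and not part of the original answer: match_keyword is a made-up helper name, it tries longer keywords first, and it returns the longest keyword contained in each description rather than the leftmost regex match that str_extract reports.

```r
keywords <- c("kw 1", "kw2", "kw3", "kw4", "kw5", "kw6", "kw7", "kw8",
              "kw 9 kw", "kw10", "kw11", "kw12", "kw13", "kw14", "kw15")

df <- data.frame(
  Description = c("blabla kw10", "blabla kw15", "blabla kw 1",
                  "blabla kw13", "blabla kw7", "kw2 bla", "kw8 blabla",
                  "bla kw11 bla", "blabla kw10", "blakw 9 kw",
                  "blablakw4", "blakw 1bla"),
  Keyword = NA,
  stringsAsFactors = FALSE
)

# Hypothetical helper (not from the original answer): for each description,
# return the longest keyword that occurs in it as a literal substring,
# or NA if no keyword occurs.
match_keyword <- function(descriptions, keywords) {
  # Longer keywords are checked first so they win over shorter ones
  ordered <- keywords[order(nchar(keywords), decreasing = TRUE)]
  vapply(as.character(descriptions), function(d) {
    # fixed = TRUE treats each keyword as plain text, not as a regex
    hits <- ordered[vapply(ordered, function(kw) grepl(kw, d, fixed = TRUE),
                           logical(1))]
    if (length(hits) > 0) hits[[1]] else NA_character_
  }, character(1), USE.NAMES = FALSE)
}

df$Keyword <- match_keyword(df$Description, keywords)
df$Keyword
```

On the sample data each description contains exactly one keyword, so this should give the same Keyword column as the str_extract call above.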

Update

With the new dataset and keywords, convert 'keywords2' to upper case, then paste it together to use as the str_extract pattern:

str_extract(df2$Description, paste(toupper(keywords2), collapse="|"))
#[1] "MOTOR"    "BOMBA"    "CARTUCHO" "FILTRO"   "MOLDE"    "BOMBA"    "MOLDE"   
#[8] "FILTRO"   "BOMBA"   
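
The expected result in the question keeps each keyword's original spelling ("cartucho" in lower case), whereas matching against upper-cased keywords returns "CARTUCHO". One way to recover the original spelling, sketched here in base R as an assumption rather than as the answerer's method, is to match case-insensitively and then look the matched text up in keywords2. The names descriptions2, found and keyword_col are illustrative only.

```r
keywords2 <- c("cartucho", "MOLDE", "FILTRO", "BOMBA", "MOTOR")

descriptions2 <- c("CULATA PARA MOTOR", "BOMBA CENTRIFUGA PARA LIQUIDOS",
                   " CARTUCHO FILTRANTE", "APARATO FILTRO MONITOR",
                   "MOLDES PARA QUESO", "BOMBA PERISTALTICA",
                   "MOLDE CON TAPA Y DESUERADOR",
                   "APARATO FILTRO DE MEMBRANA", "BOMBA DE VACIO")

# Build one alternation pattern and find the first match per description,
# ignoring case
pattern <- paste(keywords2, collapse = "|")
m <- regexpr(pattern, descriptions2, ignore.case = TRUE)

# regexpr() returns -1 where nothing matched; keep those entries as NA
found <- rep(NA_character_, length(descriptions2))
hit <- as.integer(m) > 0
found[hit] <- regmatches(descriptions2, m)

# Map the matched text back to the keyword as originally spelled
keyword_col <- keywords2[match(toupper(found), toupper(keywords2))]
keyword_col
```

This assumes the keywords contain no regex metacharacters and that no two keywords differ only by case.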
Votes: 1
Original page content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/50686493
