I have the following:
x <- c("Sao Paulo - Paulista - SP", "Minas Gerais - Mineiro - MG", "Rio de Janeiro - Carioca -RJ")
I want to keep "Paulista", "Mineiro", and "Carioca".
I tried something like
y <- gsub("\\$-*","",x)
but it does not work.
Posted on 2018-11-03 15:41:12
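There are two quick approaches:
x <- c(" Sao Paulo - Paulista - SP", "Minas Gerais - Mineiro - MG", "Rio de Janeiro - Carioca -RJ")
The first is a standard sub solution; if a string does not match the pattern (for example, it has no hyphens), sub returns that string in full, unmodified.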
trimws(sub("^[^-]*-([^-]*)-.*$", "\\1", x))
# [1] "Paulista" "Mineiro" "Carioca"
Inside sub, the pattern breaks down as follows:
"^[^-]*-([^-]*)-.*$"
^ beginning of each string, avoids mid-string matches
[^-]* matches 0 or more non-hyphen characters
- literal hyphen
([^-]*) matches and stores 0 or more non-hyphen characters
- literal hyphen
.* 0 or more of anything (incl hyphens)
$ end of each string
"\\1" replace everything that matches with the stored substring
The next method splits each string on "-" into a list, then indexes the second element. If any string has no hyphen, this fails with a "subscript out of bounds" error.
trimws(sapply(strsplit(x, "-"), `[[`, 2))
# [1] "Paulista" "Mineiro" "Carioca"
An example call to strsplit:
strsplit(x[[1]], "-")
# [[1]]
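# [1] " Sao Paulo " " Paulista " " SP"
So the second element is Paulista (with extra leading/trailing whitespace). The surrounding sapply always grabs the second element, which is what raises the error when a string has no hyphen.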
Both solutions use trimws to remove leading and trailing whitespace.
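To make the difference in failure behavior concrete, here is a minimal sketch that runs both methods on an input containing a string without hyphens. The vector x2 and the names res_sub, res_err, and res_na are made up for illustration, and the final variant using `[` instead of `[[` is an assumption of mine rather than part of the answer above.

```r
# Hypothetical input: the second string contains no hyphen at all
x2 <- c("Sao Paulo - Paulista - SP", "no hyphen here")

# Method 1: sub() leaves a non-matching string untouched
res_sub <- trimws(sub("^[^-]*-([^-]*)-.*$", "\\1", x2))
res_sub
# [1] "Paulista"       "no hyphen here"

# Method 2 with `[[`: asking for element 2 of a length-1 vector is an error;
# tryCatch() captures the error message instead of stopping the script
res_err <- tryCatch(
  trimws(sapply(strsplit(x2, "-"), `[[`, 2)),
  error = function(e) conditionMessage(e)
)
res_err

# Variant (not from the answer above): `[` yields NA rather than an error
res_na <- trimws(sapply(strsplit(x2, "-"), `[`, 2))
res_na
# [1] "Paulista" NA
```

Which behavior is preferable depends on the data: an unmodified string can silently slip through later steps, an error stops immediately, and NA marks the bad rows so they can be filtered with is.na.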
https://stackoverflow.com/questions/53132839