I have a tibble with 28 rows:
> al
# A tibble: 28 x 1
lang_name
<chr>
1 Objective-C,Swift,Other
2 Ruby,Shell
3 Ruby,HTML,Shell
4 Java,HTML,Kotlin,Other
5 TypeScript,JavaScript,CSS,Inno Setup,Shell,HTML
6 Vue,JavaScript,CSS,HTML
7 HTML,JavaScript,CSS
8 JavaScript,HTML,CSS,Other
9 NA
10 Vim script,Ruby,Shell,Python,CoffeeScript,Makefile,Other
# ... with 18 more rows
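I got it by subsetting another data frame with al <- gh[,'lang_name']. I want to extract the data from each row and put it all into a single list so that I can find the unique values.
How can I do this?
I tried splitting it with al <- str_split(al, ","), but that returns the following list: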
[[1]]
[1] "c(\"Objective-C" "Swift" "Other\"" " \"Ruby"
[5] "Shell\"" " \"Ruby" "HTML" "Shell\""
[9] " \"Java" "HTML" "Kotlin" "Other\""
[13] " \"TypeScript" "JavaScript" "CSS" "Inno Setup"
[17] "Shell" "HTML\"" " \"Vue" "JavaScript"
[21] "CSS" "HTML\"" " \"HTML" "JavaScript"
[25] "CSS\"" " \"JavaScript" "HTML" "CSS"
[29] "Other\"" " NA" " \"Vim script" "Ruby"
[33] "Shell" "Python" "CoffeeScript" "Makefile"
[37] "Other\"" " \"PHP\"" " \"JavaScript" "TypeScript"
[41] "Other\"" " \"JavaScript" "Other\"" " \"JavaScript"
[45] "CSS" "Shell\"" " \"Ruby" "JavaScript"
[49] "HTML" "Vue" "CSS" "Shell\""
[53] " \"Go" "Assembly" "HTML" "C"
[57] "Shell" "Perl\"" " \"Go" "HCL"
[61] "Other\"" " \"JavaScript\"" " \"C++" "JavaScript"
[65] "Python" "Go" "Shell" "C\""
[69] " \n\"JavaScript" "CSS" "HTML" "Other\""
[73] " \"C++" "Cuda" "C" "CMake"
[77] "Java" "Python" "Other\"" " \"JavaScript"
[81] "GLSL\"" " \"JavaScript" "TypeScript" "CSS\""
[85] " \"Kotlin" "C" "Makefile" "HTML"
[89] "C++" "Java" "Other\"" " \"Java"
[93] "Other\"" " \"Python" "Jupyter Notebook" "C++"
[97] "HTML" "Shell" "JavaScript\"" " \"CSS"
[101] "JavaScript" "HTML" "Other\"" " \"HTML"
[105] "CSS" "JavaScript\")"
And unique(al) just returns the same string.
I also tried putting everything into a single character string:
al <- gh[1,'lang_name']
i = 2
while(i < nrow(gh)) {
al <- paste(al, ",", gh[i+1,'lang_name'])
i = i + 1
}
This produces the following character string:
[1] "Objective-C,Swift,Other , Ruby,HTML,Shell , Java,HTML,Kotlin,Other , TypeScript,JavaScript,CSS,Inno Setup,Shell,HTML , Vue,JavaScript,CSS,HTML , HTML,JavaScript,CSS , JavaScript,HTML,CSS,Other , NA , Vim script,Ruby,Shell,Python,CoffeeScript,Makefile,Other , PHP , JavaScript,TypeScript,Other , JavaScript,Other , JavaScript,CSS,Shell , Ruby,JavaScript,HTML,Vue,CSS,Shell , Go,Assembly,HTML,C,Shell,Perl , Go,HCL,Other , JavaScript , C++,JavaScript,Python,Go,Shell,C , JavaScript,CSS,HTML,Other , C++,Cuda,C,CMake,Java,Python,Other , JavaScript,GLSL , JavaScript,TypeScript,CSS , Kotlin,C,Makefile,HTML,C++,Java,Other , Java,Other , Python,Jupyter Notebook,C++,HTML,Shell,JavaScript , CSS,JavaScript,HTML,Other , HTML,CSS,JavaScript"
I don't know how to turn this into something I can run unique on.
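Answered 2018-06-04 09:08:42
If you like tidyverse/purrr functions, you can do this in a single piped step. stringr::str_split is a convenience wrapper around stringi::stri_split. purrr::reduce lets you apply a function (in this case c) repeatedly until the whole list of vectors returned by str_split has been reduced to a single character vector. For a task like this, unlist from base R would also work well in place of reduce. I have very purrr-focused habits, but that doesn't need to be the default for a simple task.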
library(tidyverse)
al$lang_name %>%
str_split(",") %>%
reduce(c) %>%
unique()
#> [1] "Objective-C" "Swift" "Other" "Ruby"
#> [5] "Shell" "HTML" "Java" "Kotlin"
#> [9] "TypeScript" "JavaScript" "CSS" "Inno Setup"
#> [13] "Vue" NA "Vim script" "Python"
#> [17] "CoffeeScript" "Makefile"
Created on 2018-06-03 by the reprex package (v0.2.0).
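The point about unlist being interchangeable with reduce(c) can be checked with a small base R sketch. This is only an illustration under stated assumptions: the lang_name vector below is a toy stand-in for al$lang_name, base R's Reduce(c, ...) is used in place of purrr::reduce(c) so that no packages are needed, and the last step of dropping NA is an optional extra that the answer above does not perform.

```r
# Toy input standing in for al$lang_name (values borrowed from the question).
lang_name <- c("Ruby,Shell", "Ruby,HTML,Shell", NA, "HTML,JavaScript,CSS")

# strsplit() returns a list with one character vector per row;
# an NA input row stays as a single NA element.
parts <- strsplit(lang_name, ",", fixed = TRUE)

# Base R's Reduce(c, ...) stands in for purrr::reduce(c) here.
via_reduce <- Reduce(c, parts)
via_unlist <- unlist(parts)

# Both approaches flatten the list into the same character vector.
same <- identical(via_reduce, via_unlist)

# Unique values, with the NA entry dropped (optional extra step).
langs <- unique(via_unlist)
langs_no_na <- langs[!is.na(langs)]
print(langs_no_na)
```

For a plain list of character vectors like this one, unlist is the shorter choice; reduce(c) becomes more interesting when the combining function is something other than c.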
Answered 2018-06-04 07:07:21
I hope this gives you what you want:
library(tibble)
al <- tibble(lang_name=
c("Objective-C,Swift,Other",
"Ruby,Shell",
"Ruby,HTML,Shell",
"Java,HTML,Kotlin,Other",
"TypeScript,JavaScript,CSS,Inno Setup,Shell,HTML",
"Vue,JavaScript,CSS,HTML",
"HTML,JavaScript,CSS",
"JavaScript,HTML,CSS,Other",
NA,
"Vim script,Ruby,Shell,Python,CoffeeScript,Makefile,Other"))
l1 <- strsplit(al$lang_name,",")
l1
# [[1]]
# [1] "Objective-C" "Swift" "Other"
#
# [[2]]
# [1] "Ruby" "Shell"
#
# [[3]]
# [1] "Ruby" "HTML" "Shell"
#
# [[4]]
# [1] "Java" "HTML" "Kotlin" "Other"
#
# [[5]]
# [1] "TypeScript" "JavaScript" "CSS" "Inno Setup" "Shell" "HTML"
#
# [[6]]
# [1] "Vue" "JavaScript" "CSS" "HTML"
#
# [[7]]
# [1] "HTML" "JavaScript" "CSS"
#
# [[8]]
# [1] "JavaScript" "HTML" "CSS" "Other"
#
# [[9]]
# [1] NA
#
# [[10]]
# [1] "Vim script" "Ruby" "Shell" "Python" "CoffeeScript" "Makefile" "Other"
l2 <- unlist(l1)
l2
# [1] "Objective-C" "Swift" "Other" "Ruby" "Shell" "Ruby" "HTML" "Shell"
# [9] "Java" "HTML" "Kotlin" "Other" "TypeScript" "JavaScript" "CSS" "Inno Setup"
# [17] "Shell" "HTML" "Vue" "JavaScript" "CSS" "HTML" "HTML" "JavaScript"
# [25] "CSS" "JavaScript" "HTML" "CSS" "Other" NA "Vim script" "Ruby"
# [33] "Shell" "Python" "CoffeeScript" "Makefile" "Other"
l3 <- unique(l2)
l3
# [1] "Objective-C" "Swift" "Other" "Ruby" "Shell" "HTML" "Java" "Kotlin"
# [9] "TypeScript" "JavaScript" "CSS" "Inno Setup" "Vue" NA "Vim script" "Python"
# [17] "CoffeeScript" "Makefile"
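As a possible explanation for the c(\"Objective-C ... output in the question: the split there was applied to the whole tibble rather than to its lang_name column, so the column appears to have been coerced into one deparsed string before being split. The sketch below illustrates that difference with a plain data.frame as a stand-in for the tibble and base R's as.character in place of whatever coercion str_split performs internally; both substitutions are assumptions made to keep the example free of package dependencies.

```r
# A two-row data.frame standing in for the tibble `al` from the question.
df <- data.frame(lang_name = c("Ruby,Shell", "Java,HTML"),
                 stringsAsFactors = FALSE)

# Coercing the whole data frame to character deparses each column,
# giving ONE string that starts with 'c(' and contains escaped quotes.
whole <- as.character(df)
print(whole)
print(length(whole))

# Extracting the column first gives an ordinary character vector,
# so the split yields one list element per row.
per_row <- strsplit(df$lang_name, ",", fixed = TRUE)
print(length(per_row))

# Flatten and deduplicate, as in the answer above.
langs <- unique(unlist(per_row))
print(langs)
```

This is why both answers split al$lang_name rather than al itself.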
https://stackoverflow.com/questions/50670770