Here is my code so far:
pacman::p_load(dplyr, ggplot2, stringr, udpipe, lattice)
gnewsheadlines <- read.csv(file.choose(), stringsAsFactors = F)
udmodel_english <- udpipe_load_model(file = "C:/Users/Palam/Documents/english-ewt-ud-2.5-191206.udpipe")
# Step 2 - Count the total number of headlines by date and plot the results to check
headlinegoogle <- gnewsheadlines %>% filter(date >= "3/31/2022 ", date <= "4/3/2022")
s <- udpipe_annotate(udmodel_english,headlinegoogle$headline)
x <- data.frame(s)
This is the error I get when running udpipe_annotate:
Error in `[.data.table`(out, , `:=`(c("token_id", "token", "lemma", "upos", :
Supplied 10 columns to be assigned an empty list (which may be an empty data.table or data.frame since they are lists too). To delete multiple columns use NULL instead. To add multiple empty list columns, use list(list()).
In addition: Warning message:
In strsplit(x$conllu, "\\n", fixed = TRUE) : input string 1 is invalid UTF-8
Posted on 2022-04-09 15:25:42
It looks like headlinegoogle$headline is not UTF-8 encoded, which udpipe_annotate requires. See https://cran.r-project.org/web/packages/udpipe/vignettes/udpipe-tryitout.html
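A minimal sketch of checking for and converting non-UTF-8 text with base R might look like the following. The `headline` vector is a hypothetical stand-in for headlinegoogle$headline, and the "latin1" source encoding is an assumption; the real file may use a different encoding (for example Windows-1252), so adjust `from` accordingly.

```r
# Hypothetical stand-in for headlinegoogle$headline.
# "\xe9" is the Latin-1 byte for "é", which is not valid UTF-8 on its own.
headline <- c("Caf\xe9 reopens downtown", "Plain ASCII headline")

# validUTF8() reports, element by element, whether the bytes are valid UTF-8.
valid_before <- validUTF8(headline)

# Assumption: the source text is Latin-1. Convert it to UTF-8.
headline_utf8 <- iconv(headline, from = "latin1", to = "UTF-8")

# After conversion every element should be valid UTF-8.
valid_after <- validUTF8(headline_utf8)

print(valid_before)
print(valid_after)
```

With the real data, the converted vector would be passed to udpipe_annotate in place of headlinegoogle$headline. If the encoding of the CSV is known, another option is to declare it when reading the file through the `fileEncoding` argument of read.csv.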
https://stackoverflow.com/questions/71728836