How to import a txt file into R and split the text into multiple columns based on specific conditions
Stack Overflow user
Asked on 2021-01-28 05:47:34

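I have some job descriptions saved in txt format. Job titles, job descriptions, and so on are all lumped together, and I am trying to separate them into columns. The text is about 5 pages long. Here is an example of the text structure: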

EXECUTIVE LEVEL
001 Chief Executive Officer: Job description of CEO.
040 Area Director: This line contains job description of the Area Director.

FINANCE TEAM
025 Chief Operating Officer: This line contains job description of the Chief Operating Officer
055 Chief Financial Officer: This person controls operations of the company and reports to the COO

MARKETING TEAM
056 Marketing Director: This person is in charge of the marketing team. Blab la bla

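I want to create a data frame (or is it called a tibble now?) with 4 columns:

Column 1: team name (EXECUTIVE LEVEL, FINANCE TEAM, MARKETING TEAM, etc.)

Column 2: number (001, 040, 025, 055, etc.)

Column 3: job title (Chief Executive Officer, Chief Operating Officer, etc.)

Column 4: job description

Thanks in advance.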

Stack Overflow user

Answered on 2021-01-28 06:25:33

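# Drop the empty separator lines
x2 <- x[nzchar(x)]
# Start a new group at every line that begins with a capital letter (a team header)
x3 <- split(x2, cumsum(grepl("^[A-Z]", x2)))
# Within each group, parse "<num> <title>: <desc>" and attach the header line as `name`
x4 <- lapply(x3, function(z) transform(strcapture("^([0-9]+)\\s+([^:]+):\\s*(.*)$", z[-1], list(num="", title="", desc="")), name=z[1]))
# Stack the per-team data frames into one
x5 <- do.call(rbind, x4)
x5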
#     num                   title                                                                  desc            name
# 1.1 001 Chief Executive Officer                                               Job description of CEO. EXECUTIVE LEVEL
# 1.2 040           Area Director              This line contains job description of the Area Director. EXECUTIVE LEVEL
# 2.1 025 Chief Operating Officer     This line contains job description of the Chief Operating Officer    FINANCE TEAM
# 2.2 055 Chief Financial Officer This person controls operations of the company and reports to the COO    FINANCE TEAM
# 3   056      Marketing Director           This person is in charge of the marketing team. Blab la bla  MARKETING TEAM
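
The answer comes without commentary, so here is one reading of it as a sketch. It assumes that team headers are the only lines starting with a capital letter and that every job line has the shape "number title: description". The column names (team, number, title, description), the column order, and the row-name reset are additions to match the four columns the question asks for; they are not part of the original answer.

```r
# Sample input, copied from the question
x <- c("EXECUTIVE LEVEL",
       "001 Chief Executive Officer: Job description of CEO.",
       "040 Area Director: This line contains job description of the Area Director.",
       "",
       "FINANCE TEAM",
       "025 Chief Operating Officer: This line contains job description of the Chief Operating Officer",
       "055 Chief Financial Officer: This person controls operations of the company and reports to the COO",
       "",
       "MARKETING TEAM",
       "056 Marketing Director: This person is in charge of the marketing team. Blab la bla")

# 1. Drop the empty separator lines
x2 <- x[nzchar(x)]

# 2. Start a new group at each header line (assumed: begins with a capital letter)
groups <- split(x2, cumsum(grepl("^[A-Z]", x2)))

# 3. Parse the job lines of each group and put the team name first
#    (column names here are illustrative assumptions)
proto <- data.frame(number = character(), title = character(),
                    description = character(), stringsAsFactors = FALSE)
parsed <- lapply(groups, function(z) {
  fields <- strcapture("^([0-9]+)\\s+([^:]+):\\s*(.*)$", z[-1], proto)
  data.frame(team = z[1], fields, stringsAsFactors = FALSE)
})

# 4. Stack the pieces and replace row names like "1.1" with plain 1..n
jobs <- do.call(rbind, parsed)
rownames(jobs) <- NULL
```

Keeping `number` as character preserves the leading zeros. Job lines that do not match the pattern come back as rows of `NA`, which makes malformed lines easy to spot.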

Data, most likely the result of x <- readLines(path_to_file):

x <- c("EXECUTIVE LEVEL", "001 Chief Executive Officer: Job description of CEO.", "040 Area Director: This line contains job description of the Area Director.", "", "FINANCE TEAM", "025 Chief Operating Officer: This line contains job description of the Chief Operating Officer", "055 Chief Financial Officer: This person controls operations of the company and reports to the COO", "", "MARKETING TEAM", "056 Marketing Director: This person is in charge of the marketing team. Blab la bla")
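
For the import step itself, a minimal sketch follows. It assumes the file holds one record per line, as in the sample; a temporary file stands in for the real one, and the UTF-8 encoding and the whitespace trimming are assumptions, not something the answer specifies.

```r
# Hypothetical stand-in for the real file: write a few sample lines to a temp file
path_to_file <- tempfile(fileext = ".txt")
writeLines(c("EXECUTIVE LEVEL",
             "001 Chief Executive Officer: Job description of CEO.",
             "",
             "MARKETING TEAM",
             "056 Marketing Director: This person is in charge of the marketing team."),
           path_to_file)

# Read the file as a character vector, one element per line
x <- readLines(path_to_file, encoding = "UTF-8")

# Strip stray leading/trailing whitespace so the "^[A-Z]" and "^[0-9]+" tests
# in the parsing step see the first real character of each line
x <- trimws(x)
```

Blank lines arrive as empty strings, which is what the `nzchar()` filter in the answer removes.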
Original page content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/65927895
