
Q: Splitting text data into row records using Python pandas

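Stack Overflow user

Asked 2021-12-15 12:58:39

Answers 2 · Views 210 · Followers 0 · Votes 2

I am new to Python and am trying to split text data and convert it into Excel columns and rows. Suppose I have 100 records, and each record needs to be split by character position: characters 1-7 are the first column, 8-8 the second, 9-10 the third, 11-18 the fourth, 19-24 the fifth, 25-124 the sixth, and 125-1000 the seventh. The sample records below are in text.txt. I want to convert them into an Excel file based on these character positions. Can anyone help?

Sample text format: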

 animals210 redwingsclearmist
 animals220 redwingsclearmist
 animals230 redwingsclearmist
 animals240 redwingsclearmist

Sample output format:

    0        1    2    3      4
 0  animals  210  red  wings  clearmist
 1  animals  210  red  wings  clearmist
 2  animals  210  red  wings  clearmist
 3  animals  210  red  wings  clearmist
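
For readers comparing approaches: fixed-position layouts like the one described above can also be read with `pandas.read_fwf`, which is not used in either answer below. The following is only a sketch under assumptions: the column spans are chosen to fit the sample records (they do not match the 1-7, 8-8, ... ranges stated in the question), the sample lines are assumed to have no leading space, and an in-memory string stands in for text.txt.

```python
import io
import pandas as pd

# Stand-in for text.txt (assumption: records start at the first character).
text = (
    "animals210 redwingsclearmist\n"
    "animals220 redwingsclearmist\n"
)

# 0-based, half-open (start, end) character spans fitted to the sample data.
# For the ranges in the question, "1-7" would become (0, 7), "8-8" (7, 8), etc.
colspecs = [(0, 7), (7, 10), (10, 14), (14, 19), (19, 28)]

# dtype=str keeps "210" as text; read_fwf strips surrounding spaces per field.
df = pd.read_fwf(io.StringIO(text), colspecs=colspecs, header=None, dtype=str)
print(df)

# Writing to Excel needs an engine such as openpyxl to be installed:
# df.to_excel("output.xlsx", index=False, header=False)
```

With a real file, the `io.StringIO(text)` argument would be replaced by the file path.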

python

pandas

numpy

2 Answers

Stack Overflow user

Accepted answer

Posted 2021-12-15 13:17:50

You can combine itertools.tee and zip_longest.

Split function:

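from itertools import tee, zip_longest

def split_by_index(s):
    # Start offset of each field; the last field runs to the end of the string.
    indices = [0, 7, 10, 14, 20]
    # Two iterators over the same offsets; advancing `end` by one makes
    # zip_longest yield (0, 7), (7, 10), (10, 14), (14, 20), (20, None).
    start, end = tee(indices)
    next(end)
    # Join the slices with spaces so they can be split on whitespace later.
    return " ".join([s[i:j] for i, j in zip_longest(start, end)])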

Your data:

import pandas as pd

df = pd.DataFrame()
df["sentence"] = ["animals120 redlivinginjungle",
                  "animals140 redlivinginjungle",
                  "animals160 redlivinginjungle"]


    sentence
0   animals120 redlivinginjungle
1   animals140 redlivinginjungle
2   animals160 redlivinginjungle

Then apply the function to create a new DataFrame:

new_df = df["sentence"].apply(split_by_index).str.split(expand=True)

Output:

print(new_df)

    0       1   2   3       4
0   animals 120 red living  injungle
1   animals 140 red living  injungle
2   animals 160 red living  injungle
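
One caveat with the join-then-split approach above: `str.split(expand=True)` splits on any whitespace, so a field that itself contains spaces would be spread over extra columns. A minimal sketch of a variant that returns the slices as a list and builds the DataFrame directly is shown below. The `indices` parameter and the second sample string (with spaces in its last field) are illustrative assumptions, not part of the original answer.

```python
import pandas as pd

def split_by_index(s, indices=(0, 7, 10, 14, 20)):
    # Pair each offset with the next one; the final slice ends at None,
    # i.e. it runs to the end of the string.
    bounds = zip(indices, list(indices[1:]) + [None])
    # Strip padding spaces from each field but keep internal spaces.
    return [s[i:j].strip() for i, j in bounds]

sentences = pd.Series([
    "animals120 redlivinginjungle",
    "animals140 redlivingin a zoo",  # hypothetical record with spaces in the last field
])

# One list per row becomes one DataFrame row; columns are numbered 0..4.
new_df = pd.DataFrame(sentences.map(split_by_index).tolist())
print(new_df)
```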

Votes 0

Stack Overflow user

Posted 2021-12-15 13:00:51

Use the .str accessor:

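# New column name -> [start, end) character offsets (0-based, end exclusive).
column_splits = {'first': [0, 7], 'second': [7, 10]}

for column, limits in column_splits.items():
    start, end = limits
    # 'your_column' is a placeholder for the column holding the raw text.
    df[column] = df['your_column'].str[start:end]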
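
As a rough end-to-end illustration of the snippet above, the sketch below applies the same `.str` slicing to the sample data from the accepted answer. The column names beyond `first` and `second`, the offsets for them, and the `sentence` column name are assumptions made for this example.

```python
import pandas as pd

df = pd.DataFrame({"sentence": ["animals120 redlivinginjungle",
                                "animals140 redlivinginjungle"]})

# New column name -> (start, end) offsets, 0-based and end-exclusive.
# An end of None means "to the end of the string".
column_splits = {
    "first": (0, 7),
    "second": (7, 10),
    "third": (10, 14),
    "fourth": (14, 20),
    "fifth": (20, None),
}

for column, (start, end) in column_splits.items():
    # Slice every row at once, then drop padding spaces around the field.
    df[column] = df["sentence"].str[start:end].str.strip()

# Keep only the split columns.
result = df.drop(columns="sentence")
print(result)
```

Because each field is sliced independently, spaces inside a field do not affect the other columns.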

Votes 0

Original page content provided by Stack Overflow. Translation support provided by Tencent Cloud's Xiaowei IT-domain engine.

Original link:

https://stackoverflow.com/questions/70364112
