Q: Fastest way to find and replace in a large text file (a single-line or single-string file) in Python

Stack Overflow user

Asked 2021-02-22 07:07:46

Answers: 2 · Views: 155 · Followers: 0 · Votes: 0

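Every time I need to find and replace in a large text file in Python (the file is a single line, i.e. one long string), it is slow and takes a long time to finish. I have an Excel file in which the codes in column "A" appear in the text file and must be replaced with the values in column "B", but there are about a million codes or more. Can anyone recommend the fastest way to do this? Thanks in advance. I have tried both of the approaches below.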

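# first way

import pandas as pd
import re

df = pd.read_excel("rep-codes.xlsx", header=None, index_col=False, dtype=str)
df.columns = ['A', 'B']

# Re-reads and rewrites the whole file once per spreadsheet row.
for index, row in df.iterrows():
    with open('final.txt', 'r') as open_file:
        read_file = open_file.read()
    # Note: the code in column A is treated as a regex pattern here.
    regex = re.compile(row['A'])
    read_file = regex.sub(row['B'], read_file)
    # The with block closes (and flushes) the file before the next read.
    with open('final.txt', 'w') as write_file:
        write_file.write(read_file)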


# 2nd way

import pandas as pd

df = pd.read_excel("rep-codes.xlsx", header=None, index_col=False, dtype=str)
df.columns = ['A', 'B']

# Read the file once and keep the text in memory.
with open("final.txt", "rt") as fin:
    data = fin.read()

# One full scan of the text per spreadsheet row.
for index, row in df.iterrows():
    data = data.replace(row['A'], row['B'])

# Write the result back once at the end.
with open("final.txt", "wt") as fout:
    fout.write(data)
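
Both attempts scan the full text once per code, so the work grows with (number of codes) x (text length). A minimal sketch of a single-pass alternative, assuming the codes are literal strings rather than regex patterns; the function name `replace_codes` and the sample data are illustrative and not part of the original question:

```python
import re

def replace_codes(text, mapping):
    """Replace every key of `mapping` found in `text` in a single pass."""
    if not mapping:
        return text
    # Longest keys first, so a code that is a prefix of another
    # (e.g. "AB" and "ABC") does not shadow the longer one.
    keys = sorted(mapping, key=len, reverse=True)
    # re.escape makes each code match literally, not as a regex pattern.
    pattern = re.compile("|".join(re.escape(k) for k in keys))
    # The callback looks up the replacement for whichever code matched.
    return pattern.sub(lambda m: mapping[m.group(0)], text)

# Hypothetical sample data. With the spreadsheet from the question the
# mapping could be built as dict(zip(df['A'], df['B'])).
mapping = {"AB": "x", "ABC": "y", "Q1": "z"}
result = replace_codes("ABC-AB-Q1-ZZ", mapping)
```

Unlike the loops above, a replacement written by one code is never rewritten by a later code, which may or may not be what is wanted. With a million codes the alternation itself can become slow in `re`; in that case a multi-pattern matcher such as an Aho-Corasick automaton is the usual next step.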

python

pandas

dataframe

replace

Stack Overflow user

Answered 2021-02-23 00:50:13

If the .txt file has only a single column of data, it should be as simple as this:

import pandas as pd

df = pd.read_excel("rep-codes.xlsx", header=None, index_col=False, dtype=str)
df.columns = ['A', 'B']

df['B'].to_csv('final.txt')

If the .txt file has multiple columns, you only need to swap the values of column A for the values of column B:

df = pd.read_excel("rep-codes.xlsx", header=None, index_col=False, dtype=str)
df.columns = ['A', 'B']

txt_df = pd.read_csv('final.txt')
txt_df['A']=df['B']
txt_df.to_csv('final.txt')

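I would also guess that there are other factors that were not mentioned, such as differing column sizes. If so, please let me know what else needs to change.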
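
The assignment `txt_df['A']=df['B']` pairs rows by index, so it only gives the intended result when both files list the codes in the same order. A sketch of a lookup-based variant using only the standard library, assuming the text file is comma-separated with the code in its first column; `remap_first_column` and the sample rows are hypothetical:

```python
import csv
import io

def remap_first_column(src, dst, mapping):
    """Copy CSV rows from src to dst, translating column 0 via mapping."""
    writer = csv.writer(dst, lineterminator="\n")
    for row in csv.reader(src):
        if row:
            # Codes that are not in the mapping are left unchanged.
            row[0] = mapping.get(row[0], row[0])
        writer.writerow(row)

# In-memory stand-ins for final.txt and the output file.
src = io.StringIO("A1,foo\nA2,bar\nZ9,baz\n")
dst = io.StringIO()
remap_first_column(src, dst, {"A1": "B1", "A2": "B2"})
output = dst.getvalue()
```

Each row costs one dictionary lookup, so the running time depends on the number of rows rather than on the number of codes.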

Votes: 0


Original page content provided by Stack Overflow.

Original link: https://stackoverflow.com/questions/66308127
