
Adding an identifier to identical lines in a text file

Stack Overflow user
Asked 2019-07-11 02:51:25
Answers: 3 · Views: 41 · Followers: 0 · Votes: 2

I want to add an identifier to identical consecutive lines in a text file. For example, I have the following input file:

Apple
Apple
Apple
Banana
Banana
Pineapple
Pineapple
Pineapple
Pineapple

I want my output to look like this:

Apple_number_1
Apple_number_2
Apple_number_3
Banana_number_1
Banana_number_2
Pineapple_number_1
Pineapple_number_2
Pineapple_number_3
Pineapple_number_4

I have some code that prints a line if the current line and the previous line are the same:

my_file=open('/Users/Jo/Desktop/for_building.txt')
lines=my_file.readlines()

def lines_equal(curr_line, prev_line, compare_char):
   curr_line_parts = curr_line.split(' ')
   prev_line_parts = prev_line.split(' ')
   for item in zip(curr_line_parts, prev_line_parts):
       if item[0].startswith(compare_char):
           return item[0] == item[1]

results = []
prev_line = lines[0]

for line in lines[1:]:
    results.append(lines_equal(line, prev_line, 'Z'))
    prev_line = line
    print(prev_line)

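How can I add the identifier at the end? I think I would use a while loop, but it gets tricky if the while loop is nested inside the for loop. Is there a clever way to solve this?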


3 Answers

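Stack Overflow user

Answered 2019-07-11 02:58:25

I would use a defaultdict, which holds a count for each line, starting at zero (the default) and incrementing each time the same line is encountered: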

from collections import defaultdict

lineCounts = defaultdict(int)

for line in lines:
    line = line.rstrip('\n')  # readlines() keeps the trailing newline
    lineCounts[line] += 1
    print('{}_number_{}'.format(line, lineCounts[line]))
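
One caveat worth noting: a dictionary keeps a single running count per distinct line for the whole file, so it matches the requested output only when identical lines are always adjacent, as they are in the sample input. The sketch below contrasts the two behaviours on input where `Apple` reappears later. The helper names `number_by_total_count` and `number_by_run` and the sample list are made up for illustration and are not part of the original answer.

```python
from collections import defaultdict
from itertools import groupby


def number_by_total_count(lines):
    """Number each line by how often it has been seen so far in the whole input."""
    counts = defaultdict(int)
    result = []
    for line in lines:
        counts[line] += 1
        result.append('{}_number_{}'.format(line, counts[line]))
    return result


def number_by_run(lines):
    """Number each line by its position within the current run of identical lines."""
    result = []
    for _, run in groupby(lines):
        for index, line in enumerate(run, start=1):
            result.append('{}_number_{}'.format(line, index))
    return result


# Hypothetical input in which 'Apple' shows up again after 'Banana'.
sample = ['Apple', 'Apple', 'Banana', 'Apple']

print(number_by_total_count(sample))
# ['Apple_number_1', 'Apple_number_2', 'Banana_number_1', 'Apple_number_3']
print(number_by_run(sample))
# ['Apple_number_1', 'Apple_number_2', 'Banana_number_1', 'Apple_number_1']
```

Which of the two is wanted depends on whether the numbering should restart for every new run or continue across the whole file; the question's wording ("identical consecutive lines") suggests the former.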
Votes: 4

Stack Overflow user

Answered 2019-07-11 03:45:29

from itertools import groupby

with open("data.txt", "r") as file:
    lines = file.read().splitlines()

groups = [list(group) for _, group in groupby(lines)]

for group in groups:
    for index, fruit in enumerate(group, start=1):
        print(f"{fruit}_number_{index}")

Output:

Apple_number_1
Apple_number_2
Apple_number_3
Banana_number_1
Banana_number_2
Pineapple_number_1
Pineapple_number_2
Pineapple_number_3
Pineapple_number_4
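
If the numbered lines should end up in a file rather than on the screen, the same grouping can be wrapped in a small function. This is only a minimal sketch: the names `number_runs` and `number_file`, and the file names `data.txt` and `numbered.txt`, are assumptions chosen for illustration, and the demo runs inside a temporary directory so that no real file is touched.

```python
import os
import tempfile
from itertools import groupby


def number_runs(lines):
    """Yield each line with '_number_<k>' appended, k being its position in the current run."""
    for _, run in groupby(lines):
        for index, line in enumerate(run, start=1):
            yield '{}_number_{}'.format(line, index)


def number_file(src_path, dst_path):
    """Read src_path, number consecutive identical lines, and write the result to dst_path."""
    with open(src_path) as src:
        lines = src.read().splitlines()
    with open(dst_path, 'w') as dst:
        for numbered in number_runs(lines):
            dst.write(numbered + '\n')


# Demo in a temporary directory (hypothetical file names).
with tempfile.TemporaryDirectory() as tmp:
    src_path = os.path.join(tmp, 'data.txt')
    dst_path = os.path.join(tmp, 'numbered.txt')
    with open(src_path, 'w') as f:
        f.write('Apple\nApple\nBanana\n')
    number_file(src_path, dst_path)
    with open(dst_path) as f:
        print(f.read(), end='')
```

Keeping the numbering in a generator separate from the file handling means the same logic can be reused on any iterable of strings, not only on files.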
Votes: 2

Stack Overflow user

Answered 2019-07-11 03:06:49

A simple iterative approach:

with open('file.txt') as f:
    cnt = 1   # initial counter value
    prev_line = None
    for line in f:
        line = line.strip()   # normalize first, so a final line without '\n' still matches
        if prev_line is not None and line != prev_line: cnt = 1   # resetting counter
        print('{}_number_{}'.format(line, cnt))
        prev_line = line
        cnt += 1

Output:

Apple_number_1
Apple_number_2
Apple_number_3
Banana_number_1
Banana_number_2
Pineapple_number_1
Pineapple_number_2
Pineapple_number_3
Pineapple_number_4
Votes: 1
The original page content is provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/56976962
