
Q: Python: go to a line in a text file without reading the preceding lines

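Stack Overflow user

Asked 2015-06-30 03:35:40

Answers 4 · Views 758 · Followers 0 · Votes 2

I am working with a very large text file (TSV) with about 200 million entries. One of the columns is a date, and the records are sorted by date. I want to start reading at the records for a given date. At the moment I just read from the beginning, which is very slow because I have to read through roughly 100 to 150 million records before I reach the one I want. I was thinking that with binary search I could get there in at most 28 extra record reads (log2 of 200 million). Does Python allow reading the nth line without caching or reading the lines before it?

python

4 Answers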

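Stack Overflow user

Answered 2015-06-30 03:43:39

If the lines are not fixed-length, you are out of luck: something has to read through the file to find where line n starts. If the lines are fixed-length, you can open the file, call file.seek(line*linesize), and then read from there.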
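For the sorted file in the question there is a common workaround that does not need line numbers at all: binary search over byte offsets instead of line indexes. The sketch below is not from the answer above. It assumes the date is stored in a sortable form such as ISO `YYYY-MM-DD` (so that comparing raw bytes matches date order), that the file is sorted ascending on that column, and that the column index is known; `find_first_offset`, `key_of`, and `col` are hypothetical names.

```python
import os

def key_of(line, col):
    """Extract the date field (as bytes) from one raw TSV line."""
    return line.rstrip(b"\r\n").split(b"\t")[col]

def find_first_offset(path, target, col=0):
    """Return the byte offset of the first line whose date field is >= target.

    Assumes the file is sorted ascending on that field and that the field
    compares correctly as bytes (e.g. ISO dates). Returns the file size if
    every line is smaller than target.
    """
    with open(path, "rb") as f:
        lo, hi = 0, os.path.getsize(path)
        while lo < hi:
            mid = (lo + hi) // 2
            f.seek(mid)
            f.readline()         # discard the (possibly partial) line containing mid
            line = f.readline()  # first line that starts after mid
            if line and key_of(line, col) < target:
                lo = mid + 1     # the answer lies further into the file
            else:
                hi = mid
        # lo is now the start of the last line below target (or 0).
        f.seek(lo)
        line = f.readline()
        if line and key_of(line, col) < target:
            return f.tell()
        return lo
```

To use it, call something like `find_first_offset("data.tsv", b"2015-06-30")`, then `seek()` to the returned offset on a file opened in binary mode and read forward. The number of probes grows with log2 of the file size in bytes rather than the number of lines, so it is somewhat more than the 28 estimated in the question, but still only a few dozen seeks.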

Votes 2

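Stack Overflow user

Answered 2015-06-30 03:41:35

If the file is large and you do not want to load the whole thing into memory at once:

with open("file") as fp:
    for i, line in enumerate(fp):
        if i == 25:
            print(line, end="")  # 26th line
        elif i == 29:
            print(line, end="")  # 30th line
        elif i > 29:
            break  # stop once the lines of interest have been read

Note that i == n-1 for the nth line. This still reads every line before the ones you want; it only avoids holding them all in memory.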
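If the same file is queried more than once, a single sequential pass like the one above can also record where each line starts, so that later lookups can seek straight to line n. This is a minimal sketch of that idea, not part of the original answer; `build_line_index` and `read_line` are hypothetical helper names, and UTF-8 encoding is assumed.

```python
from array import array

def build_line_index(path):
    """Scan the file once and return the byte offset at which each line starts."""
    offsets = array("Q")  # compact unsigned 64-bit integers
    pos = 0
    with open(path, "rb") as f:
        for line in f:
            offsets.append(pos)
            pos += len(line)
    return offsets

def read_line(path, offsets, n):
    """Return line n (0-based) without reading the lines before it."""
    with open(path, "rb") as f:
        f.seek(offsets[n])
        return f.readline().decode("utf-8").rstrip("\r\n")
```

The index costs 8 bytes per line, so for roughly 200 million lines it would take on the order of 1.6 GB; whether that is acceptable depends on the machine, and the index could be saved to disk and reused.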

Votes 0

Stack Overflow user

Answered 2015-06-30 04:13:32

You can use the fileObject.seek(offset[, whence]) method.

#offset -- The position of the read/write pointer within the file.

#whence -- Optional; defaults to 0, which means absolute file positioning. The other values are 1, which seeks relative to the current position, and 2, which seeks relative to the end of the file.
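The three `whence` modes can be illustrated with a small in-memory example. This is only a sketch: `io.BytesIO` stands in for a file opened in binary mode, and the `os.SEEK_SET`, `os.SEEK_CUR`, and `os.SEEK_END` constants are the named equivalents of 0, 1, and 2.

```python
import io
import os

buf = io.BytesIO(b"0123456789")  # in-memory stand-in for open(path, "rb")

buf.seek(4, os.SEEK_SET)   # whence=0: absolute position 4
a = buf.read(2)            # b"45", position is now 6

buf.seek(-3, os.SEEK_CUR)  # whence=1: 3 bytes back from the current position
b = buf.read(2)            # b"34"

buf.seek(-2, os.SEEK_END)  # whence=2: 2 bytes before the end
c = buf.read()             # b"89"

print(a, b, c)
```

Relative seeks like these are reliable in binary mode. For files opened in text mode, Python 3 restricts seeking to offsets previously returned by `tell()` (plus seeking to the start or end), so arbitrary byte arithmetic is best done on a file opened with "rb".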


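with open("test.txt", "rb") as file:  # binary mode so seek() takes plain byte offsets
    line_size = 8  # 6 digits plus the line ending (8 bytes assumes a 2-byte "\r\n")
    line_number = 5
    file.seek(line_number * line_size, 0)  # jump to the start of line 5 (0-based)
    for i in range(5):
        print(file.readline().decode().rstrip("\r\n"))

For this code, I used the following file:

Votes 0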

Original page content provided by Stack Overflow. Translation support provided by the Tencent Cloud Xiaowei IT-domain engine.

Original link:

https://stackoverflow.com/questions/31124088
