Q: Skip 2 lines with a list comprehension

Stack Overflow user

Asked 2017-08-07 12:03:23

Answers: 2, Views: 68, Followers: 0, Score: 2

I'm trying to use a list comprehension to sort data from a very large file. The file is structured like this:

THING
info1
info2
info3
THING
info1
info2
info3

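... and so on.

Basically, I'm trying to collect every info1 into one list and every info2 into another. I have an earlier script that does this, but it is slow. I'm also trying to make it object-oriented so I can work with the data more efficiently.

Old script: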

import re

info1_data = []
info2_data = []
with open(myfile) as f:
    for line in f:
        if re.search('THING',line):
            line=next(f)
            info1_data.append(line)
            line=next(f)
            info2_data.append(line)

New script:

def __init__(self, file):
    self.file = file

def sort_info1(self):
    with self.file as f:
        info1_data = [next(f) for line in f if re.search('THING',line)]
    return info1_data

def sort_info2(self):
    with self.file as f:
        info2_data = [next(f).next(f) for line in f if re.search('THING',line)]
    return info2_data

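The new script works for getting info1_data as a list. However, to get info2_data I can't find any way to skip 2 lines with this approach. I guessed at next(f).next(f). It runs but produces nothing.

Is this possible?

Many thanks.

With Moses' help I arrived at the solution below. islice is very confusing, though, and I don't fully understand it even after reading the Python docs. Does the iterable supply the data (i.e. info1 or info2), or do start, stop and step decide which data is extracted?

islice(iterable, start, stop, step)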

from itertools import islice
import re

class SomeClass(object):
    def __init__(self, file):
        self.file = file

    def search(self, word, i):
        self.file.seek(0) # seek to start of file
        for line in self.file:
            if re.search(word, line) and i == 0:
                line = next(self.file)
                yield line
            elif re.search(word, line) and i == 1:
                line = next(self.file)
                line = next(self.file)
                yield line

    def sort_info1(self):
        return list(islice(self.search('THING',0), 0, None, 2))

    def sort_info2(self):
        return list(islice(self.search('THING',1), 2, None, 2))


info1 = SomeClass(open("test.dat")).sort_info1()
info2 = SomeClass(open("test.dat")).sort_info2()

python

list-comprehension

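2 Answers

Stack Overflow user

Accepted answer

Answered 2017-08-07 12:20:25

You should seek the file back to the start so that each search begins again from the beginning of the file. You can also use a generator function to separate the search from the production of the data. Then use itertools.islice to stride over the lines: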

from itertools import islice
import re

class SomeClass(object):
    def __init__(self, file):
        self.file = file

    def search(self, word):
        self.file.seek(0) # seek to start of file
        for line in self.file:
            if re.search(word, line):
                # yield next two lines
                yield next(self.file)
                yield next(self.file)

    def sort_info1(self):
        return list(islice(self.search('THING'), 0, None, 2))

    def sort_info2(self):
        return list(islice(self.search('THING'), 1, None, 2))
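
To make the role of each islice argument concrete, here is a minimal sketch. The list named stream is a made-up stand-in for what search('THING') yields (it is not from the original post). The iterable supplies the items; start, stop and step only choose which positions of that stream are kept.

```python
from itertools import islice

# Hypothetical stand-in for the generator's output: the two lines after
# each THING, interleaved as info1, info2, info1, info2, ...
stream = ["a1", "a2", "b1", "b2", "c1", "c2"]

# start=0, stop=None (run to the end), step=2 keeps positions 0, 2, 4
info1 = list(islice(stream, 0, None, 2))

# start=1, stop=None, step=2 keeps positions 1, 3, 5
info2 = list(islice(stream, 1, None, 2))

print(info1)
print(info2)
```

So in sort_info1 and sort_info2 the same generator is used both times, and only the start offset differs.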

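However, I'd suggest passing the path to the file rather than the file object itself, so that the file can be closed after each use and does not hold resources when they are not (or not yet) needed.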
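
A minimal sketch of that suggestion, assuming Python 3. The class name ThingFile and the path attribute are hypothetical, not from the original post. It also swaps the two next() calls for islice(f, 2), so a file that ends right after a THING line just stops early instead of raising inside the generator.

```python
import re
from itertools import islice


class ThingFile(object):  # hypothetical name, for illustration only
    def __init__(self, path):
        # Store the path, not an open file object
        self.path = path

    def search(self, word):
        # The file is opened per search and closed once the generator
        # is exhausted, so no seek(0) is needed
        with open(self.path) as f:
            for line in f:
                if re.search(word, line):
                    # Yield the next two lines (fewer if the file ends early)
                    yield from islice(f, 2)

    def sort_info1(self):
        return list(islice(self.search('THING'), 0, None, 2))

    def sort_info2(self):
        return list(islice(self.search('THING'), 1, None, 2))
```

Usage would then look like ThingFile("test.dat").sort_info1(), and both methods can be called on the same instance in any order.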

Score 2

Stack Overflow user

Answered 2017-08-07 12:43:57

You could do this:

def sort_info2(self):
    with self.file as f:
        info2_data = [(next(f),next(f))[1] for line in f if re.search('THING',line)]
    return info2_data

But it looks a bit odd!
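
One way to get the same effect without the tuple-indexing trick is a small helper generator. This is only a sketch: the name nth_after_matches and its parameters are hypothetical, and it assumes the input is any iterable of lines, such as an open file.

```python
import re
from itertools import islice


def nth_after_matches(lines, pattern, n):
    """Yield the n-th line following each line that matches pattern."""
    it = iter(lines)
    for line in it:
        if re.search(pattern, line):
            # Consume n lines from the same iterator and keep the last one
            picked = list(islice(it, n))
            if len(picked) == n:
                yield picked[-1]


sample = ["THING\n", "a1\n", "a2\n", "a3\n", "THING\n", "b1\n", "b2\n", "b3\n"]
info2_data = list(nth_after_matches(sample, 'THING', 2))
```

With n=1 the same helper collects the info1 lines, so the two methods differ only in that argument.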

Score 1

Original page content provided by Stack Overflow.

Original link:

https://stackoverflow.com/questions/45546457
