Q: Python: split a string at the next period or punctuation mark

Stack Overflow user

Asked 2019-05-21 02:52:18

Answers: 4 | Views: 571 | Followers: 0 | Votes: 1

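I want to split a string after every 8 words. If the 8th word does not end with a period or exclamation mark (. or !), the split should move on to the next word that does.

I can already split the string into groups of words.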

with open("file.txt") as c:
    for line in c:
        text = line.split()
        n = 8
        listword = [' '.join(text[i:i+n]) for i in range(0,len(text),n)]
        for lsb in listword:
            print(lsb)

The expected output is

I'm going to the mall for breakfast, Please meet me there for lunch. 
The duration of the next. He figured I was only joking!
I brought back the time.

This is what I get

I'm going to the mall for breakfast, Please
meet me there for lunch. The duration of 
the next. He figured I was only joking!
I brought back the time.

python

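4 Answers

Stack Overflow user

Accepted answer

Posted 2019-05-21 03:19:07

You are adding line breaks to a sequence of words. The primary condition for a line break is that the last word ends with . or !. There is also a secondary condition on minimum length (8 words or more). The code below collects words in a buffer until the conditions for printing a line are met.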

with open("file.txt") as c:
    out = []
    for line in c:
        for word in line.split():
            out.append(word)
            if word.endswith(('.', '!')) and len(out) >= 8:
                print(' '.join(out))
                out.clear()
    # don't forget to flush the buffer
    if out:
        print(' '.join(out))
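To make the buffering idea easier to check against the expected output in the question, here is a minimal sketch that wraps the same logic in a function operating on a string instead of a file. The function name `split_sentences` and its parameters `min_words` and `enders` are illustrative assumptions, not part of the original answer.

```python
def split_sentences(text, min_words=8, enders=('.', '!')):
    """Group words into lines of at least min_words that end on a sentence ender."""
    lines, buffer = [], []
    for word in text.split():
        buffer.append(word)
        # break only when the line is long enough AND the word closes a sentence
        if len(buffer) >= min_words and word.endswith(enders):
            lines.append(' '.join(buffer))
            buffer = []
    if buffer:  # keep trailing words that never met both conditions
        lines.append(' '.join(buffer))
    return lines


# sample text taken from the question
sample = ("I'm going to the mall for breakfast, Please meet me there for lunch. "
          "The duration of the next. He figured I was only joking! "
          "I brought back the time.")

for line in split_sentences(sample):
    print(line)
```

Returning a list rather than printing directly keeps the splitting logic separate from the output, which makes it straightforward to test.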

Votes: 1

Stack Overflow user

Posted 2019-05-21 03:29:27

It looks like you are not telling your code to look for . or !, only to break the text into chunks of 8 words. Here is one solution:

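buffer = []
output = []

with open("file.txt") as c:
    # a file object has no split(); read the text first, then split on whitespace
    for word in c.read().split():
        buffer.append(word)
        # parentheses are needed because `and` binds more tightly than `or`
        if ('!' in word or '.' in word) and len(buffer) > 7:
            output.append(' '.join(buffer))
            buffer = []

# keep any trailing words that never met the condition
if buffer:
    output.append(' '.join(buffer))

print(output)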

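It takes a list of words, split on spaces. It adds words to buffer until your condition is met (the word contains punctuation and the buffer holds more than 7 words). It then appends that buffer to your output and clears the buffer.

I don't know how your file is structured, so I tested this with the text as one long string of sentences. You may need to adjust your input so that it arrives in the form this code expects.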
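A condition that mixes `or` and `and` is easy to get wrong, because Python evaluates `and` before `or`. The following small sketch, using made-up values for `word` and `buffer`, shows how the same test behaves with and without explicit parentheses.

```python
word = "done!"
buffer = ["done!"]  # only one word collected so far

# Without parentheses Python parses this as:
#   ('!' in word) or ('.' in word and len(buffer) > 7)
unparenthesized = '!' in word or '.' in word and len(buffer) > 7

# With parentheses the length check applies to both punctuation marks
parenthesized = ('!' in word or '.' in word) and len(buffer) > 7

print(unparenthesized)  # True: the "!" alone satisfies the test
print(parenthesized)    # False: the buffer is still too short
```

Without the parentheses, any word containing "!" would trigger a break regardless of how many words are in the buffer.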

Votes: 1

Stack Overflow user

Posted 2019-05-21 03:18:49

I'm not sure how to do this with a list comprehension, but you can try doing it with a regular for loop.

n = 8
listword = []  # created once, so results from every line of the file are kept

with open("file.txt") as c:
    for line in c:
        text = line.split()
        temp = []
        for val in text:
            temp.append(val)
            # break once the group has at least n words and ends a sentence
            if len(temp) >= n and val.endswith(('!', '.')):
                listword.append(' '.join(temp))
                temp = []
        if temp:  # leftover words at the end of this line are kept as their own group
            listword.append(' '.join(temp))

for lsb in listword:
    print(lsb)
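For readers who do want a comprehension, one possible route is to let a regular expression do the grouping. This is only a sketch and is not taken from any of the answers: the name `split_with_regex` and the pattern are assumptions, and unlike the loop above it treats the whole text as one stream rather than working line by line.

```python
import re

# First alternative: at least 7 whole words (as few as possible), followed by
# a word that ends in "." or "!". Second alternative: whatever text is left
# over at the end when no such word remains.
PATTERN = re.compile(r'(?:\S+\s+){7,}?\S*[.!](?=\s|$)|\S.*\Z', re.DOTALL)


def split_with_regex(text):
    # split()/join collapses runs of whitespace (including newlines) to single spaces
    return [' '.join(chunk.split()) for chunk in PATTERN.findall(text)]


# sample text taken from the question
sample = ("I'm going to the mall for breakfast, Please meet me there for lunch. "
          "The duration of the next. He figured I was only joking! "
          "I brought back the time.")

for line in split_with_regex(sample):
    print(line)
```

The explicit loop is arguably easier to read and adjust; the regular expression mainly trades readability for brevity.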

Votes: 0

Original page content provided by Stack Overflow. Translation support provided by the Tencent Cloud Xiaowei IT-domain engine.

Original link:

https://stackoverflow.com/questions/56226602
