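I have been trying to write a program that reads in a file, finds the unique words and punctuation marks, puts them in a list, then gets the position of each word and stores those positions in a second list. The program then uses the two lists to recreate the file. Here is my code: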
import time
import re
words = open('words.txt')
sentence = words.read()
uniquewords = []
positions = []
punctuation = re.findall(r"[\w']+|[.,!?;]", sentence)
for word in punctuation:
    if word not in uniquewords:
        uniquewords.append(word)
print("This file contains the words and punctuation ", uniquewords)
positions = [uniquewords.index(word) for word in punctuation]
recreated = " ".join([uniquewords[i] for i in positions])
print("In a list the text file words.txt can be shown as:")
print(positions)
print("Recreating sentence...")
print(recreated)
The program above does what it needs to, but it produces the following output:
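This file contains the words and punctuation  ['Ask', 'not', 'what', 'your', 'country', 'can', 'do', 'for', 'you', ',', '!']
In a list the text file words.txt can be shown as:
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 2, 8, 5, 6, 7, 3, 4, 10]
Recreating sentence...
Ask not what your country can do for you , Ask what you can do for your country !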
The positions list starts at 0, as is normal in Python, so to make it start at 1 I tried this:
positions = [uniquewords.index(word)+1 for word in punctuation]
However, this produces an error:
  File "C:\Users\Sam\Desktop\COMPUTING TEMP FOLDER\task 3.py", line 13, in <module>
    recreated = " ".join([uniquewords[i] for i in positions])
  File "C:\Users\Sam\Desktop\COMPUTING TEMP FOLDER\task 3.py", line 13, in <listcomp>
    recreated = " ".join([uniquewords[i] for i in positions])
IndexError: list index out of range
How can I make the list start at 1 without getting this error? Any help would be appreciated.
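Another minor problem is that when the original string is
"Ask not what your country can do for you, Ask what you can do for your country!"
the actual output is
Ask not what your country can do for you , Ask what you can do for your country !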
Answered 2016-08-30 14:47:49
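The problem is that you are incrementing every element of positions so that it displays as 1-indexed, and then using that same list as indices where Python expects 0-indexed values. The largest value in positions is now equal to len(uniquewords), which is one past the last valid index. Try using: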
recreated = " ".join([uniquewords[i-1] for i in positions])
instead.
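To see why the shift fails, here is a minimal sketch that uses a short hypothetical token list rather than the asker's file. The name error_message is introduced only for this illustration.

```python
# Hypothetical tokens for illustration; the real program reads them from words.txt.
tokens = ["Ask", "not", ",", "Ask", "!"]

# Collect each token once, in order of first appearance.
uniquewords = []
for token in tokens:
    if token not in uniquewords:
        uniquewords.append(token)

# 1-based positions, as wanted for display.
positions = [uniquewords.index(token) + 1 for token in tokens]

# Using the 1-based values directly as indices runs off the end of the list.
try:
    broken = [uniquewords[i] for i in positions]
except IndexError as exc:
    error_message = str(exc)

# Subtracting 1 converts back to Python's 0-based indexing.
recreated = " ".join(uniquewords[i - 1] for i in positions)
print(positions)
print(recreated)
```

The displayed list stays 1-based while every lookup goes through i - 1, so the two conventions never mix. This still leaves a space before each punctuation mark, which is a separate issue.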
Answered 2016-08-30 14:50:31
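Please check the code below. I changed the part that recreates the string, to fix the spacing problem as well as the indexing problem you were facing.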
import time
import re
words = open("val.txt",'r')
sentence = words.readline()
uniquewords = []
positions = []
punctuation = re.findall(r"[\w']+|[.,!?;]", sentence)
for word in punctuation:
    if word not in uniquewords:
        uniquewords.append(word)
print("This file contains the words and punctuation ", uniquewords)
positions = [uniquewords.index(word)+1 for word in punctuation]
#recreated = " ".join([uniquewords[i-1] for i in positions])
recreated = ''
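for i in positions:
    # positions is 1-based, so subtract 1 to index into uniquewords
    w = uniquewords[i-1]
    # put a space before words, but not before punctuation
    if w not in '.,!?;':
        w = ' ' + w
    # strip() removes the leading space added before the first word
    recreated = (recreated + w).strip()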
print("In a list the text file words.txt can be shown as:")
print(positions)
print("Recreating sentence...")
print(recreated)
Output:
C:\Users\dinesh_pundkar\Desktop>python c.py
('This file contains the words and punctuation ', ['Ask', 'not', 'what', 'your',
'country', 'can', 'do', 'for', 'you', ',', '!'])
In a list the text file words.txt can be shown as:
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1, 3, 9, 6, 7, 8, 4, 5, 11]
Recreating sentence...
Ask not what your country can do for you, Ask what you can do for your country!
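The same idea can also be sketched with two small functions. This is only one possible arrangement, not the answerer's code: the names encode, decode, PUNCTUATION and index_of are made up for the example, and the text is hard-coded instead of read from a file. A dict replaces the repeated uniquewords.index() calls, and it relies on dicts preserving insertion order (Python 3.7+).

```python
import re

PUNCTUATION = set(".,!?;")  # the same marks the regular expression matches


def encode(text):
    """Return (unique tokens, 1-based positions) for the words and punctuation in text."""
    tokens = re.findall(r"[\w']+|[.,!?;]", text)
    index_of = {}  # token -> 1-based position of its first occurrence
    for token in tokens:
        if token not in index_of:
            index_of[token] = len(index_of) + 1
    uniquewords = list(index_of)  # keys come back in insertion order
    positions = [index_of[token] for token in tokens]
    return uniquewords, positions


def decode(uniquewords, positions):
    """Rebuild the text, putting a space before words but not before punctuation."""
    pieces = []
    for i in positions:
        token = uniquewords[i - 1]  # convert the 1-based position to a 0-based index
        if pieces and token not in PUNCTUATION:
            pieces.append(" ")
        pieces.append(token)
    return "".join(pieces)


text = "Ask not what your country can do for you, Ask what you can do for your country!"
uniquewords, positions = encode(text)
print(positions)
print(decode(uniquewords, positions))
```

Note that decode only restores single spaces between words; line breaks or repeated spaces in the original file are not recorded in the two lists, so they cannot be recovered this way.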
https://stackoverflow.com/questions/39230444