I am using a stopword filter, and I pass the script the path to a file containing an article. However, I get this error:
Traceback (most recent call last):
  File "stop2.py", line 17, in <module>
    print preprocess(sentence)
  File "stop2.py", line 10, in preprocess
    sentence = sentence.lower()
AttributeError: 'file' object has no attribute 'lower'

Any ideas on how to pass the file as an argument? My code is attached below.
# -*- coding: utf-8 -*-
from __future__ import division, unicode_literals
import string
import nltk
from nltk.tokenize import RegexpTokenizer
from nltk.corpus import stopwords
import re

def preprocess(sentence):
    sentence = sentence.lower()
    tokenizer = RegexpTokenizer(r'\w+')
    tokens = tokenizer.tokenize(sentence)
    filtered_words = [w for w in tokens if not w in stopwords.words('english')]
    return " ".join(filtered_words)

sentence = open('pathtofile')
print preprocess(sentence)

Posted on 2016-11-25 05:49:53
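sentence = open(...) means that sentence is an instance of file (the object returned by open()), not a string, which is why it has no lower() method.
What you seem to want is the entire contents of the file: sentence = open(...).read()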
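The fix above can be sketched as a small helper that reads the file into a string before preprocessing it. This is only a minimal sketch under a few assumptions: it is written for Python 3, the name preprocess_file is hypothetical, a tiny hard-coded STOPWORDS set stands in for stopwords.words('english') so the example runs without the NLTK corpus download, and re.findall(r"\w+", ...) stands in for RegexpTokenizer.

```python
import re

# Hypothetical stand-in for stopwords.words('english'), used only so this
# sketch runs without downloading the NLTK stopword corpus.
STOPWORDS = {"a", "an", "the", "is", "of", "and", "to", "in"}


def preprocess(text):
    """Lowercase the text, split it into word tokens, and drop stopwords."""
    tokens = re.findall(r"\w+", text.lower())
    return " ".join(w for w in tokens if w not in STOPWORDS)


def preprocess_file(path):
    """Read the whole file into a string, then preprocess that string."""
    # The with-block closes the file once its contents have been read.
    with open(path, encoding="utf-8") as f:
        return preprocess(f.read())
```

Calling preprocess_file('pathtofile') then passes a string, not a file object, to preprocess. Building the stopword collection once as a set, rather than calling stopwords.words('english') for every token, also avoids repeated list scans on long articles.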
https://stackoverflow.com/questions/40794878