I am trying to write a Python (version 2.7.5) CGI script on a CentOS 7 server. The script tries to download data from a LibriVox page such as https://librivox.org/selections-from-battle-pieces-and-aspects-of-the-war-by-herman-melville/ but it fails with this error:
    <class 'urllib2.URLError'>: <urlopen error [Errno 13] Permission denied>
          args = (error(13, 'Permission denied'),)
          errno = None
          filename = None
          message = ''
          reason = error(13, 'Permission denied')
          strerror = None
I have turned off iptables, and I can run something like `wget -O- https://librivox.org/selections-from-battle-pieces-and-aspects-of-the-war-by-herman-melville/` without any error. Here is the code where the error occurs:
    def output_html ( url, appname, doobb ):
        print "url is %s<br>" % url
        soup = BeautifulSoup(urllib2.urlopen( url ).read())
Update: Thanks, Paul and alecxe. I have updated my code to:
    def output_html ( url, appname, doobb ):
        #hdr = {'User-Agent':'Mozilla/5.0'}
        #print "url is %s<br>" % url
        #req = urllib2.Request(url, headers=hdr)
        # soup = BeautifulSoup(urllib2.urlopen( url ).read())
        headers = {'User-Agent':'Mozilla/5.0'}
        # headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.99 Safari/537.36'}
        response = requests.get( url, headers=headers)
        soup = BeautifulSoup(response.content)
With this version I get a slightly different error when the line
    response = requests.get( url, headers=headers)
is called:
    <class 'requests.exceptions.ConnectionError'>: ('Connection aborted.', error(13, 'Permission denied'))
          args = (ProtocolError('Connection aborted.', error(13, 'Permission denied')),)
          errno = None
          filename = None
          message = ProtocolError('Connection aborted.', error(13, 'Permission denied'))
          request = <PreparedRequest [GET]>
          response = None
          strerror = None
Interestingly, I wrote a command-line version of the script and it works fine. It looks like this:
    def output_html ( url ):
        soup = BeautifulSoup(urllib2.urlopen( url ).read())
Pretty strange, don't you think?
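Since the same call works from a shell but fails under CGI, one way to narrow this down might be to print what the process looks like in each case and compare the two outputs. The following is only a rough sketch: `describe_process` is a hypothetical helper, and reading `/proc/self/attr/current` assumes a Linux host where that file exposes the process security context (it may be missing or unreadable elsewhere).

```python
from __future__ import print_function

import os


def describe_process():
    """Collect facts about this process that may differ between CGI and shell runs."""
    info = {
        "euid": os.geteuid(),
        "egid": os.getegid(),
        "cwd": os.getcwd(),
        # urllib2 and requests both read proxy settings from the environment.
        "https_proxy": os.environ.get("https_proxy", ""),
    }
    # Assumption: on an SELinux-enabled Linux host this file holds the
    # security context of the current process.
    try:
        with open("/proc/self/attr/current") as handle:
            info["selinux_context"] = handle.read().strip("\x00\n ")
    except (IOError, OSError):
        info["selinux_context"] = "unavailable"
    return info


if __name__ == "__main__":
    # Same "<br>" style as the print statement in output_html above.
    for key, value in sorted(describe_process().items()):
        print("%s = %s<br>" % (key, value))
```

Running this once from the command line and once through the web server should show whether the user, working directory, proxy settings, or security context differ between the two runs.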
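Update: This question may already have an answer here: "urllib2.HTTPError: HTTP Error 403: Forbidden" (2 answers).
No, those answers do not address this question.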
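To make the difference concrete: the linked question is about an HTTP 403 status returned by the server, while the tracebacks above carry `error(13, 'Permission denied')` in the `reason` field, which is a socket-level error with no HTTP response involved. A minimal sketch for telling the two apart follows; `classify_failure` is a hypothetical name, and the import fallback is only there so the snippet also runs on Python 3.

```python
import errno

try:  # Python 2, as used in the question
    from urllib2 import HTTPError, URLError
except ImportError:  # Python 3
    from urllib.error import HTTPError, URLError


def classify_failure(exc):
    """Return a short label saying whether exc came from the server or the local OS."""
    # HTTPError is a subclass of URLError, so it has to be checked first.
    if isinstance(exc, HTTPError):
        # The remote server answered, e.g. with 403 Forbidden.
        return "http status %d" % exc.code
    if isinstance(exc, URLError):
        # No HTTP response at all; the underlying socket error is in .reason.
        code = getattr(exc.reason, "errno", None)
        if code == errno.EACCES:
            return "local permission denied (errno 13)"
        return "connection failed (errno %s)" % code
    return "other error"
```

Wrapping the `urllib2.urlopen( url )` call in `try`/`except URLError as exc` and printing `classify_failure(exc)` would show which of the two cases applies.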
https://stackoverflow.com/questions/28081350