Beautiful Soup,这个库的作用是从 HTML 或 XML 文件中抓出有效数据,用来集成在PYTHON中使用,不过需首先要去官网下载,本文将不讲述如何下载安装,直接开撸代码实现爬虫 :# coding=utf-8
import urllib
from bs4 import BeautifulSoup
url ='http://www.baidu.com/s'
values ={'wd':'美女'}
encoded_param = urllib.urlencode(values)
full_url = url +'?'+ encoded_param
response = urllib.urlopen(full_url)
soup =BeautifulSoup(response)
alinks = soup.find_all('a')