1. Tools: PyCharm with Python 3.6.
2. In PyCharm, open Settings -> Project: code -> Project Interpreter, click the green '+' in the upper-right corner, then search for and install requests and BeautifulSoup. BeautifulSoup may fail to install; if so, search online for the specific error message.
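As an alternative to the PyCharm GUI, the same two packages can be installed from the command line. Note that the PyPI package name for BeautifulSoup is beautifulsoup4:

```shell
# install the HTTP client and the HTML parser used by the script
pip install requests beautifulsoup4
```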
3. The code is as follows:
import requests
from bs4 import BeautifulSoup

# One history page per city per month: http://lishi.tianqi.com/<city>/<yyyymm>.html
cities = ["nanchang", "chongqing", "shanghai", "beijing", "hangzhou"]
urls = []
for city in cities:
    for year in range(2012, 2018):
        for month in range(1, 13):  # range(1, 12) in the original silently skipped December
            urls.append("http://lishi.tianqi.com/%s/%d%02d.html" % (city, year, month))

file = open('wuhan_weather.txt', 'w', encoding='utf-8')
for url in urls:
    city = url.split('/')[-2]  # city name; more robust than slicing url[24:32]
    print(city)
    response = requests.get(url)
    soup = BeautifulSoup(response.text, 'html.parser')
    weather_list = soup.select('div[class="tqtongji2"]')
    for weather in weather_list:
        ul_list = weather.select('ul')
        for i, ul in enumerate(ul_list):
            li_list = ul.select('li')
            row = city + ":"  # use a fresh name; the original shadowed the built-in str
            for li in li_list:
                if li.string is not None:
                    # Python 3 strings are already Unicode; no encode/decode dance needed
                    row += li.string + ","
            if i != 0:  # the first <ul> is the table header row, so skip it
                file.write(row + '\n')
file.close()
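As a quick sanity check, the URL-generation part of the script can be exercised on its own with no network access. A minimal sketch covering 2012-01 through 2017-12 (range(1, 13) covers all twelve months):

```python
cities = ["nanchang", "chongqing", "shanghai", "beijing", "hangzhou"]

urls = []
for city in cities:
    for year in range(2012, 2018):
        for month in range(1, 13):
            urls.append("http://lishi.tianqi.com/%s/%d%02d.html" % (city, year, month))

print(len(urls))   # 5 cities * 6 years * 12 months = 360
print(urls[0])     # http://lishi.tianqi.com/nanchang/201201.html
```

Inspecting the first few URLs before running the full scrape makes it easy to spot zero-padding mistakes in the month field.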
4. The scraped data is written to wuhan_weather.txt in the same directory as the script. Because each row is already comma-separated, the data can also be opened in Excel simply by renaming the file from .txt to .csv.
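Rather than renaming the file afterwards, the standard-library csv module can write a proper CSV directly. A minimal sketch, with hypothetical sample rows in the shape the scraper collects (date, high, low, weather, wind direction, wind force); the utf-8-sig encoding adds a BOM so Excel displays the Chinese text correctly:

```python
import csv

# hypothetical sample rows in the shape the scraper produces
rows = [
    ["2012-01-01", "9", "4", "小雨", "无持续风向", "微风"],
    ["2012-01-02", "9", "5", "阴", "无持续风向", "微风"],
]

# utf-8-sig writes a BOM so Excel detects the encoding and shows Chinese correctly
with open("wuhan_weather.csv", "w", newline="", encoding="utf-8-sig") as f:
    writer = csv.writer(f)
    writer.writerow(["date", "high", "low", "weather", "wind_dir", "wind_force"])
    writer.writerows(rows)
```

This also lets you add a header row, which the rename-the-extension approach cannot.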
5. The results are as follows: 9k+ rows of data in total.