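I'm trying to build a web scraper that pulls a table from a website and pastes it into an Excel spreadsheet. I'm an extreme beginner at Python (and at programming in general); I literally started learning a few days ago.
So how do I build this web scraper/crawler? Here is my code: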
import csv

import requests
from bs4 import BeautifulSoup  # bs4 replaces the discontinued BeautifulSoup 3 package

url = 'https://www.techpowerup.com/gpudb/?mobile=0&released%5B%5D=y14_c&released%5B%5D=y11_14&generation=&chipname=&interface=&ushaders=&tmus=&rops=&memsize=&memtype=&buswidth=&slots=&powerplugs=&sort=released&q='
response = requests.get(url)
html = response.content
soup = BeautifulSoup(html, 'html.parser')

# The GPU list is the table with class "processors".
table = soup.find('table', attrs={'class': 'processors'})

list_of_rows = []
# Skip the first row; the column headers are written by hand below.
for row in table.find_all('tr')[1:]:
    list_of_cells = []
    for cell in row.find_all('td'):
        text = cell.text.replace(' ', '')
        list_of_cells.append(text)
    list_of_rows.append(list_of_cells)

# In Python 3 the csv module expects a text-mode file opened with newline="".
with open("./GPU.csv", "w", newline="") as outfile:
    writer = csv.writer(outfile)
    writer.writerow(["Product Name", "GPU Chip", "Released", "Bus", "Memory", "GPU clock", "Memory clock", "Shaders/TMUs/ROPs"])
    writer.writerows(list_of_rows)
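Right now, the program works for the website shown in the code above.
Now I want to scrape the tables from the following page: https://www.techpowerup.com/gpudb/2990/radeon-rx-560d
Note that there are several tables on this page. What should I add or change to make the program work on this page? I'm trying to get all of the tables, but if anyone could help me get even one, I would really appreciate it!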
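One possible approach to a page with several tables, sketched here under assumptions: loop over every table element with find_all instead of picking one table by class with find. The markup of the radeon-rx-560d page has not been checked, so this assumes its data sit in ordinary table/tr/th/td elements. The function names extract_tables and write_tables are made up for this illustration, and bs4 (BeautifulSoup 4) is assumed to be installed.

```python
import csv

from bs4 import BeautifulSoup


def extract_tables(html):
    """Return every <table> in the HTML as a list of rows (lists of cell strings).

    Hypothetical helper: assumes plain, non-nested tables. With nested
    tables, the rows of an inner table would also appear in its parent.
    """
    soup = BeautifulSoup(html, "html.parser")
    tables = []
    for table in soup.find_all("table"):
        rows = []
        for row in table.find_all("tr"):
            # Collect both <th> and <td>, since spec-style tables often
            # put the label in <th> and the value in <td>.
            cells = [cell.get_text(" ", strip=True)
                     for cell in row.find_all(["th", "td"])]
            if cells:
                rows.append(cells)
        if rows:
            tables.append(rows)
    return tables


def write_tables(tables, path):
    """Write all tables to one CSV file, separated by a blank row."""
    with open(path, "w", newline="", encoding="utf-8") as outfile:
        writer = csv.writer(outfile)
        for index, rows in enumerate(tables, start=1):
            writer.writerow(["Table %d" % index])
            writer.writerows(rows)
            writer.writerow([])
```

To use it with the existing script, pass response.content from requests.get(url) to extract_tables, then hand the result to write_tables together with an output path such as "./GPU_detail.csv" (a placeholder name). To get just one table, take a single element of the returned list, for example the first one.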
https://stackoverflow.com/questions/45070818