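I have been using this script, a scraper tool I found, to collect the names of all the dentists listed on a web page. When I run it, no new CSV file containing the aggregated data I am looking for is created. Here is the script: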
from bs4 import BeautifulSoup as bs
import requests as rq
import csv

url = "https://www.healthgrades.com/usearch?what=Dentistry&where=Canal%20Street%2C%20NY%2010013&city=Canal%20Street&state=NY&pt=40.720901%2C%20-74.008904&zip=10013&neCorner=40.739420717131885%2C-73.98771539161403&swCorner=40.70233998462754%2C-74.03007248950355&mapCenter=40.720901%2C-74.008904&zoomLevel=14.6&mapChanged=false&pageNum=2"
GeT = rq.get(url)
soup = bs(GeT.content, "html.parser")
data_1 = soup.find_all('div', {'class': 'card-content__details'})
doctors_list = []
for item in data_1:
    try:
        first = item.contents[2].find_all('div', {'class': 'details'})[1].text
    except Exception:
        # Fall back to an empty name when a card lacks the expected markup.
        first = ''
    doctors_list.append(first)
# newline='' stops the csv module from writing blank rows on Windows.
with open('newfile.csv', 'w', newline='') as file:
    writer = csv.writer(file)
    for row in doctors_list:
        # writerow() expects a sequence of fields; a bare string would be
        # split into one column per character.
        writer.writerow([row])
Posted on 2018-02-22 05:58:07
Sample JSON response: https://bpaste.net/show/fcb53d9bc16f
>>> import requests
...
... BASE_URL = 'https://www.healthgrades.com/api3/usearch'
...
... params = {
... 'userLocalTime':'22:37',
... 'what':'Dentistry',
... 'where':'Canal Street, NY 10013',
... 'pt':'40.720901, -74.008904',
... 'sort.provider':'bestmatch',
... 'category':'provider',
... 'sessionId':'Sb93293f932c6bc56',
... 'requestId':'Rac7ffe6e6256eba3',
... 'pageSize.provider':'20',
... 'pageNum':'2',
... 'isFirstRequest':'true',
... 'debug':'false',
... 'isAtlas':'true',
... 'action':'refresh',
... 'neCorner':'40.744526282819244,-73.99060337556104',
... 'swCorner':'40.69723118200452,-74.02724113269275'
... }
... r = requests.get(BASE_URL, params=params)
... r.raise_for_status()
>>> dentists = r.json()['search']['searchResults']['provider']['results']
>>> for dentist in dentists:
... print(dentist['displayName'])
...
Dr. Raphael Santore, DDS
Dr. Gain Lu, DDS
Dr. Molly Lim, DDS
Dr. Anne Yu, DDS
Dr. Charmaine Ip, DMD
Dr. Devi Konar, DDS
Dr. Christopher Perez, DMD
Dr. Lee Gold, DDS
Dr. Elaine Wong, DDS
Dr. Fan Mou, DDS
Dr. Henry Wong, DDS
Dr. Shauna Fung, DDS
Dr. Emilie Fong, DDS
Dr. Nancy Ma, DDS
Dr. Charles Tiu, DDS
Dr. Glenn Chiarello, DDS
Dr. John Nosti, DMD
Dr. Loi Chan, DDS
Dr. Charles Hashim, DDS
Dr. David Azar, DDS
Dr. Jenny Zhu, DDS
Dr. Stanton Young, DMD
Dr. Pankaj Singh, DDS
Dr. Lawrence Tam, DDS
Dr. Alina Lukashevsky, DDS
Dr. Maureen Khoo, DDS
Dr. Mailin Lai, DDS
Dr. Stewart Neidle, DDS
Danielle Danzi, DDM
Dr. Justin Cohen, DMD
Dr. Weihsin Men, DMD
Dr. Anthony Kail, DDS
Sima Epstein
Dr. Christian Bilius, DDS
Dr. Jeffrey Shapiro, DDS
Dr. Donald Ingerman, DDS
>>> featured_dentists = r.json()['featuredProviders']
>>> for dentist in featured_dentists:
... print(dentist['displayName'])
...
Dr. Ora Canter, DDS
Dr. Alfred Shirzadnia, DDS
Dr. Henry Nogid, DDS
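To get the CSV file the question asks for, the names printed above could be written out along these lines. This is only a minimal sketch: it assumes the response keeps the shape used in the session above (`search` > `searchResults` > `provider` > `results`, plus `featuredProviders`), and the helper names `extract_names` and `write_names_csv` are made up for illustration; they are not part of any library.

```python
import csv


def extract_names(payload):
    """Collect display names from a parsed search response (a dict).

    The key layout is an assumption taken from the session above.
    """
    # Missing keys yield an empty list instead of raising KeyError.
    regular = (
        payload.get('search', {})
        .get('searchResults', {})
        .get('provider', {})
        .get('results', [])
    )
    featured = payload.get('featuredProviders') or []

    names = []
    for entry in list(regular) + list(featured):
        name = entry.get('displayName')
        # Skip entries without a name and names already collected.
        if name and name not in names:
            names.append(name)
    return names


def write_names_csv(names, path):
    """Write one name per row under a single header column."""
    # newline='' prevents blank rows on Windows.
    with open(path, 'w', newline='', encoding='utf-8') as handle:
        writer = csv.writer(handle)
        writer.writerow(['displayName'])
        for name in names:
            # Wrap the string in a list so it lands in one column.
            writer.writerow([name])
```

With `r` from the session above, `write_names_csv(extract_names(r.json()), 'newfile.csv')` should produce the file. Names that contain a comma, such as "Dr. Lee Gold, DDS", are quoted by the `csv` module so they stay in a single column.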
https://stackoverflow.com/questions/48919871