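I'm trying to paginate: after scraping the current page, I want to move on to the next one. This is my first time scraping an API, so I'm a bit lost and haven't found anything helpful online.
Q: What do I need to do to get to the next page?
Code (what I have so far):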
import pandas as pd
import requests, re
from bs4 import BeautifulSoup
from urllib.parse import urljoin
import json
url = 'https://games.crossfit.com/competitions/api/v1/competitions/open/2018/leaderboards?division=1&region=0&scaled=0&sort=0&occupation=0&page=1'
nameList = []
genderList = []
regionList = []
gymList = []
ageList = []
heightList = []
weightList = []
ordList = []
overallList = []
overallScoreList = []
response = requests.get(url)
data = response.text
parsed = json.loads(data)
year = parsed['competition']['year']
comp = parsed['competition']['competitionType']
board = parsed['leaderboardRows']
# Loop variable is named `row` so it does not shadow the built-in all()
for row in board:
    name = row['entrant']['competitorName']
    gender = row['entrant']['gender']
    region = row['entrant']['regionName']
    gym = row['entrant']['affiliateName']
    age = row['entrant']['age']
    overall = row['overallRank']
    overallS = row['overallScore']
    height = row['entrant']['height']
    weight = row['entrant']['weight']
    nameList.append(name)
    genderList.append(gender)
    regionList.append(region)
    gymList.append(gym)
    ageList.append(age)
    heightList.append(height)
    weightList.append(weight)
    overallList.append(overall)
    overallScoreList.append(overallS)
Posted on 2018-07-04 15:51:51
A quick and simple way looks like this:
import requests
url = 'https://games.crossfit.com/competitions/api/v1/competitions/open/2018/leaderboards?division=1&region=0&scaled=0&sort=0&occupation=0&page={}'
# range(1, 5) requests pages 1 through 4
for link in [url.format(page) for page in range(1, 5)]:
    response = requests.get(link)
    for item in response.json()['leaderboardRows']:
        name = item['entrant']['competitorName']
        print(name)
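If you want to keep every field from the question rather than just the name, one option is to turn each row into a dict instead of maintaining ten parallel lists. The following is only a sketch: the field names are the ones used in the question's code, while `flatten_row` and `collect` are hypothetical helper names, and `.get` is used because I am assuming (not certain) that some rows may lack a field.

```python
def flatten_row(row):
    """Turn one leaderboard row into a flat dict (keys taken from the question)."""
    entrant = row.get("entrant", {})
    return {
        "name": entrant.get("competitorName"),
        "gender": entrant.get("gender"),
        "region": entrant.get("regionName"),
        "gym": entrant.get("affiliateName"),
        "age": entrant.get("age"),
        "height": entrant.get("height"),
        "weight": entrant.get("weight"),
        "overall_rank": row.get("overallRank"),
        "overall_score": row.get("overallScore"),
    }


def collect(pages):
    """Flatten the rows of several decoded JSON responses into one list of dicts."""
    return [flatten_row(row) for page in pages for row in page["leaderboardRows"]]
```

Here `pages` would be the `response.json()` results gathered in the loop above. The resulting list of dicts can be passed straight to `pandas.DataFrame(...)`.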
Posted on 2018-07-04 08:48:57
The CrossFit API provides all the necessary information in the pagination section of its response. It gives you something like this:
"pagination": {
    "currentPage": 1,
    "totalPages": 3440,
    "totalCompetitors": 171977
},
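To get a page other than 1, change the GET parameter in the URL: write &page=2 instead of &page=1. It is best to build the URL with a function that takes the relevant parameters. For example, url_for_page(20) would return https://games.crossfit.com/competitions/api/v1/competitions/open/2018/leaderboards?division=2&region=0&scaled=0&sort=0&occupation=0&page=20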
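A rough sketch of that idea, in case it helps. `url_for_page` is the name suggested above; its `division` argument, the `iter_rows` helper, its `fetch` callable and the `max_pages` cap are all my own illustrative choices, and I am assuming that every response carries the `pagination` block shown earlier.

```python
from urllib.parse import urlencode

BASE_URL = "https://games.crossfit.com/competitions/api/v1/competitions/open/2018/leaderboards"


def url_for_page(page, division=1):
    """Build the leaderboard URL for one page (other parameters as in the question)."""
    params = {
        "division": division,
        "region": 0,
        "scaled": 0,
        "sort": 0,
        "occupation": 0,
        "page": page,
    }
    return BASE_URL + "?" + urlencode(params)


def iter_rows(fetch, max_pages=None):
    """Yield every leaderboard row, page by page.

    `fetch` is any callable that takes a URL and returns the decoded JSON dict.
    The page count is read from pagination.totalPages after each request.
    """
    page = 1
    total_pages = 1  # placeholder until the first response tells us the real value
    while page <= total_pages and (max_pages is None or page <= max_pages):
        data = fetch(url_for_page(page))
        total_pages = int(data["pagination"]["totalPages"])
        yield from data["leaderboardRows"]
        page += 1
```

With requests, `fetch` could be as simple as `lambda u: requests.get(u).json()`. Since the sample above reports 3440 pages, passing a small `max_pages` while testing is probably wise.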
I hope this helps.
https://stackoverflow.com/questions/51164400