Assuming zh-pm.com hosts a Club World Cup history page (e.g. http://zh-pm.com/fifa-club-world-cup-winners), the page typically contains a structure like the following.

Example page structure (simplified HTML):
```html
<table class="winners-table">
  <tr>
    <th>Year</th>
    <th>Champion</th>
    <th>Country</th>
    <th>Final Score</th>
  </tr>
  <tr>
    <td>2023</td>
    <td>Manchester City</td>
    <td>England</td>
    <td>4-0</td>
  </tr>
  <!-- more data rows... -->
</table>
```
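Before wiring up the real request, the extraction logic can be dry-run against this simplified markup; the sketch below parses the sample table locally (the row data is illustrative, not scraped):

```python
from bs4 import BeautifulSoup

# Simplified HTML matching the structure shown above (sample data only)
sample_html = """
<table class="winners-table">
  <tr><th>Year</th><th>Champion</th><th>Country</th><th>Final Score</th></tr>
  <tr><td>2023</td><td>Manchester City</td><td>England</td><td>4-0</td></tr>
</table>
"""

soup = BeautifulSoup(sample_html, 'html.parser')
# Skip the header row, then collect the cell text of each data row
rows = soup.find('table', class_='winners-table').find_all('tr')[1:]
records = [[td.text.strip() for td in row.find_all('td')] for row in rows]
print(records)  # [['2023', 'Manchester City', 'England', '4-0']]
```

Testing against a local string like this makes it easy to adjust the selectors before pointing the scraper at the live site.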
First, install the required libraries:

```bash
pip install requests
pip install beautifulsoup4
```
Then fetch the page, sending a User-Agent header so the request looks like a normal browser visit:

```python
import requests
from bs4 import BeautifulSoup

url = "http://zh-pm.com/fifa-club-world-cup-winners"
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36'}

try:
    response = requests.get(url, headers=headers)
    response.raise_for_status()  # check that the request succeeded
    html_content = response.text
except requests.exceptions.RequestException as e:
    print(f"Page request failed: {e}")
    exit()
```
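For flaky connections, the plain requests.get call above can be wrapped in a small retry helper. This is a sketch, not part of the original tutorial; fetch_with_retries and its retry/timeout defaults are names and values introduced here:

```python
import time
import requests

def fetch_with_retries(url, headers=None, retries=3, timeout=10):
    """Fetch a URL, retrying on transient network errors with a short backoff.

    The retries and timeout defaults are illustrative; tune them per site.
    """
    for attempt in range(1, retries + 1):
        try:
            response = requests.get(url, headers=headers, timeout=timeout)
            response.raise_for_status()
            return response.text
        except requests.exceptions.RequestException as e:
            print(f"Attempt {attempt} failed: {e}")
            time.sleep(attempt)  # simple linear backoff: 1s, 2s, ...
    return None  # all attempts failed
```

Passing an explicit timeout also prevents the scraper from hanging forever on an unresponsive server, which the bare requests.get call would otherwise do.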
Next, locate the data table:

```python
soup = BeautifulSoup(html_content, 'html.parser')

# Adjust the selector to match the actual page structure
table = soup.find('table', class_='winners-table')  # locate by class name
# or: table = soup.find('table', {'id': 'data-table'})  # locate by ID

if not table:
    print("Data table not found; check the page structure or the selector")
    exit()
```
Then extract each data row:

```python
champions_data = []

# Skip the header row (usually the first row)
for row in table.find_all('tr')[1:]:
    cols = row.find_all('td')
    if len(cols) >= 4:  # make sure the row has enough columns
        year = cols[0].text.strip()
        team = cols[1].text.strip()
        country = cols[2].text.strip()
        score = cols[3].text.strip()
        champions_data.append({
            'Year': year,
            'Champion': team,
            'Country': country,
            'Final Score': score
        })
```
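With champions_data in hand, simple summaries fall out of the standard library. The rows below are made-up samples in the same shape as the scraped records, not real results:

```python
from collections import Counter

# Hypothetical sample rows in the same shape as champions_data
champions_data = [
    {'Year': '2023', 'Champion': 'Manchester City', 'Country': 'England', 'Final Score': '4-0'},
    {'Year': '2022', 'Champion': 'Real Madrid', 'Country': 'Spain', 'Final Score': '5-3'},
    {'Year': '2018', 'Champion': 'Real Madrid', 'Country': 'Spain', 'Final Score': '4-1'},
]

# Count how many titles each team holds in the sample
titles = Counter(row['Champion'] for row in champions_data)
print(titles.most_common(1))  # [('Real Madrid', 2)]
```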
Finally, save the results to a CSV file:

```python
import csv

csv_file = 'fifa_club_world_cup_winners.csv'
csv_columns = ['Year', 'Champion', 'Country', 'Final Score']

try:
    # utf-8-sig adds a BOM so Excel opens the UTF-8 content correctly
    with open(csv_file, 'w', newline='', encoding='utf-8-sig') as f:
        writer = csv.DictWriter(f, fieldnames=csv_columns)
        writer.writeheader()
        writer.writerows(champions_data)
    print(f"Data saved to {csv_file}")
except IOError:
    print("An error occurred while writing the file")
```

Anti-scraping tips: rotate the User-Agent between requests (for example with the fake_useragent library) and add a delay between requests (for example time.sleep(2)) so the crawler does not hammer the server.
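The fake_useragent library can pick a random User-Agent automatically; a dependency-free variant with a hand-picked UA pool and a fixed delay can be sketched as follows (the UA strings and the delay value are illustrative):

```python
import random
import time

# Illustrative pool of desktop User-Agent strings (assumed, not exhaustive)
USER_AGENTS = [
    'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36',
    'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/14.1 Safari/605.1.15',
]

def polite_headers(delay_seconds=2):
    """Sleep before the next request and return headers with a random User-Agent."""
    time.sleep(delay_seconds)
    return {'User-Agent': random.choice(USER_AGENTS)}

headers = polite_headers(delay_seconds=0)  # 0 here only to keep the demo fast
print(headers['User-Agent'] in USER_AGENTS)  # True
```

Calling polite_headers() before each requests.get keeps the request rate low and varies the browser fingerprint slightly between requests.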
Original statement: This article is published on the Tencent Cloud Developer Community with the author's authorization and may not be reproduced without permission.
For infringement concerns, please contact cloudcommunity@tencent.com for removal.