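When I try to GET a web page with requests, I successfully get the page as long as the link is stored in a str variable. However, when I take the link from an element of a list of strings, I cannot retrieve the page.
Input 1: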
import requests
from bs4 import BeautifulSoup
import re
f = open("pages.txt","r")
file = open("parsed.txt","a")
content = f.readlines()
for i in range(1):
    a="http://registration.boun.edu.tr/scripts/sch.asp?donem=2017/2018-3&kisaadi=BM&bolum=BIOMEDICAL+ENGINEERING"
    print(a + " " + str(type(a) ) )
    req_link=a
    r=requests.get(req_link)
    c=r.content
    soup=BeautifulSoup(c,"html.parser")
    all=soup.find_all("td")
    print(all[38])
Output1:
PS E:\pythonCodes\BounCP> python .\getClasses.py
http://registration.boun.edu.tr/scripts/sch.asp?donem=2017/2018-3&kisaadi=BM&bolum=BIOMEDICAL+ENGINEERING <class 'str'>
<td><font style="font-size:12px">BM 519.01</font> </td>
Input 2:
import requests
from bs4 import BeautifulSoup
import re
f = open("pages.txt","r")
file = open("parsed.txt","a")
content = f.readlines()
for i in range(1):
    a=content[1]
    print( content[1] + " "+ str(type(content[1]) ) )
    req_link=a
    r=requests.get(req_link)
    c=r.content
    soup=BeautifulSoup(c,"html.parser")
    all=soup.find_all("td")
    #all=all[38:]
    print(all)
Output2:
PS E:\pythonCodes\BounCP> python .\getClasses.py
http://registration.boun.edu.tr/scripts/sch.asp?donem=2017/2018-3&kisaadi=BM&bolum=BIOMEDICAL+ENGINEERING
<class 'str'>
[]
Posted on 2018-06-17 05:47:15
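Judging by the output printed before <class 'str'> (it breaks onto a new line in Output2), the line read from your file ends with a newline character. readlines() keeps that trailing newline, so the URL passed to requests.get is not the same string as in your first example.
Try using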
a=content[1].strip()
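
If you want to clean every line when the file is read, rather than stripping each element at the point of use, something like the following could work. This is only a rough sketch: read_urls is a hypothetical helper name, and example.com is a placeholder URL, neither is from your code. It also shows how repr() can make a hidden trailing newline visible while debugging.

```python
def read_urls(path):
    """Read one URL per line, dropping surrounding whitespace and blank lines.

    read_urls is a hypothetical helper name used for illustration only.
    """
    with open(path, "r") as f:
        # str.strip() removes the trailing "\n" that readlines() would keep
        return [line.strip() for line in f if line.strip()]


# repr() makes an invisible trailing newline visible when debugging
raw = "http://example.com/page\n"  # placeholder URL, not from the question
print(repr(raw))          # 'http://example.com/page\n'
print(repr(raw.strip()))  # 'http://example.com/page'
```

With that helper, content = read_urls("pages.txt") could stand in for f.readlines(), and content[1] could then be passed to requests.get as it is.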
https://stackoverflow.com/questions/50891971