Q: Why does my web scraping function return something unexpected?

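Stack Overflow user

Asked 2022-04-23 16:39:42

Answers: 2 | Views: 46 | Followers: 0 | Votes: 0

My goal: build a function, def retrieve_title(html), that takes an HTML string as input and returns the text of the title element.

To accomplish this I imported Beautiful Soup. I am still learning, so any guidance is much appreciated.

My attempted function: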

def retrieve_title(html):
    soup = [html]
    result = soup.title.text
    return(result)

Using the function:

html = '<title>Jack and the bean stalk</title><header>This is a story about x y z</header><p>talk to you later</p>'
print(get_title(html))

Unexpected result:

AttributeError: 'list' object has no attribute 'title'

Expected result:

"Jack and the bean stalk"

python

web-scraping

beautifulsoup

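2 Answers

Stack Overflow user

Accepted answer

Posted 2022-04-23 17:19:52

The error occurs because soup = [html] creates a plain Python list, not a parsed document, and a list has no title attribute. Parse the string with BeautifulSoup first. Jack and the bean stalk is then the text node directly inside the title tag, so you can retrieve it with .find(text=True).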

from bs4 import BeautifulSoup

html = '''
<title>
 Jack and the beanstalk
</title>
<header>
 This is a story about x y z
</header>
<p>
 Once upon a time
</p>
'''

# Parse the HTML string with Python's built-in parser
soup = BeautifulSoup(html, 'html.parser')

# print(soup.prettify())

# Take the first text node inside the <title> tag
title = soup.title.find(text=True)
print(title)

Output:

 Jack and the beanstalk
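
Putting the answer back into the function from the question, a minimal sketch might look like the following. It assumes the bs4 package is installed; the None return for a missing title and the use of get_text(strip=True) to trim surrounding whitespace are illustrative choices, not part of the original answer.

```python
from typing import Optional

from bs4 import BeautifulSoup


def retrieve_title(html: str) -> Optional[str]:
    """Return the stripped text of the first <title> element, or None."""
    # Parse the string; wrapping it in a list ([html]) does not parse anything.
    soup = BeautifulSoup(html, "html.parser")
    title = soup.title  # None when the document has no <title>
    if title is None:
        return None
    # get_text(strip=True) drops surrounding whitespace and newlines.
    return title.get_text(strip=True)


html = (
    "<title>Jack and the bean stalk</title>"
    "<header>This is a story about x y z</header>"
    "<p>talk to you later</p>"
)
print(retrieve_title(html))
```

Note that the call uses retrieve_title, the name the function was defined with, rather than get_title as in the question.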

Votes: 2

Stack Overflow user

Posted 2022-04-23 16:41:53

You have to call the function by the name you defined it with, retrieve_title, not get_title:

print(retrieve_title(html))

Votes: 0

Original page content provided by Stack Overflow. Translation support provided by the Tencent Cloud Xiaowei IT-domain engine.

Original link:

https://stackoverflow.com/questions/71981675