Q: BeautifulSoup output

Stack Overflow user

Asked 2018-09-25 00:59:19

Answers: 1 | Views: 68 | Followers: 0 | Votes: 0

I am running the code below and getting the following output:

class="widget" id="dnf_class_values_procurement_notice__classification_code__widget">\n\tR -- Professional, administrative, and management support services\n\t\t

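All I want is "Professional, administrative, and management support services".

How do I get rid of all the other text included in the output? I am using BeautifulSoup in Python.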

import requests
from bs4 import BeautifulSoup

i = "https://www.fbo.gov/index.php?s=opportunity&mode=form&id=50e3e1ec75e2aaa7c4fca7761e4c46a2&tab=core&_cview=1"
response = requests.get(i)
textfield = response.text
soup = BeautifulSoup(textfield, 'lxml')
tags = soup.find_all(attrs={'id':'dnf_class_values_procurement_notice__classification_code__widget'})
tags

python

web

beautifulsoup

1 Answer

Stack Overflow user

Accepted answer

Posted 2018-09-25 01:09:27

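soup.find() returns the whole matching element (a Tag object) from the HTML source, not just its text. Use the Tag object's .text attribute or its .contents list to get the text you want. Calling strip() on that string then removes the leading and trailing \n and \t characters.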
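To make the difference between the Tag, its .text, and its .contents concrete, here is a minimal offline sketch. The HTML string is a hypothetical reconstruction based on the output shown in the question (the real page may wrap the text differently), and it uses the built-in 'html.parser' instead of 'lxml' only so that it runs without an extra dependency.

```python
from bs4 import BeautifulSoup

# Hypothetical markup modelled on the output in the question (an assumption,
# not the real page source).
html = (
    '<div class="widget" '
    'id="dnf_class_values_procurement_notice__classification_code__widget">'
    "\n\tR -- Professional, administrative, and management support services\n\t\t"
    "</div>"
)

soup = BeautifulSoup(html, "html.parser")
tag = soup.find(id="dnf_class_values_procurement_notice__classification_code__widget")

raw_text = tag.text            # text only, still padded with \n and \t
children = tag.contents        # list of child nodes (here a single string node)
clean_text = tag.text.strip()  # padding removed

print(repr(raw_text))
print(clean_text)
```

Printing the Tag itself would show the markup as well, which is why the question's output contains the class and id attributes.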

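One more tip: I would use soup.find() in this part of the code. find_all() returns a list of every object that matches the search criteria. When searching by id you should expect a single element in response, so find() is the more appropriate function.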
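A short sketch of that difference, using a made-up two-element document (the ids "code" and "other" are illustrative only): find_all() gives back a list-like result even for a single match, while find() gives back the element itself, or None when nothing matches.

```python
from bs4 import BeautifulSoup

# Made-up markup for illustration; the ids are not from the real page.
html = '<div id="code">R -- Services</div><div id="other">X</div>'
soup = BeautifulSoup(html, "html.parser")

matches = soup.find_all(id="code")  # list-like result, needs indexing
first = soup.find(id="code")        # the single Tag, ready to use
missing = soup.find(id="nope")      # None when there is no match

print(len(matches), matches[0].text)
print(first.text)
print(missing)
```

Because find() can return None, checking the result before reading .text avoids an AttributeError on pages where the element is absent.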

import requests
from bs4 import BeautifulSoup

i = "https://www.fbo.gov/index.php?s=opportunity&mode=form&id=50e3e1ec75e2aaa7c4fca7761e4c46a2&tab=core&_cview=1"
response = requests.get(i)
textfield = response.text
soup = BeautifulSoup(textfield, 'lxml')
# find() returns the single matching Tag; .text.strip() drops the surrounding \n and \t
tags = soup.find(attrs={'id':'dnf_class_values_procurement_notice__classification_code__widget'}).text.strip()
tags
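The stripped string still begins with the classification code ("R -- "), while the question asks for the description alone. One possible way to drop it, sketched below, is to split on the first " -- ". The helper name strip_code_prefix and the assumption that the code and description are always separated by " -- " are mine, inferred from the single sample output in the question.

```python
def strip_code_prefix(label: str, separator: str = " -- ") -> str:
    """Return the text after the first separator, or the whole label if absent.

    Hypothetical helper: assumes the classification code precedes " -- ".
    """
    cleaned = label.strip()
    _, found, rest = cleaned.partition(separator)
    return rest.strip() if found else cleaned


sample = "\n\tR -- Professional, administrative, and management support services\n\t\t"
description = strip_code_prefix(sample)
print(description)
```

If a label has no separator, the helper returns it unchanged apart from the whitespace trimming, so it is safe to apply to every value.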

Votes: 0

Original page content provided by Stack Overflow. Translation support provided by the Tencent Cloud Xiaowei IT-domain engine.

Original link:

https://stackoverflow.com/questions/52489188
