Q: Parsing the tags inside a DIV in HTML with lxml

Stack Overflow user

Asked 2017-11-07 21:49:38

Answers: 1 | Views: 581 | Followers: 0 | Votes: 0

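I want to parse a large block of HTML text from a website. I have already selected the div, and now I want the content inside its tags, for example: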

<div id="lala"><p>I WANT</p> <ul><li>THIS</li></ul>. <p>All of them</p></div>

Here is my code:

patchpage = requests.get(href)
tree = html.fromstring(patchpage.content)
patch_message = tree.xpath('//div[@class="messageText"]')
for item in patch_message:
    await client.say(item.text.strip())  # This line fails and raises an error
return await client.say(patch_message)

Currently, patch_message gives me:

[<Element div at 0x29c4be2fa98>]

Nothing else :/ Can anyone tell me how to get the div content into Python?

python

lxml

discord.py

1 Answer

Stack Overflow user

Answered 2017-11-07 23:25:46

Assuming the error you get is AttributeError: 'NoneType' object has no attribute 'strip'

You just need to skip the elements whose text is None, so that strip() is never called on None.

for item in patch_message:
    # item.text is None when the element has no text before its first child
    if item.text:
        print(item.text.strip())
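
The check above avoids the error, but for a div like the one in the question it prints nothing, because `.text` only holds the text that appears before the first child element. A minimal sketch of one way to collect the text of the nested tags as well, using lxml's `text_content()`; the helper name `div_texts` and the default XPath (which targets the question's sample markup rather than the real page's `messageText` class) are assumptions made for illustration:

```python
from lxml import html

# Sample markup copied from the question.
SAMPLE = '<div id="lala"><p>I WANT</p> <ul><li>THIS</li></ul>. <p>All of them</p></div>'


def div_texts(source, xpath='//div[@id="lala"]'):
    """Return the whitespace-normalised text of every element matched by xpath.

    Hypothetical helper: the name and the default XPath are illustrative only.
    """
    tree = html.fromstring(source)
    results = []
    for element in tree.xpath(xpath):
        # text_content() joins the text of the element and all its descendants.
        raw = element.text_content()
        # Collapse runs of whitespace left over from the markup.
        results.append(" ".join(raw.split()))
    return results


div = html.fromstring(SAMPLE)
print(div.text)            # None: there is no text before the first <p>
print(div_texts(SAMPLE))   # ['I WANT THIS. All of them']
```

Each string returned this way could then be passed to the question's `client.say` call one at a time, instead of sending the list of Element objects.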

Votes: 0

Original page content provided by Stack Overflow.

Original link:

https://stackoverflow.com/questions/47159610
