Python Crawler, Part 3: Parsing XML Network Messages

Author: py3study
Published: 2020-01-07 21:04:57
Column: python3

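This section explains how to parse an XML message obtained in a project and extract the relevant fields. The code below uses xml.dom.minidom from the Python standard library. A tutorial on XML parsing in Python: http://www.runoob.com/python/python-xml.html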

The XML file is as follows:

<?xml version="1.0" encoding="UTF-8"?>
<Task version="1.3" xmlns="http://schemas.microsoft.com/windows/2004/02/mit/task">
  <RegistrationInfo>
    <Date>2018-03-19T03:57:44.2908045</Date>
    <Author>FANBINGLIN\Administrator</Author>
    <Description>开机提醒事件</Description>
  </RegistrationInfo>
  <Triggers>
    <LogonTrigger>
      <Enabled>true</Enabled>
    </LogonTrigger>
  </Triggers>
  <Principals>
    <Principal id="Author">
      <UserId>FANBINGLIN\Administrator</UserId>
      <LogonType>InteractiveToken</LogonType>
      <RunLevel>LeastPrivilege</RunLevel>
    </Principal>
  </Principals>
  <Settings>
    <MultipleInstancesPolicy>IgnoreNew</MultipleInstancesPolicy>
    <DisallowStartIfOnBatteries>true</DisallowStartIfOnBatteries>
    <StopIfGoingOnBatteries>true</StopIfGoingOnBatteries>
    <AllowHardTerminate>true</AllowHardTerminate>
    <StartWhenAvailable>false</StartWhenAvailable>
    <RunOnlyIfNetworkAvailable>false</RunOnlyIfNetworkAvailable>
    <IdleSettings>
      <StopOnIdleEnd>true</StopOnIdleEnd>
      <RestartOnIdle>false</RestartOnIdle>
    </IdleSettings>
    <AllowStartOnDemand>true</AllowStartOnDemand>
    <Enabled>true</Enabled>
    <Hidden>false</Hidden>
    <RunOnlyIfIdle>false</RunOnlyIfIdle>
    <DisallowStartOnRemoteAppSession>false</DisallowStartOnRemoteAppSession>
    <UseUnifiedSchedulingEngine>false</UseUnifiedSchedulingEngine>
    <WakeToRun>false</WakeToRun>
    <ExecutionTimeLimit>P3D</ExecutionTimeLimit>
    <Priority>7</Priority>
  </Settings>
  <Actions Context="Author">
    <ShowMessage>
      <Title>每日提醒</Title>
      <Body>
1、掌握python基本语法,3.19-3.24 
2、VBA程序研究
3、工作任务总结</Body>
    </ShowMessage>
  </Actions>
</Task>

The parsing code:

#!/usr/bin/python3
# coding: utf-8

import xml.dom.minidom
from xml.dom import Node

# Parse the XML file into a DOM tree and take the root <Task> element.
dom_tree = xml.dom.minidom.parse('开机提醒.xml')
task = dom_tree.documentElement

for section in task.childNodes:
    # Skip the whitespace text nodes (nodeType 3) that sit between elements.
    if section.nodeType == Node.TEXT_NODE:
        continue
    if section.nodeName != 'Actions':
        continue
    # Walk down: <Actions> -> <ShowMessage> -> <Title>/<Body> -> text node.
    for action in section.childNodes:
        if action.nodeType == Node.TEXT_NODE:
            continue
        for field in action.childNodes:
            if field.nodeType == Node.TEXT_NODE:
                continue
            for text in field.childNodes:
                # The children of <Title> and <Body> are the text nodes we want.
                print(text.nodeValue)
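The checks against node type 3 in the code above are there because minidom keeps the whitespace between tags as text nodes. A minimal sketch of that behaviour follows; the `SAMPLE` string is a trimmed-down stand-in for the task file and `describe_children` is a hypothetical helper, neither comes from the original article.

```python
from xml.dom import Node
from xml.dom.minidom import parseString

# Assumption: a shortened stand-in for the task file, same nesting, fewer fields.
SAMPLE = """<Task>
  <Actions Context="Author">
    <ShowMessage>
      <Title>demo title</Title>
    </ShowMessage>
  </Actions>
</Task>"""


def describe_children(element):
    """Return (nodeType, nodeName) for each direct child of element."""
    return [(child.nodeType, child.nodeName) for child in element.childNodes]


root = parseString(SAMPLE).documentElement
children = describe_children(root)
# Indentation and line breaks show up as '#text' nodes around <Actions>.
print(children)

# Keeping only element nodes has the same effect as skipping nodeType 3.
elements = [c for c in root.childNodes if c.nodeType == Node.ELEMENT_NODE]
print([e.nodeName for e in elements])
```

Filtering on `Node.ELEMENT_NODE` instead of skipping `Node.TEXT_NODE` also ignores comments and processing instructions, which may or may not be what a given project wants.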

Result: [two screenshots of the program output]
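When only a few fields are needed, the nested loops can probably be avoided with minidom's `getElementsByTagName`, which searches all descendants of an element. The sketch below is one possible variant, not the article's method. It assumes a shortened copy of the task file with English placeholder values, and `text_of` is a hypothetical helper introduced for illustration.

```python
from xml.dom.minidom import parseString

# Assumption: shortened copy of the task file with placeholder values.
SAMPLE = """<Task version="1.3" xmlns="http://schemas.microsoft.com/windows/2004/02/mit/task">
  <Settings>
    <Enabled>true</Enabled>
  </Settings>
  <Actions Context="Author">
    <ShowMessage>
      <Title>Daily reminder</Title>
      <Body>
1. Learn basic Python syntax
2. Study the VBA program</Body>
    </ShowMessage>
  </Actions>
</Task>"""


def text_of(parent, tag):
    """Return the text of the first <tag> descendant of parent, or None."""
    matches = parent.getElementsByTagName(tag)
    if not matches or matches[0].firstChild is None:
        return None
    return matches[0].firstChild.nodeValue


root = parseString(SAMPLE).documentElement
# Search inside <Actions> only, since a tag name such as <Enabled>
# can occur in more than one section of the file.
actions = root.getElementsByTagName('Actions')[0]
title = text_of(actions, 'Title')
body = text_of(actions, 'Body')
print(title)
print(body.strip())
```

The helper reads only the first child node, so it suits elements that hold plain text, as `Title` and `Body` do here.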
Shared from the author's personal site/blog. Originally published: 2019-09-02.
