
How to add XML elements without duplicates using Python

Stack Overflow user
Asked 2018-08-02 20:41:59
Answers: 1 · Views: 105 · Followers: 0 · Votes: 0

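I'm using Python to export a metadata XML file, populate various elements from the script, and then save the XML file back to its source. There is an existing element named "citeinfo" in which I'm trying to create a few child elements, one named "pubdate" and another named "othercit". When I run the script I don't get any errors, but when I open the XML after processing, I find a second element group (the parent of "citeinfo"), and all of my new elements are on a single line. Here is my Python: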

import arcpy, sys  
from xml.etree.ElementTree import ElementTree  
from xml.etree.ElementTree import Element, SubElement
import xml.etree.ElementTree as ET
from arcpy import env  
env.overwriteOutput = True  
fcpath         = r"...\HL Metadata to BR Sample Data.gdb\NCI20102014_Oral"
translatorpath = r"...\Translator\ARCGIS2FGDC.xml"
xmlfile        = r"...\Extras\FullMetaFC.xml"
arcpy.ExportMetadata_conversion(fcpath, translatorpath, xmlfile)

tree = ElementTree()
tree.parse(xmlfile)

a   = tree.find('idinfo')
aa  = tree.find('metainfo')
aaa = tree.find('eainfo')

b = ET.SubElement(a, 'citation')
c = ET.SubElement(b, 'citeinfo')
bb = ET.SubElement(c, 'pubdate')
d = ET.SubElement(c, 'othercit')
e = ET.SubElement(a, 'descript')
f = ET.SubElement(e, 'abstract')
g = ET.SubElement(e, 'purpose')

title       = tree.find("idinfo/citation/citeinfo/title")
public_date = tree.find("idinfo/citation/citeinfo/pubdate")
cit_source  = tree.find("idinfo/citation/citeinfo/othercit")
abstract    = tree.find("idinfo/descript/abstract")
purpose     = tree.find("idinfo/descript/purpose")

title.text       = "Oral Cancer Incidence by County"
bb.text = "99990088"
d.text  = "https://statecancerprofiles.cancer.gov/"
abstract.text    = "Incidence rates are..."
purpose.text     = "The State Cancer Profiles..."

tree.write(xmlfile)

arcpy.ImportMetadata_conversion(xmlfile, "FROM_FGDC", fcpath, "ENABLED")

Here is the XML:

   <citation>
      <citeinfo>
        <origin>X</origin>
        <title>META_TESTING</title>
        <geoform>vector digital data</geoform>
      <pubdate>20102010</pubdate><othercit>www.google.com</othercit></citeinfo>
    </citation>

I'd like the "citation" group to look like this:

    <citation>
      <citeinfo>
        <title>National Cancer Institute, Oral Cancer Incidence by County</title>
        <geoform>vector digital data</geoform>
        <pubdate>20120510</pubdate>
        <othercit>www.google.com</othercit>
      </citeinfo>
    </citation>

1 Answer

Stack Overflow user

Answered 2018-08-02 22:41:57

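I would create a small helper function that ensures an element exists. If the element is already there, the function returns it; if not, it creates it.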

def ensure_elem(context, name):
    elem = context.find(name)
    return ET.SubElement(context, name) if elem is None else elem
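To make the difference concrete, here is a minimal sketch contrasting a direct ET.SubElement() call with the helper. The inline XML string is a hypothetical stand-in for the exported metadata file, and the variable names are chosen only for illustration.

```python
import xml.etree.ElementTree as ET

def ensure_elem(context, name):
    elem = context.find(name)
    return ET.SubElement(context, name) if elem is None else elem

# Hypothetical sample standing in for the exported metadata file.
root = ET.fromstring(
    "<metadata><idinfo><citation><citeinfo/></citation></idinfo></metadata>"
)
idinfo = root.find("idinfo")

# SubElement always appends a new child, so this creates a second <citation>.
ET.SubElement(idinfo, "citation")
citation_count_after_subelement = len(idinfo.findall("citation"))

# ensure_elem finds the first existing <citation> and returns it unchanged.
existing = ensure_elem(idinfo, "citation")
citation_count_after_ensure = len(idinfo.findall("citation"))
reused_original = existing.find("citeinfo") is not None
```

The second count stays the same as the first because the helper only creates an element when find() returns None.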

Now you can do the following:

tree = ET.parse(xmlfile)

# ensure we have a /metadata/idinfo/citation/citeinfo hierarchy
metadata = tree.getroot()
idinfo = ensure_elem(metadata, "idinfo")
citation = ensure_elem(idinfo, "citation")
citeinfo = ensure_elem(citation, "citeinfo")

# update the text of elements beneath citeinfo
ensure_elem(citeinfo, 'pubdate').text = "new pubdate"
ensure_elem(citeinfo, 'title').text = "new title"
# ...and so on

tree.write(xmlfile)
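The same steps can be tried without arcpy or a file on disk. The sketch below assumes a root element named metadata (as in the comment above) and uses an inline sample modeled on the XML in the question; update_citation is a hypothetical wrapper name. It runs the update twice to check that nothing is duplicated.

```python
import xml.etree.ElementTree as ET

def ensure_elem(context, name):
    elem = context.find(name)
    return ET.SubElement(context, name) if elem is None else elem

# Inline sample modeled on the question's XML; the real file comes from arcpy.
SAMPLE = """<metadata>
  <idinfo>
    <citation>
      <citeinfo>
        <origin>X</origin>
        <title>META_TESTING</title>
        <geoform>vector digital data</geoform>
      </citeinfo>
    </citation>
  </idinfo>
</metadata>"""

def update_citation(root):
    """Hypothetical wrapper: fill in citation fields, creating only what is missing."""
    idinfo = ensure_elem(root, "idinfo")
    citation = ensure_elem(idinfo, "citation")
    citeinfo = ensure_elem(citation, "citeinfo")
    ensure_elem(citeinfo, "title").text = "Oral Cancer Incidence by County"
    ensure_elem(citeinfo, "pubdate").text = "99990088"
    ensure_elem(citeinfo, "othercit").text = "https://statecancerprofiles.cancer.gov/"
    return citeinfo

root = ET.fromstring(SAMPLE)
update_citation(root)
citeinfo = update_citation(root)  # a second run should not add anything new
child_tags = [child.tag for child in citeinfo]
citation_count = len(root.findall("idinfo/citation"))
```

New elements are appended after the existing children, so pubdate and othercit end up last inside citeinfo.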

Note that you can ET.parse() a file in a single line of code.
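As a quick illustration, the sketch below compares the two-step form used in the question with the one-line form. The temporary file and its contents are assumptions made only so that the example runs on its own.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

# Write a small throwaway XML file (hypothetical content) to parse.
with tempfile.NamedTemporaryFile("w", suffix=".xml", delete=False) as handle:
    handle.write("<metadata><idinfo/></metadata>")
    path = handle.name

# Two-step form, as in the question: create an empty tree, then parse into it.
two_step = ET.ElementTree()
two_step.parse(path)

# One-line form: ET.parse() returns a populated ElementTree directly.
one_step = ET.parse(path)

same_root_tag = two_step.getroot().tag == one_step.getroot().tag
os.remove(path)
```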

For brevity, we could do this:

e = ensure_elem

# ensure we have a /metadata/idinfo/citation/citeinfo hierarchy
citeinfo = e(e(e(tree.getroot(), "idinfo"), "citation"), "citeinfo")
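If the nested calls are hard to read, one possible alternative is to fold the helper over a slash-separated path. ensure_path is a hypothetical name, not part of ElementTree, and this sketch assumes each step is a plain tag name with no XPath predicates.

```python
from functools import reduce
import xml.etree.ElementTree as ET

def ensure_elem(context, name):
    elem = context.find(name)
    return ET.SubElement(context, name) if elem is None else elem

def ensure_path(context, path):
    """Hypothetical helper: walk plain tag names in path, creating missing steps."""
    return reduce(ensure_elem, path.split("/"), context)

# Sample document that lacks the citation/citeinfo levels.
root = ET.fromstring("<metadata><idinfo/></metadata>")

citeinfo = ensure_path(root, "idinfo/citation/citeinfo")
# Calling it again returns the element created above instead of a new one.
same = ensure_path(root, "idinfo/citation/citeinfo")
citeinfo_count = len(root.findall("idinfo/citation/citeinfo"))
```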

To pretty-print an ElementTree document, you can use the following function:

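def indent(tree, indent_by='  '):
    # True when a text/tail slot is empty or holds only whitespace
    irrelevant = lambda s: s is None or s.lstrip('\r\n\t\v ') == ''
    indent_str = lambda i: '\n' + indent_by * i

    # inner helper renamed so it no longer shadows the outer function
    def indent_elem(elem, level=0, last_child=True):
        # indent before the first child, unless elem carries real text
        if len(elem) and irrelevant(elem.text):
            elem.text = indent_str(level+1)

        # the last child's tail steps back out to the parent's closing tag
        if irrelevant(elem.tail):
            elem.tail = indent_str(level-(1 if last_child else 0))

        for i, child in enumerate(elem, 1):
            indent_elem(child, level+1, i==len(elem))

    indent_elem(tree.getroot())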
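A short usage sketch follows, with a hypothetical inline document in place of the real metadata file. The function is repeated (with the inner helper renamed) so that the block runs on its own. On Python 3.9 and later, the standard library also provides xml.etree.ElementTree.indent(), which may make a hand-written helper unnecessary.

```python
import xml.etree.ElementTree as ET

def indent(tree, indent_by='  '):
    irrelevant = lambda s: s is None or s.lstrip('\r\n\t\v ') == ''
    indent_str = lambda i: '\n' + indent_by * i

    def indent_elem(elem, level=0, last_child=True):
        if len(elem) and irrelevant(elem.text):
            elem.text = indent_str(level + 1)
        if irrelevant(elem.tail):
            elem.tail = indent_str(level - (1 if last_child else 0))
        for i, child in enumerate(elem, 1):
            indent_elem(child, level + 1, i == len(elem))

    indent_elem(tree.getroot())

# Hypothetical one-line document, similar to what SubElement output looks like.
root = ET.fromstring(
    "<citation><citeinfo><title>META_TESTING</title></citeinfo></citation>"
)
ET.SubElement(root.find("citeinfo"), "pubdate").text = "20120510"

tree = ET.ElementTree(root)
indent(tree)  # call this before tree.write(...) to get one element per line
pretty = ET.tostring(root, encoding="unicode")
print(pretty)
```

Calling indent(tree) just before tree.write(xmlfile) should address the "everything on one line" symptom from the question.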
Votes: 1
Original page content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/51653954
