I wrote some code that converts XML data into a list of dictionaries so that it can be loaded into a table.
Input file data:
<report>
<report_header type='comp1' title='industry' year='2019' />
<report_body age='21'>
<Prod name='krishna' id='11' place='usa'>
<License state='aus' area= 'street1'>
</License>
<License state='mus' area= 'street2'>
</License>
<License state='mukin' area= 'street3'>
</License>
</Prod>
<Prod name='ram' id='12' place='uk'>
<License state='junej' area= 'street4'>
</License>
<License state='rand' area= 'street5'>
</License>
<License state='gandhi' area= 'street6'>
</License>
</Prod>
<Prod name='chand' id='13' place='london'>
<License state='nehru' area= 'street7'>
</License>
<License state='mahatma' area= 'street8'>
</License>
<License state='park' area= 'street9'>
</License>
</Prod>
</report_body>
</report>
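Code:
import xml.etree.ElementTree as ET

tree = ET.parse('sample.xml')
root = tree.getroot()
way_list = []
# Collect the attributes of every element as one dict per element.
for item in root.iter():
    way_list.append(dict(item.attrib))
# Print each attribute name and value on its own line.
for k, v in [(k, v) for x in way_list for (k, v) in x.items()]:
    print(k, v)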
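Output:
type comp1
title industry
year 2019
age 21
name krishna
id 11
place usa
state aus
area street1
state mus
area street2
state mukin
area street3
name ram
id 12
place uk
state junej
area street4
state rand
area street5
state gandhi
area street6
name chand
id 13
place london
state nehru
area street7
state mahatma
area street8
state park
area street9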
Expected output: {type:'comp1',title:'industry',year:2019,age:21,name:'krishna',id:11,place:'usa',state:'aus',area:'street1'},{type:'comp1',title:'industry',year:2019,age:21,name:'krishna',id:11,place:'usa',state:'mus',area:'street2'},{type:'comp1',title:'industry',year:2019,age:21,name:'krishna',id:11,place:'usa',state:'mukin',area:'street3'},{type:'comp1',title:'industry',year:2019,age:21,name:'ram',id:12,place:'uk',state:'junej',area:'street4'},{type:'comp1',title:'industry',year:2019,age:21,name:'ram',id:12,place:'uk',state:'rand',area:'street5'},.........etc
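My main goal is to load the data into a table like the following:
type,title,year,name,id,place,state,area
comp1,industry,2019,krishna,11,usa,aus,street1
comp1,industry,2019,krishna,11,usa,mus,street2
comp1,industry,2019,krishna,11,usa,mukin,street3
comp1,industry,2019,ram,12,uk,junej,street4
comp1,industry,2019,ram,12,uk,rand,street5
comp1,industry,2019,ram,12,uk,gandhi,street6
So far, I am able to convert the data into a list of dictionaries.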
Posted on 2019-11-11 04:20:29
Here is one way. Read up on the csv module.
import csv, io, sys
from xml.etree import ElementTree
data = """\
<report>
<report_header type='comp1' title='industry' year='2019' />
<report_body>
<Prod name='krishna' id='11' place='usa'>
<License state='aus' area= 'street1'>
</License>
<License state='mus' area= 'street2'>
</License>
<License state='mukin' area= 'street3'>
</License>
</Prod>
<Prod name='ram' id='12' place='uk'>
<License state='junej' area= 'street4'>
</License>
<License state='rand' area= 'street5'>
</License>
<License state='gandhi' area= 'street6'>
</License>
</Prod>
<Prod name='chand' id='13' place='london'>
<License state='nehru' area= 'street7'>
</License>
<License state='mahatma' area= 'street8'>
</License>
<License state='park' area= 'street9'>
</License>
</Prod>
</report_body>
</report>
"""
fieldnames = ['type', 'title', 'year', 'name', 'id', 'place', 'state', 'area']
writer = csv.DictWriter(sys.stdout, fieldnames=fieldnames)
writer.writeheader()
tree = ElementTree.parse(io.StringIO(data))
report_header = tree.find('report_header')
report_body = tree.find('report_body')
# Write one CSV row per License, merged with its parent Prod and the report header.
for Prod in report_body.findall('Prod'):
    for License in Prod.findall('License'):
        d = {}
        d.update(License.attrib)
        d.update(Prod.attrib)
        d.update(report_header.attrib)
        writer.writerow(d)
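If the rows are wanted as a list of dictionaries (as in the question's expected output) rather than as CSV text, the same nested loops can collect them instead of writing them. This is only a minimal sketch: the XML below is a shortened copy of the sample data, the helper name `flatten_report` is made up for illustration, and the `age` attribute on `report_body` is carried along as in the question's input.

```python
from xml.etree import ElementTree

# Shortened sample data (assumption: same shape as the question's input).
SAMPLE = """\
<report>
<report_header type='comp1' title='industry' year='2019' />
<report_body age='21'>
<Prod name='krishna' id='11' place='usa'>
<License state='aus' area='street1'></License>
<License state='mus' area='street2'></License>
</Prod>
<Prod name='ram' id='12' place='uk'>
<License state='junej' area='street4'></License>
</Prod>
</report_body>
</report>
"""


def flatten_report(xml_text):
    """Return one dict per License, merged with its Prod, body and header attributes."""
    root = ElementTree.fromstring(xml_text)
    header = root.find('report_header')
    body = root.find('report_body')
    rows = []
    for prod in body.findall('Prod'):
        for lic in prod.findall('License'):
            row = {}  # a fresh dict for every row
            row.update(header.attrib)
            row.update(body.attrib)
            row.update(prod.attrib)
            row.update(lic.attrib)
            rows.append(row)
    return rows


rows = flatten_report(SAMPLE)
for row in rows:
    print(row)
```

Note that ElementTree returns every attribute value as a string, so `year`, `age` and `id` come back as `'2019'`, `'21'` and `'11'`; convert them with `int()` if the target table needs numbers.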
Posted on 2019-11-11 08:15:46
Using only ElementTree.
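import xml.etree.ElementTree as ET

tree = ET.parse('sample.xml')
root = tree.getroot()
# Copy the header attributes so the parsed tree itself is not modified.
dict_rep = dict(root.find('report_header').attrib)
dict_rep.update(root.find('report_body').attrib)
way_list = []
for prod in root.iter('Prod'):
    for lic in prod.iter('License'):
        # Build a fresh dict per License; reusing a single dict would make
        # every entry in way_list identical.
        dict_line = dict(dict_rep)
        dict_line.update(prod.attrib)
        dict_line.update(lic.attrib)
        print(dict_line)
        way_list.append(dict_line)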
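When building one dict per row from a shared base dict, keep in mind that a plain assignment from one name to another does not copy a dictionary. Both names then refer to the same object, so appending it repeatedly fills the list with references to a single dict. A small sketch of the difference, using made-up data rather than the XML file:

```python
base = {'type': 'comp1'}

# Aliasing: every list entry is the same object as `base`.
aliased = []
for state in ['aus', 'mus']:
    line = base            # no copy is made here
    line['state'] = state
    aliased.append(line)

# Copying: each entry is an independent dict.
base = {'type': 'comp1'}
copied = []
for state in ['aus', 'mus']:
    line = dict(base)      # shallow copy of the base dict
    line['state'] = state
    copied.append(line)

print(aliased)  # both entries show state 'mus'
print(copied)   # entries show 'aus' and 'mus' respectively
```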
https://stackoverflow.com/questions/58795302