
XML to CSV in Ruby

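Stack Overflow user
Asked 2019-01-15 01:39:02
Answers: 1 · Views: 174 · Followers: 0 · Votes: 1

I have multiple sample XML files that I want to convert to CSV, but the different XML files have many different attributes/nodes, so I don't want to hard-code them. I want the output to show the column headers as the first row, with each node/record listed vertically below, like a traditional row-and-column spreadsheet. Here is a sample XML: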

<?xml version="1.0" encoding="UTF-8"?>
<sd:root xmlns:wd="urn:com.sample/bsvc" sd:version="v31.0">
  <sd:Put_Job_Profile_Request sd:Add_Only="0">
    <sd:Job_Profile_Data>
      <sd:Job_Code>30000</sd:Job_Code>
      <sd:Effective_Date>1900-01-01</sd:Effective_Date>
      <sd:Job_Profile_Basic_Data>
        <sd:Job_Title>Chief Executive Officer</sd:Job_Title>
      </sd:Job_Profile_Basic_Data>
    </sd:Job_Profile_Data>
  </sd:Put_Job_Profile_Request>
  <sd:Put_Job_Profile_Request sd:Add_Only="0">
    <sd:Job_Profile_Data>
      <sd:Job_Code>30100</sd:Job_Code>
      <sd:Effective_Date>1900-01-01</sd:Effective_Date>
      <sd:Job_Profile_Basic_Data>
        <sd:Job_Title>Administrator Job Profile</sd:Job_Title>
      </sd:Job_Profile_Basic_Data>
    </sd:Job_Profile_Data>
  </sd:Put_Job_Profile_Request>
  <sd:Put_Job_Profile_Request sd:Add_Only="0">
    <sd:Job_Profile_Data>
      <sd:Job_Code>30200</sd:Job_Code>
      <sd:Effective_Date>1900-01-01</sd:Effective_Date>
      <sd:Job_Profile_Basic_Data>
        <sd:Inactive>0</sd:Inactive>
        <sd:Job_Title>Facilities &amp; Grounds Maintenance Attendant</sd:Job_Title>
        <sd:Include_Job_Code_in_Name>0</sd:Include_Job_Code_in_Name>
        <sd:Job_Profile_Private_Title>Maintenance Job Title</sd:Job_Profile_Private_Title>
        <sd:Job_Profile_Summary>Maintain cleanliness of the campus building throughout the day and fulfill special requests as needed.</sd:Job_Profile_Summary>
        <sd:Job_Description>&lt;p>Job Description&lt;b> rich text!&lt;/b>&lt;/p></sd:Job_Description>
        <sd:Additional_Job_Description>&lt;p>&lt;b>&lt;i>&lt;span class="emphasis-2">&lt;u>Additional&lt;/u>&lt;/span>&lt;/i>&lt;/b> Job Description&lt;b> rich text!&lt;/b>&lt;/p></sd:Additional_Job_Description>
        <sd:Work_Shift_Required>0</sd:Work_Shift_Required>
        <sd:Public_Job>1</sd:Public_Job>    
      </sd:Job_Profile_Basic_Data>
    </sd:Job_Profile_Data>
  </sd:Put_Job_Profile_Request>
  <sd:Put_Job_Profile_Request sd:Add_Only="0">
    <sd:Job_Profile_Data>
      <sd:Job_Code>30300</sd:Job_Code>
      <sd:Effective_Date>1900-01-01</sd:Effective_Date>
      <sd:Job_Profile_Basic_Data>
        <sd:Inactive>0</sd:Inactive>
        <sd:Job_Title>Sample_Job_Title</sd:Job_Title>
        <sd:Include_Job_Code_in_Name>0</sd:Include_Job_Code_in_Name>
        <sd:Job_Profile_Summary>Sample Job Profile Summary</sd:Job_Profile_Summary>
        <sd:Job_Description>Sample Job Description</sd:Job_Description>
        <sd:Additional_Job_Description>Sample Additional Job Description</sd:Additional_Job_Description>
        <sd:Work_Shift_Required>1</sd:Work_Shift_Required>
      </sd:Job_Profile_Basic_Data>
    </sd:Job_Profile_Data>
  </sd:Put_Job_Profile_Request>
</sd:root>

The code I'm using gives the wrong result:

require 'csv'
require 'nokogiri'

file = File.read('jobProfile.xml')
doc = Nokogiri::XML(file)
a = []

CSV.open('xmloutput.csv', 'wb') do |csv|
  csv << doc.at('.').search('*').map(&:name)
  doc.search('.').each do |x|
    csv << x.search('*').map(&:text)
  end
end

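The column headers and data for each set of records are laid out horizontally. I want to iterate over the data and keep a single row of column headers. I'm not sure how to do this without hard-coding every attribute :/ Please help, as I'm still new to programming and have spent a week trying to find a solution :(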

[Image: screenshot showing the CSV output]


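1 Answer

Stack Overflow user

Accepted answer

Answered 2019-01-15 05:08:48

You first need to build an array of hashes and extract the keys as the header, then put the values in the correct columns. All nodes are flattened into columns, ignoring the root key and the record keys.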
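
The header-union idea can be sketched on in-memory data first. This is only a minimal illustration, assuming each record is a plain hash holding just the fields that record had; the `records` literal and the `headers` and `rows` names are made up for this example and are not part of the XML parsing code.

```ruby
# Illustrative records (hypothetical data): each hash holds only the
# fields that were present for that record.
records = [
  { Job_Code: '30000', Job_Title: 'Chief Executive Officer' },
  { Job_Code: '30200', Job_Title: 'Maintenance Attendant', Inactive: '0' }
]

# The union of all keys, in first-seen order, becomes the single header row.
headers = records.flat_map(&:keys).uniq

# values_at returns nil for a missing key; to_s turns that into an empty cell,
# so every row has one cell per header.
rows = records.map { |record| record.values_at(*headers).map(&:to_s) }

puts headers.join(',')
rows.each { |row| puts row.join(',') }
```

Because the headers are computed from the data, a field that appears in only one record still gets its own column, and records without it get an empty cell.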

Something like this:

require 'nokogiri'
require 'set'

file    = File.read('jobProfile.xml')
doc     = Nokogiri::XML(file)
record  = {}
keys    = Set.new
records = []
csv     = ""

doc.traverse do |node| 
  value = node.text.gsub(/\n +/, '')
  if node.name
    if node.name != "text" # skip these nodes
      if value.length > 0 # skip empty nodes
        key = node.name.gsub(/sd:/,'').to_sym
        # if a new and not empty record, add to our records collection
        if key == :Job_Profile_Data && !record.empty?
          records << record
          record = {}
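        elsif key[/Job_Profile|^root$|^document$/]
          # skip the root, document and container keys; note that this pattern
          # also skips the leaf fields Job_Profile_Private_Title and
          # Job_Profile_Summary, so they do not appear in the CSV
        else
          # in case our value is HTML instead of plain text
          record[key] = Nokogiri::HTML.parse(value).text
          # add to our key set (a Set ignores keys it already contains)
          keys << key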
        end
      end
    end
  end
end

# build our csv
File.open('./xmloutput.csv', 'w') do |file|
  file.puts %Q{"#{keys.to_a.join('","')}"}
  records.each do |record|
    keys.each do |key|
      file.write %Q{"#{record[key]}",}
    end
    file.write "\n"
  end
end

This gives the following in our CSV file:

"Job_Code","Effective_Date","Job_Title","Inactive","Include_Job_Code_in_Name","Job_Description","Additional_Job_Description","Work_Shift_Required","Public_Job"
"30000","1900-01-01","Chief Executive Officer","","","","","","",
"30100","1900-01-01","Administrator Job Profile","","","","","","",
"30200","1900-01-01","Facilities & Grounds Maintenance Attendant","0","0","Job Description rich text!","Additional Job Description rich text!","0","1",
"30300","1900-01-01","Sample_Job_Title","0","0","Sample Job Description","Sample Additional Job Description","1","",
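
The hand-written writer leaves a trailing comma on every data row, as the output shows, and it would not escape a double quote inside a value. One possible alternative, sketched here under the assumption that `keys` and `records` have the same shape as in the answer, is to let Ruby's standard `csv` library do the quoting. The sample data is hypothetical and the `csv_text` name is only for illustration.

```ruby
require 'csv'

# Hypothetical stand-ins for the keys and records built while traversing the XML.
keys = %i[Job_Code Job_Title Inactive]
records = [
  { Job_Code: '30000', Job_Title: 'Chief Executive Officer' },
  { Job_Code: '30200', Job_Title: 'Attendant "night shift"', Inactive: '0' }
]

# force_quotes: true quotes every field, matching the style of the output above.
csv_text = CSV.generate(force_quotes: true) do |csv|
  csv << keys.map(&:to_s)
  records.each do |record|
    # Missing keys come back as nil; to_s turns them into empty strings.
    csv << record.values_at(*keys).map(&:to_s)
  end
end

puts csv_text
```

The library doubles embedded double quotes, which is the usual CSV escaping, and ends each row without a trailing comma. Writing to a file instead would use `CSV.open('xmloutput.csv', 'w', force_quotes: true)` with the same block body.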

Votes: 1

Original page content provided by Stack Overflow.
Original link: https://stackoverflow.com/questions/54186558
