XML Combine Documents

Stack Overflow user

Asked 2016-03-02 03:29:54

1 answer, 1.6K views, 0 followers, 2 votes

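I am developing a Boomi interface and need to merge several individual XML documents into a single output document. The standard Combine Documents step does not work properly for this.

All of the XML documents have the same structure.

First document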

<?xml version='1.0' encoding='UTF-8'?>
<EMPLEADOS>
  <EMPLEADO TIPO="A" NUMERO="123">
    <PROCESO PERIODO="201603" TT="MN" PAC="9999" />
    <SECCION ID="ETACIV">
      <CAMPO ID="ETA_ETCNOM" SEC=" " FECHA=" ">abc</CAMPO>
    </SECCION>
  </EMPLEADO>
</EMPLEADOS>

Second document

<?xml version='1.0' encoding='UTF-8'?>
<EMPLEADOS>
  <EMPLEADO TIPO="A" NUMERO="123">
    <PROCESO PERIODO="201603" TT="MN" PAC="9999" />
    <SECCION ID="SADMIN ">
      <CAMPO ID="SAD_SADESO" SEC=" " FECHA="01/03/2015">01/03/2015</CAMPO>
    </SECCION>
  </EMPLEADO>
</EMPLEADOS>

Third document

<?xml version='1.0' encoding='UTF-8'?>
<EMPLEADOS>
  <EMPLEADO TIPO="A" NUMERO="123">
    <PROCESO PERIODO="201603" TT="MN" PAC="9999" />
    <SECCION ID="SADMIN ">
      <CAMPO ID="SAD_SADESO" SEC=" " FECHA="01/06/2015">01/06/2015</CAMPO>
    </SECCION>
  </EMPLEADO>
</EMPLEADOS>

Expected output

<?xml version='1.0' encoding='UTF-8'?>
<EMPLEADOS>
  <EMPLEADO TIPO="A" NUMERO="123">
    <PROCESO PERIODO="201603" TT="MN" PAC="9999" />
     <SECCION ID="ETACIV">
     <CAMPO ID="ETA_ETCNOM" SEC=" " FECHA=" ">abc</CAMPO>
     </SECCION>
     <SECCION ID="SADMIN ">
     <CAMPO ID="SAD_SADESO" SEC=" " FECHA="01/03/2015">01/03/2015</CAMPO>
     <CAMPO ID="SAD_SADESO" SEC=" " FECHA="01/06/2015">01/06/2015</CAMPO>
     </SECCION>
   </EMPLEADO>
</EMPLEADOS>

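Is merging on elements the same as merging on attributes? Technically, I need to merge all of the documents on the ID attribute of CAMPO.

Any help is greatly appreciated.

Thanks, Nag

I tried the code below, but it fails with a "Premature end of file." error.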

import java.util.Properties;
import java.io.InputStream;
import org.jdom.input.SAXBuilder;
import org.jdom.Document;
import org.jdom.Element;
import org.jdom.xpath.XPath;
import org.jdom.output.XMLOutputter;
import groovy.util.slurpersupport.GPathResult;
import groovy.xml.StreamingMarkupBuilder;


for( int i = 0; i < dataContext.getDataCount(); i++ ) {
  InputStream is = dataContext.getStream(i);
  Properties props = dataContext.getProperties(i);

def xs = new XmlSlurper()
def employee = xs.parse(is);
String Encod = "UTF-8" ;
HashMap<String, GPathResult> CampoMap = new HashMap<String, GPathResult>()

employee.EMPLEADOS.EMPLEADO.PROCESO.SECCION.CAMPO.each {
    CampoMap["${it.@ID}"] = it
}

new StreamingMarkupBuilder().bind {
     mkp.xmlDeclaration(["version":"1.0", "encoding":"UTF-8"]);
    EMPLEADOS {
        EMPLEADO.PROCESO.SECCION.each {
            if (CampoMap["${it.@ID}"] != null) {
                it.appendNode(CampoMap["${it.@id}"].sites)
            }
            out << it
        }
    }
} .writeTo(is.newWriter(Encod))
}
  dataContext.storeStream(is, props);

The new code is:

import groovy.util.XmlParser
import groovy.xml.MarkupBuilder

def parser = new XmlParser()
def writer = new StringWriter()
def builder = new MarkupBuilder(writer)

for( int i = 0; i < dataContext.getDataCount(); i++ ) {
  InputStream is = dataContext.getStream(i);
  Properties props = dataContext.getProperties(i);

def mergedDocument = (0..<dataContext.dataCount)
    .collect { XmlParser.parse(dataContext.getStream(it)) }
    .inject { nodeA, nodeB -> merge(nodeA, nodeB) }


builder.mkp.xmlDeclaration(version:'1.0', encoding:'UTF-8')
builder.EMPLEADOS {
    doc1.EMPLEADO.each { empleado ->
        EMPLEADO(empleado.attributes()) {
           empleado.PROCESO.each { proceso -> 
               PROCESO(proceso.attributes()) 
           }

           empleado.SECCION.each { seccion ->
               SECCION(seccion.attributes()) {
                   seccion.CAMPO.each { campo ->
                       CAMPO(campo.attributes(), campo.value().head())
                   }
               }
           }            
        }
    }
}
is = mergedDocument ;
}


/*
 * Category to simplify XML node comparisons.
 * Basically, two Nodes are equal if their attributes are the same.
 */
// class NodeCategory {
//    static boolean equals(Node me, Node other) {
//        me.attributes() == other.attributes()
//    }

//    static boolean isCase(List<Node> nodes, Node other) {
//      nodes.find { it == other } != null
//    }
//}

/*
 * Merges document b into document a.
 * WARNING: This method is destructive; it modifies document a
 * @Returns a, for convenience
 */
def merge(a, b) {
//    use(NodeCategory) {
        b.EMPLEADO.each { empleado ->
            def existingEmpleado = a.EMPLEADO.find { 
                it == empleado
            }

            if(existingEmpleado) {
                // Empleado already exists, must merge differences.

                // Add any missing PROCESO nodes.
                empleado.PROCESO
                   .findAll { !(it in existingEmpleado.PROCESO) }
                   .with {
                       delegate.each { existingEmpleado.append(it) }
                   }

                // Add any missing SECCION nodes.
                empleado.SECCION
                   .findAll { !(it in existingEmpleado.SECCION) }
                   .with {
                       delegate.each { existingEmpleado.append(it) }
                   }

                // Add any missing CAMPO nodes.
                empleado.SECCION.each { seccion ->
                    existingEmpleado.SECCION
                        .find { it == seccion }
                        .with {
                            seccion.CAMPO
                                .findAll { !(it in delegate.CAMPO) }
                                .each { delegate.append(it) }
                        }
                }
            } else {
                // Empleado does not exist, go ahead and add it as-is.
                a.append(empleado)
            }
        }    
 //   }

    return a
}

groovy

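1 Answer

Stack Overflow user

Answered 2016-03-02 22:59:24

First, I should mention that there is no generic way to combine XML documents, because the merge process is contextual. How XML nodes should be merged depends on what the nodes mean. A computer cannot work out the meaning of your data, so you, as the programmer, have to provide the instructions. That said, here is how to merge your documents.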

import groovy.util.XmlParser
import groovy.xml.MarkupBuilder

def parser = new XmlParser()
def writer = new StringWriter()
def builder = new MarkupBuilder(writer)

def doc1 = parser.parseText('''<?xml version='1.0' encoding='UTF-8'?>
<EMPLEADOS>
  <EMPLEADO TIPO="A" NUMERO="123">
    <PROCESO PERIODO="201603" TT="MN" PAC="9999" />
    <SECCION ID="ETACIV">
      <CAMPO ID="ETA_ETCNOM" SEC=" " FECHA=" ">abc</CAMPO>
    </SECCION>
  </EMPLEADO> 
</EMPLEADOS>''')

def doc2 = parser.parseText('''<?xml version='1.0' encoding='UTF-8'?>
<EMPLEADOS>
  <EMPLEADO TIPO="A" NUMERO="123">
    <PROCESO PERIODO="201603" TT="MN" PAC="9999" />
    <SECCION ID="SADMIN ">
      <CAMPO ID="SAD_SADESO" SEC=" " FECHA="01/03/2015">01/03/2015</CAMPO>
    </SECCION>
  </EMPLEADO>
</EMPLEADOS>''')

def doc3 = parser.parseText('''<?xml version='1.0' encoding='UTF-8'?>
<EMPLEADOS>
  <EMPLEADO TIPO="A" NUMERO="123">
    <PROCESO PERIODO="201603" TT="MN" PAC="9999" />
    <SECCION ID="SADMIN ">
      <CAMPO ID="SAD_SADESO" SEC=" " FECHA="01/06/2015">01/06/2015</CAMPO>
    </SECCION>
  </EMPLEADO>
</EMPLEADOS>''')


merge(doc1, doc2)
merge(doc1, doc3)

builder.mkp.xmlDeclaration(version:'1.0', encoding:'UTF-8')
builder.EMPLEADOS {
    doc1.EMPLEADO.each { empleado ->
        EMPLEADO(empleado.attributes()) {
           empleado.PROCESO.each { proceso -> 
               PROCESO(proceso.attributes()) 
           }

           empleado.SECCION.each { seccion ->
               SECCION(seccion.attributes()) {
                   seccion.CAMPO.each { campo ->
                       CAMPO(campo.attributes(), campo.value().head())
                   }
               }
           }            
        }
    }
}

println writer

/*
 * Category to simplify XML node comparisons.
 * Basically, two Nodes are equal if their attributes are the same.
 */
class NodeCategory {
    static boolean equals(Node me, Node other) {
        me.attributes() == other.attributes()
    }

    static boolean isCase(List<Node> nodes, Node other) {
        nodes.find { it == other } != null
    }
}

/*
 * Merges document b into document a.
 * WARNING: This method is destructive; it modifies document a
 * @Returns a, for convenience
 */
def merge(a, b) {
    use(NodeCategory) {
        b.EMPLEADO.each { empleado ->
            def existingEmpleado = a.EMPLEADO.find { 
                it == empleado
            }

            if(existingEmpleado) {
                // Empleado already exists, must merge differences.

                // Add any missing PROCESO nodes.
                empleado.PROCESO
                   .findAll { !(it in existingEmpleado.PROCESO) }
                   .with {
                       delegate.each { existingEmpleado.append(it) }
                   }

                // Add any missing SECCION nodes.
                empleado.SECCION
                   .findAll { !(it in existingEmpleado.SECCION) }
                   .with {
                       delegate.each { existingEmpleado.append(it) }
                   }

                // Add any missing CAMPO nodes.
                empleado.SECCION.each { seccion ->
                    existingEmpleado.SECCION
                        .find { it == seccion }
                        .with {
                            seccion.CAMPO
                                .findAll { !(it in delegate.CAMPO) }
                                .each { delegate.append(it) }
                        }
                }
            } else {
                // Empleado does not exist, go ahead and add it as-is.
                a.append(empleado)
            }
        }    
    }

    return a
}

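The process works like this:

The merge(Node a, Node b) method walks through the nodes and handles each case so that a ends up as the combination of both documents (node trees). It is based on determining whether a node in b is already present in a. If it is not, the node is added as-is. Otherwise, a is modified to merge in the differences. Yes, this method is ugly and was a real pain to write. Please, for the sake of sound programming, refactor the beast.
Finally, a MarkupBuilder is used to process the final node tree and produce a serialized XML document.

You may notice that a Groovy category is involved. It is used to simplify Node comparisons.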
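For comparison, the same "is this node of b already in a?" rule can be written without a category. The following is only a sketch under stated assumptions, not the method from the answer above: mergeByAttributes is a hypothetical helper, and it treats two elements as the same when their name and all of their attributes match, at every level of the tree. The import matches the one used above; on Groovy 4 the class lives in groovy.xml instead.

```groovy
import groovy.util.XmlParser

// Hypothetical helper: merges the child elements of `source` into `target`.
// Assumption: two elements are "the same" when name and attributes are equal.
// Like merge() above, this is destructive and modifies `target`.
def mergeByAttributes(Node target, Node source) {
    source.children().each { child ->
        if (!(child instanceof Node)) return      // skip text content
        def match = target.children().find {
            it instanceof Node &&
                it.name() == child.name() &&
                it.attributes() == child.attributes()
        }
        if (match) {
            mergeByAttributes(match, child)       // same element: merge its children
        } else {
            target.append(child)                  // unknown element: add the whole subtree
        }
    }
    target
}

def parser = new XmlParser()
def doc1 = parser.parseText('''<EMPLEADOS><EMPLEADO TIPO="A" NUMERO="123">
  <PROCESO PERIODO="201603" TT="MN" PAC="9999" />
  <SECCION ID="ETACIV"><CAMPO ID="ETA_ETCNOM" SEC=" " FECHA=" ">abc</CAMPO></SECCION>
</EMPLEADO></EMPLEADOS>''')
def doc2 = parser.parseText('''<EMPLEADOS><EMPLEADO TIPO="A" NUMERO="123">
  <PROCESO PERIODO="201603" TT="MN" PAC="9999" />
  <SECCION ID="SADMIN "><CAMPO ID="SAD_SADESO" SEC=" " FECHA="01/03/2015">01/03/2015</CAMPO></SECCION>
</EMPLEADO></EMPLEADOS>''')
def doc3 = parser.parseText('''<EMPLEADOS><EMPLEADO TIPO="A" NUMERO="123">
  <PROCESO PERIODO="201603" TT="MN" PAC="9999" />
  <SECCION ID="SADMIN "><CAMPO ID="SAD_SADESO" SEC=" " FECHA="01/06/2015">01/06/2015</CAMPO></SECCION>
</EMPLEADO></EMPLEADOS>''')

// Fold the second and third documents into the first one.
[doc2, doc3].each { mergeByAttributes(doc1, it) }
```

Because the comparison includes every attribute, two CAMPO elements with the same ID but different FECHA values are kept as separate entries, which is what the expected output in the question shows. If the data needs a different notion of identity, the find condition is the place to change it.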

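Addendum - using input streams

You can invoke the same process using InputStreams as the source of the XML documents. It would look something like this: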

def parser = new XmlParser()
def mergedDocument = (0..<dataContext.dataCount)
    .collect { parser.parse(dataContext.getStream(it)) }
    .inject { nodeA, nodeB -> merge(nodeA, nodeB) }

You can then process mergedDocument with the MarkupBuilder.

2 votes
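If rebuilding the tree element by element is not needed, one possible shortcut is to serialize the merged Node directly with groovy.xml.XmlUtil and wrap the result in a stream. This is a sketch only: toStream is a hypothetical helper, and the small parsed document stands in for mergedDocument.

```groovy
import groovy.util.XmlParser
import groovy.xml.XmlUtil

// Hypothetical helper: turns a merged node tree into a UTF-8 encoded stream.
InputStream toStream(Node merged) {
    String xml = XmlUtil.serialize(merged)    // output starts with an XML declaration
    new ByteArrayInputStream(xml.getBytes('UTF-8'))
}

// Stand-in for the mergedDocument produced by the inject call above.
def merged = new XmlParser().parseText(
    '<EMPLEADOS><EMPLEADO TIPO="A" NUMERO="123"/></EMPLEADOS>')

InputStream result = toStream(merged)
// Assumption: in a Boomi script this stream would be handed back once, after the
// merge, with dataContext.storeStream(result, props), as the question attempts.
```

Which Properties object to pass for the combined document is left open here, since the original thread does not say.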

Original page content provided by Stack Overflow.

Original link:

https://stackoverflow.com/questions/35737966
