Is it possible to load complex nested XML like the following into a Hive table?
<items>
<item id="0001" type="donut">
<name>Cake</name>
<ppu>0.55</ppu>
<batters>
<batter id="1001">Regular</batter>
<batter id="1002">Chocolate</batter>
<batter id="1003">Blueberry</batter>
<batter id="1003">Devil's Food</batter>
</batters>
<topping id="5001">None</topping>
<topping id="5002">Glazed</topping>
<topping id="5005">Sugar</topping>
<topping id="5007">Powdered Sugar</topping>
<topping id="5006">Chocolate with Sprinkles</topping>
<topping id="5003">Chocolate</topping>
<topping id="5004">Maple</topping>
</item>
<item id="0002" type="donut">
<name>Raised</name>
<ppu>0.55</ppu>
<batters>
<batter id="1001">Regular</batter>
</batters>
<topping id="5001">None</topping>
<topping id="5002">Glazed</topping>
<topping id="5005">Sugar</topping>
<topping id="5003">Chocolate</topping>
<topping id="5004">Maple</topping>
</item>
<item id="0003" type="donut">
<name>Buttermilk</name>
<ppu>0.55</ppu>
<batters>
<batter id="1001">Regular</batter>
<batter id="1002">Chocolate</batter>
</batters>
</item>
<item id="0004" type="bar">
<name>Bar</name>
<ppu>0.75</ppu>
<batters>
<batter id="1001">Regular</batter>
</batters>
<topping id="5003">Chocolate</topping>
<topping id="5004">Maple</topping>
<fillings>
<filling id="7001">
<name>None</name>
<addcost>0</addcost>
</filling>
<filling id="7002">
<name>Custard</name>
<addcost>0.25</addcost>
</filling>
<filling id="7003">
<name>Whipped Cream</name>
<addcost>0.25</addcost>
</filling>
</fillings>
</item>
<item id="0005" type="twist">
<name>Twist</name>
<ppu>0.65</ppu>
<batters>
<batter id="1001">Regular</batter>
</batters>
<topping id="5002">Glazed</topping>
<topping id="5005">Sugar</topping>
</item>
<item id="0006" type="filled">
<name>Filled</name>
<ppu>0.75</ppu>
<batters>
<batter id="1001">Regular</batter>
</batters>
<topping id="5002">Glazed</topping>
<topping id="5007">Powdered Sugar</topping>
<topping id="5003">Chocolate</topping>
<topping id="5004">Maple</topping>
<fillings>
<filling id="7002">
<name>Custard</name>
<addcost>0</addcost>
</filling>
<filling id="7003">
<name>Whipped Cream</name>
<addcost>0</addcost>
</filling>
<filling id="7004">
<name>Strawberry Jelly</name>
<addcost>0</addcost>
</filling>
<filling id="7005">
<name>Rasberry Jelly</name>
<addcost>0</addcost>
</filling>
</fillings>
</item>
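</items>
I have been able to extract the ids 1001, 1002, 1003, but I cannot extract the values that go with them. I loaded the XML into a Hive table and am using xpath to extract from it. I need Regular, Chocolate, Blueberry.
I loaded the XML above into a Hive table (store.chocolate) and query it as shown below:
select xpath(string, '/items/item/batters/batter/@id') from store.chocolate
This gives the values 1001, 1002, 1003. How do I write a query that extracts Regular, Chocolate and Blueberry?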
Posted on 2018-02-19 23:24:51
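As in your query, load the XML data into a single table and create a view on top of it. The query is structured as follows:
select xpath(str, '/items/item/batters/batter[@id="1001"]/text()')
This extracts the value 'Regular'. Note that xpath returns an array containing every matching node. The same pattern can be used for the other fields.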
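To get all the batter names at once instead of one id at a time, the predicate can be moved from the batter to the item. The following is only a sketch: it assumes the whole XML document sits in one row of store.chocolate, in a string column named str (the column name used in the answer; the question calls it string), and it uses only the built-in Hive UDFs xpath and xpath_string.

```sql
-- Assumption: store.chocolate holds the full XML document in a single
-- string column named str. Adjust the column name to match your table.

-- All batter names of one item, returned as array<string>.
SELECT xpath(str, '/items/item[@id="0001"]/batters/batter/text()') AS batter_names
FROM store.chocolate;

-- A single batter name as a plain string rather than an array.
-- xpath_string returns the text of the first node that matches.
SELECT xpath_string(str, '/items/item[@id="0001"]/batters/batter[@id="1002"]') AS batter_name
FROM store.chocolate;
```

With the sample data above, the first query should return Regular, Chocolate, Blueberry and Devil's Food for item 0001, and the second should return Chocolate. Dropping the [@id="0001"] predicate returns the batter names of every item in one array, duplicates included.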
https://stackoverflow.com/questions/48826485