I have the following nested JSON file, which I want to parse with jq and print as a table, as shown at the end.
The structure of input.json is as follows:
{
"document":{
"page":[
{
"@index":"0",
"image":{
"@data":"ABC",
"@format":"png",
"@height":"620.00",
"@type":"base64encoded",
"@width":"450.00",
"@x":"85.00",
"@y":"85.00"
}
},
{
"@index":"1",
"row":[
{
"column":[
{
"text":""
},
{
"text":{
"#text":"Text1",
"@fontName":"Arial",
"@fontSize":"12.0",
"@height":"12.00",
"@width":"71.04",
"@x":"121.10",
"@y":"83.42"
}
}
]
},
{
"column":[
{
"text":""
},
{
"text":{
"#text":"Text2",
"@fontName":"Arial",
"@fontSize":"12.0",
"@height":"12.00",
"@width":"101.07",
"@x":"121.10",
"@y":"124.82"
}
}
]
}
]
},
{
"@index":"2",
"row":[
{
"column":{
"text":{
"#text":"Text3",
"@fontName":"Arial",
"@fontSize":"12.0",
"@height":"12.00",
"@width":"363.44",
"@x":"85.10",
"@y":"69.62"
}
}
},
{
"column":{
"text":{
"#text":"Text4",
"@fontName":"Arial",
"@fontSize":"12.0",
"@height":"12.00",
"@width":"382.36",
"@x":"85.10",
"@y":"83.42"
}
}
},
{
"column":{
"text":{
"#text":"Text5",
"@fontName":"Arial",
"@fontSize":"12.0",
"@height":"12.00",
"@width":"435.05",
"@x":"85.10",
"@y":"97.22"
}
}
}
]
},
{
"@index":"3"
}
]
}
}
Following the answer to this question (Parsing nested json with jq), I tried this code, but it doesn't work:
$ cat file.json | jq .document.page[].row | ["#text", "@x", "@y"] | @csv
The output I would like to get is:
#text @x @y
Text1 121.10 83.42
Text2 121.10 124.82
Text3 85.10 69.62
Text4 85.10 83.42
Text5 85.10 97.22
How can I do this?
Thanks
Update
Thanks a lot for the help. I have now tried it on a real, longer file.
I was able to adapt peak's first solution as follows:
["#text", "@data", "@fontName", "@fontSize", "@format", "@height", "@type", "@width", "@x", "@y"],
( ..
| objects
| select(has("#text","@data"))
| [.["#text", "@data", "@fontName", "@fontSize", "@format", "@height", "@type", "@width", "@x", "@y"]]
)
| @tsv
With the new input, I get this table:
+---------------+-------+-----------+-----------+---------+---------+---------------+--------+--------+--------+
| #text         | @data | @fontName | @fontSize | @format | @height | @type         | @width | @x     | @y     |
+---------------+-------+-----------+-----------+---------+---------+---------------+--------+--------+--------+
|               | ABC   |           |           | png     | 620     | base64encoded | 450    | 85     | 85     |
+---------------+-------+-----------+-----------+---------+---------+---------------+--------+--------+--------+
| Text ä 1      |       | Tahoma    | 12        |         | 12      |               | 427.79 | 85.1   | 69.62  |
+---------------+-------+-----------+-----------+---------+---------+---------------+--------+--------+--------+
| Text ¢76      |       | Tahoma    | 12        |         | 12      |               | 270.5  | 85.1   | 690.72 |
+---------------+-------+-----------+-----------+---------+---------+---------------+--------+--------+--------+
| Text % 5      |       | Tahoma    | 12        |         | 12      |               | 130.84 | 358.86 | 690.72 |
+---------------+-------+-----------+-----------+---------+---------+---------------+--------+--------+--------+
| Text 7Ç8      |       | Tahoma    | 12        |         | 12      |               | 115.95 | 85.1   | 704.52 |
+---------------+-------+-----------+-----------+---------+---------+---------------+--------+--------+--------+
| Text • 2 Wñ79 |       | Tahoma    | 8         |         | 8.04    |               | 398.16 | 121.1  | 68.06  |
+---------------+-------+-----------+-----------+---------+---------+---------------+--------+--------+--------+
| Text          |       | Tahoma    | 12        |         | 12      |               | 101.5  | 85.1   | 83.42  |
| » 1 A\\\\CÓ   |       |           |           |         |         |               |        |        |        |
+---------------+-------+-----------+-----------+---------+---------+---------------+--------+--------+--------+
| Text 12       |       | Tahoma    | 12        |         | 12      |               | 312.26 | 189.83 | 83.42  |
+---------------+-------+-----------+-----------+---------+---------+---------------+--------+--------+--------+
| Text 82       |       | Tahoma    | 12        |         | 12      |               | 44.99  | 85.1   | 97.22  |
+---------------+-------+-----------+-----------+---------+---------+---------------+--------+--------+--------+
| Text 31       |       | Tahoma    | 8         |         | 8.04    |               | 381.83 | 133.1  | 95.66  |
+---------------+-------+-----------+-----------+---------+---------+---------------+--------+--------+--------+
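If possible, how can I add the following three columns (counter, page and row) so that I know which page and row each line belongs to?
The expected output would look like this: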
+---------+------+-----+-------------------+-------+-----------+-----------+---------+---------+---------------+--------+--------+--------+
| counter | page | row | #text             | @data | @fontName | @fontSize | @format | @height | @type         | @width | @x     | @y     |
+---------+------+-----+-------------------+-------+-----------+-----------+---------+---------+---------------+--------+--------+--------+
| 1       | 0    |     |                   | ABC   |           |           | png     | 620     | base64encoded | 450    | 85     | 85     |
+---------+------+-----+-------------------+-------+-----------+-----------+---------+---------+---------------+--------+--------+--------+
| 2       | 1    | 0   | Text ä 1          |       | Tahoma    | 12        |         | 12      |               | 427.79 | 85.1   | 69.62  |
+---------+------+-----+-------------------+-------+-----------+-----------+---------+---------+---------------+--------+--------+--------+
| 3       | 1    | 1   | Text ¢76          |       | Tahoma    | 12        |         | 12      |               | 270.5  | 85.1   | 690.72 |
+---------+------+-----+-------------------+-------+-----------+-----------+---------+---------+---------------+--------+--------+--------+
| 4       | 1    | 1   | Text % 5          |       | Tahoma    | 12        |         | 12      |               | 130.84 | 358.86 | 690.72 |
+---------+------+-----+-------------------+-------+-----------+-----------+---------+---------+---------------+--------+--------+--------+
| 5       | 2    | 2   | Text 7Ç8          |       | Tahoma    | 12        |         | 12      |               | 115.95 | 85.1   | 704.52 |
+---------+------+-----+-------------------+-------+-----------+-----------+---------+---------+---------------+--------+--------+--------+
| 6       | 2    | 0   | Text • 2 Wñ79     |       | Tahoma    | 8         |         | 8.04    |               | 398.16 | 121.1  | 68.06  |
+---------+------+-----+-------------------+-------+-----------+-----------+---------+---------+---------------+--------+--------+--------+
| 7       | 2    | 1   | Text » 1 A\\\\CÓ  |       | Tahoma    | 12        |         | 12      |               | 101.5  | 85.1   | 83.42  |
+---------+------+-----+-------------------+-------+-----------+-----------+---------+---------+---------------+--------+--------+--------+
| 8       | 2    | 1   | Text 12           |       | Tahoma    | 12        |         | 12      |               | 312.26 | 189.83 | 83.42  |
+---------+------+-----+-------------------+-------+-----------+-----------+---------+---------+---------------+--------+--------+--------+
| 9       | 2    | 2   | Text 82           |       | Tahoma    | 12        |         | 12      |               | 44.99  | 85.1   | 97.22  |
+---------+------+-----+-------------------+-------+-----------+-----------+---------+---------+---------------+--------+--------+--------+
| 10      | 2    | 2   | Text 31           |       | Tahoma    | 8         |         | 8.04    |               | 381.83 | 133.1  | 95.66  |
+---------+------+-----+-------------------+-------+-----------+-----------+---------+---------+---------------+--------+--------+--------+
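Here is a new, more representative input file, input2.json.
The picture of the JSON structure below shows the page and row numbers that occur in the file, together with the values they contain.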
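Answered 2019-05-31 06:56:14
Handling input2.json
The second set of requirements, which goes with input2.json, needs context-dependent information (which page and which row each value came from), so the context cannot be ignored. The solution below therefore uses a "drill-down" approach. The code is a little hard to follow unless you understand foreach, so I will just mention that it uses a state variable {counter, page, row} to keep track of the three counters.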
["counter", "page", "row", "#text", "@data", "@fontName", "@fontSize", "@format", "@height", "@type", "@width", "@x", "@y"],
(foreach (.document.page[] | objects) as $page ({page: -1, counter: 0};
    .page += 1
    | foreach ($page | .row[]?) as $row (.row=-1;
        .row += 1
        | foreach ($row | (.column | (if type == "array" then .[] else . end)) | .text | objects) as $x (.;
            .counter += 1
            | .out = [.counter, .page, .row, $x["#text", "@data", "@fontName", "@fontSize", "@format", "@height", "@type", "@width", "@x", "@y"]]
          ; . )
      ; . )
  ; .out )
)
| @tsv
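This produces the desired TSV except for the first data row (the image on page 0), which is missing because that page has no row. My answer to "Relate elements in table form from Json file with jq" shows one way to include it.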
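To make the three counters easier to follow, here is a rough Python sketch of the same bookkeeping. It is not part of the jq solution. The names `numbered_rows`, `as_list` and `to_tsv`, the reduced key list and the small `SAMPLE` document are all assumptions made for illustration, and the sketch assumes that `row` is always a list, as it is in the sample input.

```python
# Hypothetical, trimmed-down document with the same shape as the question's input.
SAMPLE = {
    "document": {
        "page": [
            {"@index": "0", "image": {"@data": "ABC", "@x": "85.00", "@y": "85.00"}},
            {"@index": "1", "row": [
                {"column": [{"text": ""},
                            {"text": {"#text": "Text1", "@x": "121.10", "@y": "83.42"}}]},
                {"column": [{"text": ""},
                            {"text": {"#text": "Text2", "@x": "121.10", "@y": "124.82"}}]},
            ]},
            {"@index": "2", "row": [
                {"column": {"text": {"#text": "Text3", "@x": "85.10", "@y": "69.62"}}},
            ]},
            {"@index": "3"},
        ]
    }
}

KEYS = ("#text", "@x", "@y")  # shortened key list, for illustration only


def as_list(value):
    """Wrap a lone value in a list, since "column" may be an object or an array."""
    return value if isinstance(value, list) else [value]


def numbered_rows(doc, keys=KEYS):
    """Yield [counter, page, row, values...] for every text object."""
    counter = 0
    pages = [p for p in doc["document"]["page"] if isinstance(p, dict)]
    for page_no, page in enumerate(pages):                  # page starts at 0
        for row_no, row in enumerate(page.get("row", [])):  # row restarts on each page
            for column in as_list(row.get("column", [])):
                text = column.get("text") if isinstance(column, dict) else None
                if isinstance(text, dict):                  # skips the "" placeholders
                    counter += 1                            # counter never restarts
                    yield [counter, page_no, row_no] + [text.get(k) for k in keys]


def to_tsv(rows):
    """Join rows with tabs, rendering missing values as empty fields."""
    return "\n".join(
        "\t".join("" if v is None else str(v) for v in row) for row in rows
    )


if __name__ == "__main__":
    header = ["counter", "page", "row", *KEYS]
    print(to_tsv([header, *numbered_rows(SAMPLE)]))
```

Like the jq program, this sketch emits nothing for the image on page 0, because it only visits objects found under `row`.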
Answered 2019-05-30 14:18:38
Here is a simple (perhaps too simple?) approach that focuses on the embedded JSON objects that have a "#text" key:
["#text", "@x", "@y"], # the header
( ..
| objects
| select(has("#text"))
| [.["#text", "@x", "@y"]] # a row
)
| @csv
Given this program and the sample input, invoking jq with the -r option produces:
"#text","@x","@y"
"Text1","121.10","83.42"
"Text2","121.10","124.82"
"Text3","85.10","69.62"
"Text4","85.10","83.42"
"Text5","85.10","97.22"
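If you don't want the quotation marks and are willing to risk output that is not strictly CSV, one option is to use join(",") instead of @csv at the end of the pipeline.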
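The risk with a plain join is easiest to see with a field that itself contains a comma. The Python sketch below is only an illustration of that point. The row value "Text, with comma" is made up, and `csv_line` and `joined_line` are hypothetical helper names, not jq features.

```python
import csv
import io


def csv_line(fields):
    """Format one row with every field quoted, similar in spirit to jq's @csv for strings."""
    buf = io.StringIO()
    csv.writer(buf, quoting=csv.QUOTE_ALL, lineterminator="\n").writerow(fields)
    return buf.getvalue().rstrip("\n")


def joined_line(fields):
    """Format one row by plain joining, like join(",") in jq."""
    return ",".join(fields)


def field_count(line):
    """Parse a line back as CSV and count the fields a reader would see."""
    return len(next(csv.reader([line])))


if __name__ == "__main__":
    row = ["Text, with comma", "121.10", "83.42"]  # made-up value for illustration
    for line in (csv_line(row), joined_line(row)):
        print(line, "->", field_count(line), "fields")
```

The quoted line still parses as three fields, while the joined line parses as four, which is the sense in which the joined output is not strictly CSV.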
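Variants
You might prefer @tsv to @csv.
If you need a stricter way of selecting the relevant embedded objects, then replacing .. with .. | .text? may be enough.
If not, further filters can be added to suit the specific requirements.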
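For readers who want to cross-check the logic of the recursive filter outside jq, here is a minimal Python sketch of the same idea: visit every nested value and keep the objects that have a "#text" key. The names `walk`, `text_rows` and `to_csv` and the small `SAMPLE` document are assumptions for illustration. Note also that Python's `csv` module with `QUOTE_ALL` quotes every field, whereas jq's @csv quotes only strings, so the two agree here only because all the values are strings.

```python
import csv
import io

# Hypothetical, trimmed-down document with the same shape as the question's input.
SAMPLE = {
    "document": {
        "page": [
            {"@index": "0", "image": {"@data": "ABC", "@x": "85.00", "@y": "85.00"}},
            {"@index": "1", "row": [
                {"column": [{"text": ""},
                            {"text": {"#text": "Text1", "@x": "121.10", "@y": "83.42"}}]},
            ]},
            {"@index": "2", "row": [
                {"column": {"text": {"#text": "Text3", "@x": "85.10", "@y": "69.62"}}},
            ]},
            {"@index": "3"},
        ]
    }
}


def walk(node):
    """Yield node and then every value nested inside it, roughly like jq's `..`."""
    yield node
    if isinstance(node, dict):
        for value in node.values():
            yield from walk(value)
    elif isinstance(node, list):
        for item in node:
            yield from walk(item)


def text_rows(doc, keys=("#text", "@x", "@y")):
    """Return a header row followed by one row per object that has a "#text" key."""
    rows = [list(keys)]
    for node in walk(doc):
        if isinstance(node, dict) and "#text" in node:
            rows.append([node.get(key) for key in keys])
    return rows


def to_csv(rows):
    """Render rows as CSV text with every field quoted."""
    buf = io.StringIO()
    csv.writer(buf, quoting=csv.QUOTE_ALL, lineterminator="\n").writerows(rows)
    return buf.getvalue()


if __name__ == "__main__":
    print(to_csv(text_rows(SAMPLE)), end="")
```

Because the walk ignores where an object sits in the document, the image object is skipped simply because it has no "#text" key, not because of its position.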
Answered 2019-05-30 14:33:16
Here is a solution that "drills down" and is therefore rather tedious:
["#text", "@x", "@y"],
( .document.page[]
| .row[]?
| .column
| (if type == "array" then .[] else . end)
| .text
| objects
| [.["#text", "@x", "@y"]]
)
| @tsv
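This is meant to be run with the -r command-line option.
I used @tsv because its output resembles the expected output given in the question. As mentioned elsewhere on this page, there are other options, such as using join/1.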
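The only non-obvious step in the drill-down is that "column" is sometimes an array and sometimes a single object. The Python sketch below mirrors the path taken by the filter and normalises that one field. The names `drill_down`, `as_list` and `to_tsv` and the small `SAMPLE` document are assumptions for illustration, and the sketch assumes that `row` is always a list, as in the sample input.

```python
# Hypothetical, trimmed-down document with the same shape as the question's input.
SAMPLE = {
    "document": {
        "page": [
            {"@index": "0", "image": {"@data": "ABC", "@x": "85.00", "@y": "85.00"}},
            {"@index": "1", "row": [
                {"column": [{"text": ""},
                            {"text": {"#text": "Text1", "@x": "121.10", "@y": "83.42"}}]},
            ]},
            {"@index": "2", "row": [
                {"column": {"text": {"#text": "Text3", "@x": "85.10", "@y": "69.62"}}},
            ]},
            {"@index": "3"},
        ]
    }
}


def as_list(value):
    """Mirror `if type == "array" then .[] else . end`: treat a lone object as a one-item list."""
    return value if isinstance(value, list) else [value]


def drill_down(doc, keys=("#text", "@x", "@y")):
    """Follow document -> page -> row -> column -> text and collect the requested keys."""
    rows = [list(keys)]
    for page in doc["document"]["page"]:
        for row in page.get("row", []):          # pages without "row" contribute nothing
            for column in as_list(row.get("column", [])):
                text = column.get("text") if isinstance(column, dict) else None
                if isinstance(text, dict):       # like `objects`: drops the "" placeholders
                    rows.append([text.get(key) for key in keys])
    return rows


def to_tsv(rows):
    """Join rows with tabs, rendering missing values as empty fields."""
    return "\n".join(
        "\t".join("" if v is None else str(v) for v in row) for row in rows
    )


if __name__ == "__main__":
    print(to_tsv(drill_down(SAMPLE)))
```

Compared with a recursive search, this version only finds text objects on the exact path it follows, which is stricter but also breaks sooner if the document layout changes.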
https://stackoverflow.com/questions/56370993