我们正在接收来自客户的数据,并且它是高度嵌套的(客户这样做是为了减少重复数据的数量,从而减少数据传输的总体大小)。
数据如下:
{
"r1": "ex",
"r2": "of",
"r3": "da",
"ver": 1,
"noa1": [{
"col1": 380,
"noa2": [{
"aon": [123],
"obj": {
"col10": "stuf",
"col11": "here",
"aos": ["A",
"X"
]
}
}, {
"aon": [],
"obj": {
"col10": "more",
"col11": "stuf"
}
}, {
"aon": [456, 789]
}, {
"obj": {
"col10": "anon"
}
}]
}, {
"col1": 5676,
"noa2": [{
"aon": [875],
"obj": {
"col10": "does",
"col11": "noth",
"aos": ["Y",
"Z"
]
}
}, {
"obj": {
"col11": "phew"
}
}]
}]
}其中有许多可选的嵌套对象和数组(大约6层深IRL,但在这里简化)。
我们必须将其扁平化为CSV,其中每一行都是唯一的,其中包含JSON对象的数据。我们不希望从跨产品连接中创建任何“新行”。因此,上面的内容如下:
r1|r2|r3|ver|col1|aon|col10|col11|aos
ex|of|da|1 |380 |123|stuf |here |A
ex|of|da|1 |380 |123|stuf |here |X
ex|of|da|1 |380 | |more |stuf |
ex|of|da|1 |380 |456| | |
ex|of|da|1 |380 |789| | |
ex|of|da|1 |380 | | anon| |
ex|of|da|1 |5676|875|does |noth |Y
ex|of|da|1 |5676|875|does |noth |Z
ex|of|da|1 |5676| | |phew |
For brevity, we show this as a table. In reality, we would want this to be a collection of JSON documents (jq -c).我们一直在使用jq,但是我们一直在处理不是对象的嵌套数组,比如字符串数组。它导致交叉连接发生,将值放置在原始JSON中不存在的位置。我们在可选嵌套行方面也有问题(如果数组为null,jq当前没有打印行,即使是在?改性剂。
问题如下:
中的嵌套对象、数组和字符串、数字数组等。
更新:以下代码
.r1 as $r1 | .r2 as $r2 | .r3 as $r3 | .ver as $ver | .noa1[] | . as $noa1 | .noa2[] | . as $noa2 | {$r1, $r2, $r3, $ver, "col1": $noa1.col1, "aon": $noa2.aon }返回
{"r1":"ex","r2":"of","r3":"da","ver":1,"col1":380,"aon":[123]}
{"r1":"ex","r2":"of","r3":"da","ver":1,"col1":380,"aon":[]}
{"r1":"ex","r2":"of","r3":"da","ver":1,"col1":380,"aon":[456,789]}
{"r1":"ex","r2":"of","r3":"da","ver":1,"col1":380,"aon":null}
{"r1":"ex","r2":"of","r3":"da","ver":1,"col1":5676,"aon":[875]}
{"r1":"ex","r2":"of","r3":"da","ver":1,"col1":5676,"aon":null}这很有道理。但是现在,当我们试图进一步爆炸"aon“数列时,它就变得奇怪了。这段代码
.r1 as $r1 | .r2 as $r2 | .r3 as $r3 | .ver as $ver | .noa1[] | . as $noa1 | .noa2[] | . as $noa2 |
{$r1, $r2, $r3, $ver, "col1": $noa1.col1, "aon": $noa2.aon[]? }丢失具有"aon“空值的行,我可以理解,这是某种程度上(因为?)。但是这段代码失败了
.r1 as $r1 | .r2 as $r2 | .r3 as $r3 | .ver as $ver | .noa1[] | . as $noa1 | .noa2[] | . as $noa2 |
{$r1, $r2, $r3, $ver, "col1": $noa1.col1, "aon": $noa2.aon[] }我也理解,因为空值。所以我们尝试把它作为一个变量来引用:
.r1 as $r1
| .r2 as $r2
| .r3 as $r3
| .ver as $ver
| .noa1[] | . as $noa1
| .noa2[] | . as $noa2
| .obj as $obj
| .aon[]? | . as $aon
| {$r1, $r2, $r3, $ver, "col1": $noa1.col1, "aon": $aon, "col10": $obj.col10, "col11": $obj.col11 }走到这一步。但是有几个问题:我们丢失了一些aon为null的行,并且我们不知道如何访问"aos“。
发布于 2022-10-17 22:37:03
我不确定您是否希望"aos“字段出现在每个生成的对象中。如果没有,你可以做的比以下更糟:
jq -c '
.r1 as $r1
| .r2 as $r2
| .r3 as $r3
| .ver as $ver
| .noa1[]
| .col1 as $col1
| .noa2[]
| .aon as $aon
| .obj
| {$r1, $r2, $ver, $col1, $aon, col10, col11}
+ try {aos: .aos[]} catch {} '如果您这样做了,那么您可以用{aos: null}或其他任何东西来替换最后一个{aos: null}。如果您不喜欢使用try ... catch,那么您可以用if语句替换它,或者用以下方式替换它:
( (.aos // [null]) | {aos: .[]} )我认为,保持理智的关键是避免创建jq变量,除非在“向下钻”时需要这样做。此外,jq对缩写(如{$x}和{y} )的支持使阅读和检查变得更容易。
https://stackoverflow.com/questions/74103310
复制相似问题