跟进这件事- Bigquery multiple unnest in a single select
我们使用bigquery作为我们的仓库解决方案,并试图通过合并来突破限制。一个简单的例子是客户端跟踪。客户端产生收入,在我们的网站上有几个接触点,并独立地在我们的几个帐户。对于希望对客户进行行为分析的业务用户,他们希望跟踪访问、收入以及他们的帐户impacT保留情况,我们正试图评估嵌套结构是否适合我们。
下面是一个例子。我有三张桌子。
客户(C)
C_Key| C_Name
1%- ABC
2. DEF
账户(A)
A_Key x- C_Key
11 -1
12 -1
21 -2
22 -2
23 -2
收入(R)
R_Key / C_Key -收入
11 \x{1} $10
12 %1美元20美元
21 -2-10美元
我使用array_agg将这三者组合成一个嵌套表,如下所示:
{Client,
Accounts:
[{
}],
Revenue:
[{
}]
}
我希望能够在一个查询中使用多个取消嵌套,如下所示
Select client, Count Distinct(Accounts) and SUM(Revenue) from <single nested
table>, unnest accounts, unnest revenue
预期的输出为2行,
1,2,30美元
2,3,10美元
但是,在同一个查询中有多个取消嵌套会导致交叉连接。
实际输出是
1,2,60美元
2,3,30美元
发布于 2017-08-10 05:28:04
下面是用于BigQuery标准SQL的
首先,让我们澄清single nested table
的创建
我希望你做了这样的事:
#standardSQL
WITH clients AS (
SELECT 1 AS c_key, 'abc' AS c_name UNION ALL
SELECT 2, 'def'
), accounts AS (
SELECT 11 AS a_key, 1 AS c_key UNION ALL
SELECT 12, 1 UNION ALL
SELECT 21, 2 UNION ALL
SELECT 22, 2 UNION ALL
SELECT 23, 2
), revenue AS (
SELECT 11 AS r_key, 1 AS c_key, 10 AS revenue UNION ALL
SELECT 12, 1, 20 UNION ALL
SELECT 21, 2, 10
), single_nested_table AS (
SELECT x.c_key, x.c_name, accounts, revenue
FROM (
SELECT c.c_key, c_name, ARRAY_AGG(a) AS accounts --, array_agg(r) as revenue
FROM clients AS c
LEFT JOIN accounts AS a ON a.c_key = c.c_key
GROUP BY c.c_key, c_name
) x
JOIN (
SELECT c.c_key, c_name, ARRAY_AGG(r) AS revenue
FROM clients AS c
LEFT JOIN revenue AS r ON r.c_key = c.c_key
GROUP BY c.c_key, c_name
) y
ON x.c_key = y.c_key
)
SELECT *
FROM single_nested_table
它将表创建为
Row c_key c_name accounts.a_key accounts.c_key revenue.r_key revenue.c_key revenue.revenue
1 1 abc 11 1 11 1 10
12 1 12 1 20
2 2 def 21 2 21 2 10
22 2
23 2
您使用什么查询来创建该表并不那么重要--但是要清除结构/模式非常重要!
所以现在,回到你的问题
#standardSQL
WITH clients AS (
SELECT 1 AS c_key, 'abc' AS c_name UNION ALL
SELECT 2, 'def'
), accounts AS (
SELECT 11 AS a_key, 1 AS c_key UNION ALL
SELECT 12, 1 UNION ALL
SELECT 21, 2 UNION ALL
SELECT 22, 2 UNION ALL
SELECT 23, 2
), revenue AS (
SELECT 11 AS r_key, 1 AS c_key, 10 AS revenue UNION ALL
SELECT 12, 1, 20 UNION ALL
SELECT 21, 2, 10
), single_nested_table AS (
SELECT x.c_key, x.c_name, accounts, revenue
FROM (
SELECT c.c_key, c_name, ARRAY_AGG(a) AS accounts --, array_agg(r) as revenue
FROM clients AS c
LEFT JOIN accounts AS a ON a.c_key = c.c_key
GROUP BY c.c_key, c_name
) x
JOIN (
SELECT c.c_key, c_name, ARRAY_AGG(r) AS revenue
FROM clients AS c
LEFT JOIN revenue AS r ON r.c_key = c.c_key
GROUP BY c.c_key, c_name
) y
ON x.c_key = y.c_key
)
SELECT
c_key, c_name,
ARRAY_LENGTH(accounts) AS distinct_accounts,
(SELECT SUM(revenue) FROM UNNEST(revenue)) AS revenue
FROM single_nested_table
这给了你想要的东西:
Row c_key c_name distinct_accounts revenue
1 1 abc 2 30
2 2 def 3 10
https://stackoverflow.com/questions/45574834
复制相似问题