首页
学习
活动
专区
圈层
工具
发布
首页
学习
活动
专区
圈层
工具
MCP广场
社区首页 >问答首页 >BigQuery中的递归/分层查询

BigQuery中的递归/分层查询
EN

Stack Overflow用户
提问于 2020-01-24 04:41:42
回答 3查看 9K关注 0票数 6

我有一个递归/分层问题,我试图在BigQuery中解决这个问题。

我有一个员工列表,每个员工都有一个经理ID,我需要能够输入一个Employee_ID并返回他们下面每个人的数组。

代码语言:javascript
运行
复制
CREATE TABLE p_RLS.testHeirarchy
 (
   Employee_ID INT64,
   Employee_Name STRING,
   Position STRING,
   Line_Manager_ID INT64
 );

INSERT INTO p_RLS.testHeirarchy (Employee_ID, Employee_Name, Position, Line_Manager_ID)
VALUES(1,'Joe','Worker',11),
      (2,'James','Worker',11),
      (3,'Jack','Worker',11),
      (4,'Jill','Worker',12),
      (5,'Jan','Worker',12),
      (6,'Jacquie','Worker',13),
      (7,'Joaquin','Worker',14),
      (8,'Jeremy','Worker',14),
      (9,'Jade','Worker',15),
      (10,'Jocelyn','Worker',15),
      (11, 'Bob', 'Store Manager',16),
      (12, 'Bill', 'Store Manager',16),
      (13, 'Barb', 'Store Manager',16),
      (14, 'Ben', 'Store Manager',17),
      (15, 'Burt', 'Store Manager',17),
      (16, 'Sally','Group Manager',18),
      (17, 'Sam','Group Manager',19),
      (18, 'Anna', 'Ops Manager',20),
      (19, 'Amy', 'Ops Manager',20),
      (20, 'Zoe', 'State Manager', NULL);

我想要的输出类似于:

代码语言:javascript
运行
复制
SELECT 20 as Employee_ID, [19,18,17,16,15,14,13,12,11,10,9,8,7,6,5,4,3,2,1] as Reports;
SELECT 11 as Employee_ID, [3,2,1] as Reports;
SELECT 1 as Employee_ID, [] as Reports;

我已经完成了以下工作,但它看起来很难看/很不方便,并且不支持无限级别:

代码语言:javascript
运行
复制
WITH test as (
SELECT L0.Employee_ID, L0.Employee_Name, L0.Position, L0.Line_Manager_ID,
ARRAY_AGG(DISTINCT L1.Employee_ID IGNORE NULLS) as Lvl1, 
ARRAY_AGG(DISTINCT L2.Employee_ID IGNORE NULLS) as Lvl2, 
ARRAY_AGG(DISTINCT L3.Employee_ID IGNORE NULLS) as Lvl3, 
ARRAY_AGG(DISTINCT L4.Employee_ID IGNORE NULLS) as Lvl4, 
ARRAY_AGG(DISTINCT L5.Employee_ID IGNORE NULLS) as Lvl5, 
ARRAY_AGG(DISTINCT L6.Employee_ID IGNORE NULLS) as Lvl6, 
ARRAY_AGG(DISTINCT L7.Employee_ID IGNORE NULLS) as Lvl7
FROM p_RLS.testHeirarchy as L0
LEFT OUTER JOIN p_RLS.testHeirarchy L1 ON L0.Employee_ID = L1.Line_Manager_ID
LEFT OUTER JOIN p_RLS.testHeirarchy L2 ON L1.Employee_ID = L2.Line_Manager_ID
LEFT OUTER JOIN p_RLS.testHeirarchy L3 ON L2.Employee_ID = L3.Line_Manager_ID
LEFT OUTER JOIN p_RLS.testHeirarchy L4 ON L3.Employee_ID = L4.Line_Manager_ID
LEFT OUTER JOIN p_RLS.testHeirarchy L5 ON L4.Employee_ID = L5.Line_Manager_ID
LEFT OUTER JOIN p_RLS.testHeirarchy L6 ON L5.Employee_ID = L6.Line_Manager_ID
LEFT OUTER JOIN p_RLS.testHeirarchy L7 ON L6.Employee_ID = L7.Line_Manager_ID
WHERE L0.Employee_ID = 16
GROUP BY 1,2,3,4)

SELECT
Employee_ID, ARRAY_CONCAT(
    IFNULL(Lvl1,[]),
    IFNULL(Lvl2,[]),
    IFNULL(Lvl3,[]),
    IFNULL(Lvl4,[]),
    IFNULL(Lvl5,[]),
    IFNULL(Lvl6,[]),
    IFNULL(Lvl7,[])) as All_reports
FROM test

有更好的方法吗?在BigQuery中可以使用递归方法吗?

EN

回答 3

Stack Overflow用户

回答已采纳

发布于 2020-01-24 07:04:37

下面是用于BigQuery标准SQL的

代码语言:javascript
运行
复制
DECLARE rows_count, run_away_stop INT64 DEFAULT 0;

CREATE TEMP TABLE initialData AS WITH input AS (
  SELECT 1 Employee_ID,'Joe' Employee_Name,'Worker' Position,11 Line_Manager_ID UNION ALL
  SELECT 2,'James','Worker',11 UNION ALL
  SELECT 3,'Jack','Worker',11 UNION ALL
  SELECT 4,'Jill','Worker',12 UNION ALL
  SELECT 5,'Jan','Worker',12 UNION ALL
  SELECT 6,'Jacquie','Worker',13 UNION ALL
  SELECT 7,'Joaquin','Worker',14 UNION ALL
  SELECT 8,'Jeremy','Worker',14 UNION ALL
  SELECT 9,'Jade','Worker',15 UNION ALL
  SELECT 10,'Jocelyn','Worker',15 UNION ALL
  SELECT 11, 'Bob', 'Store Manager',16 UNION ALL
  SELECT 12, 'Bill', 'Store Manager',16 UNION ALL
  SELECT 13, 'Barb', 'Store Manager',16 UNION ALL
  SELECT 14, 'Ben', 'Store Manager',17 UNION ALL
  SELECT 15, 'Burt', 'Store Manager',17 UNION ALL
  SELECT 16, 'Sally','Group Manager',18 UNION ALL
  SELECT 17, 'Sam','Group Manager',19 UNION ALL
  SELECT 18, 'Anna', 'Ops Manager',20 UNION ALL
  SELECT 19, 'Amy', 'Ops Manager',20 UNION ALL
  SELECT 20, 'Zoe', 'State Manager', NULL 
)
SELECT * FROM input;

CREATE TEMP TABLE ttt AS 
SELECT Line_Manager_ID, ARRAY_AGG(Employee_ID) Reports FROM initialData WHERE NOT Line_Manager_ID IS NULL GROUP BY Line_Manager_ID;

LOOP
  SET (run_away_stop, rows_count) = (SELECT AS STRUCT run_away_stop + 1, COUNT(1) FROM ttt);

  CREATE OR REPLACE TEMP TABLE ttt1 AS
    SELECT Line_Manager_ID, ARRAY(SELECT DISTINCT Employee_ID FROM UNNEST(Reports) Employee_ID ORDER BY Employee_ID DESC) Reports
    FROM (
      SELECT Line_Manager_ID, ARRAY_CONCAT_AGG(Reports) Reports    
      FROM (
        SELECT t2.Line_Manager_ID, ARRAY_CONCAT(t1.Reports, t2.Reports) Reports 
        FROM ttt t1, ttt t2 
        WHERE (SELECT COUNTIF(t1.Line_Manager_ID = Employee_ID) FROM UNNEST(t2.Reports) Employee_ID) > 0
      ) GROUP BY Line_Manager_ID
    );

  CREATE OR REPLACE TEMP TABLE ttt AS
    SELECT * FROM ttt1 UNION ALL
    SELECT * FROM ttt WHERE NOT Line_Manager_ID IN (SELECT Line_Manager_ID FROM ttt1);

  IF (rows_count = (SELECT COUNT(1) FROM ttt) AND run_away_stop > 1) OR run_away_stop > 10 THEN BREAK; END IF;
END LOOP;

SELECT Employee_ID, 
  (
    SELECT STRING_AGG(CAST(Employee_ID AS STRING), ',' ORDER BY Employee_ID DESC)
    FROM ttt.Reports Employee_ID
  ) Reports_as_list
FROM (SELECT DISTINCT Employee_ID FROM initialData) d
LEFT JOIN ttt ON Employee_ID = Line_Manager_ID
ORDER BY Employee_ID DESC;  

有结果

代码语言:javascript
运行
复制
Row Employee_ID Reports_as_list  
1   20          19,18,17,16,15,14,13,12,11,10,9,8,7,6,5,4,3,2,1  
2   19          17,15,14,10,9,8,7    
3   18          16,13,12,11,6,5,4,3,2,1  
4   17          15,14,10,9,8,7   
5   16          13,12,11,6,5,4,3,2,1     
6   15          10,9     
7   14          8,7  
8   13          6    
9   12          5,4  
10  11          3,2,1    
11  10          null     
12  9           null     
13  8           null     
14  7           null     
15  6           null     
16  5           null     
17  4           null     
18  3           null     
19  2           null     
20  1           null    

如果您需要报告作为数组-用下面的脚本替换上面脚本中的最后一条语句

代码语言:javascript
运行
复制
SELECT Employee_ID, Reports Reports_as_array
FROM (SELECT DISTINCT Employee_ID FROM initialData) d
LEFT JOIN ttt ON Employee_ID = Line_Manager_ID
ORDER BY Employee_ID DESC;  

注意:取决于层次结构中嵌套的级别--您可能需要调整10中的OR run_away_stop > 10

票数 7
EN

Stack Overflow用户

发布于 2022-02-06 21:34:43

递归CTE最近是介绍

这使得事情变得容易多了

代码语言:javascript
运行
复制
with recursive iterations as (
  select line_manager_id, employee_id, 1 pos from your_table
  union all 
  select b.line_manager_id, a.employee_id, pos + 1 
  from your_table a join iterations b
  on b.employee_id = a.line_manager_id 
)
select line_manager_id, string_agg('' || employee_id order by pos, employee_id desc) as reports_as_list
from iterations
where not line_manager_id is null
group by line_manager_id
order by line_manager_id desc 

如果应用于问题中的样本数据,则输出为

票数 9
EN

Stack Overflow用户

发布于 2020-01-24 05:04:04

对于这个问题:“在BigQuery中可以使用递归方法吗?”

是!

既然BigQuery支持脚本和循环,我就用BigQuery解决了代码高级版中的一些递归问题:

代码语言:javascript
运行
复制
CREATE TEMP TABLE planets AS SELECT 'YOU' planet;

LOOP
  SET steps = steps+1
  ;
  CREATE OR REPLACE TEMP TABLE planets AS
  SELECT DISTINCT planet
  FROM (
    SELECT origin planet FROM t1 WHERE dest IN (SELECT planet FROM planets)
    UNION ALL
    SELECT dest planet FROM t1 WHERE origin IN (SELECT planet FROM planets)
  )
  ;
  IF 'SAN' IN (SELECT * FROM planets )
      THEN LEAVE;
  END IF;
END LOOP
;
SELECT steps-2

我会使用类似的方法来导航图并注释所有父关系。

很快:我将写一篇关于树遍历的细节的博客文章,让每个人都在x下面,但这段代码将同时帮助您。

票数 2
EN
页面原文内容由Stack Overflow提供。腾讯云小微IT领域专用引擎提供翻译支持
原文链接:

https://stackoverflow.com/questions/59890459

复制
相关文章

相似问题

领券
问题归档专栏文章快讯文章归档关键词归档开发者手册归档开发者手册 Section 归档