首页
学习
活动
专区
圈层
工具
发布
社区首页 >专栏 >【建模案例】通过日志查询每日时点状态

【建模案例】通过日志查询每日时点状态

作者头像
数据仓库晨曦
发布2026-06-23 17:22:04
发布2026-06-23 17:22:04
810
举报
文章被收录于专栏:数据仓库技术数据仓库技术
  • 这条 SQL 难度不低,能轻松写出来,足以证明你的技术功底扎实;但若是日常业务里非要写这类复杂 SQL,往往意味着数仓建模环节出了问题。

  • 本文同步发表在数据仓库技术网站dwsql.com 的数仓建模->数据仓库建模案例 下
  • 如果该需求作为面试题目,要求使用非笛卡尔积的方式写SQL,难度应该数据困难,要高过一般的连续问题,所以不要轻易拿去考别人

背景

这个内容是群友问我的一个问题,他已经解决了,但是通过表关联,使用笛卡尔积的方式进行处理,为了进一步优化,他问我有没有不用笛卡尔积的方式计算。

1.需求内容

用户盈利记录表 t_case1_ods_user_profit,包含三个字段,分别为交易日期,用户ID,用户当日盈利金额(如果亏损则为负数);样例数据如下:

代码语言:javascript
复制
+-------------+----------+---------+
| trade_date  | user_id  | profit  |
+-------------+----------+---------+
| 2026-03-12  | 1001     | 3.00    |
| 2026-03-13  | 1002     | 5.00    |
| 2026-03-13  | 1001     | -5.00   |
| 2026-03-14  | 1003     | 7.00    |
| 2026-03-14  | 1004     | 7.00    |
| 2026-03-14  | 1002     | 7.00    |
| 2026-03-15  | 1002     | -7.00   |
| 2026-03-15  | 1001     | -7.00   |
+-------------+----------+---------+

建表语句

代码语言:javascript
复制
create table t_case1_ods_user_profit(
trade_date string COMMENT '交易日期 yyyy-mm-dd',
user_id    bigint comment '用户ID',
profit     decimal(10,2) COMMENT '盈利金额'
) COMMENT='用户每日盈亏表'; 

INSERT INTO TABLE t_case1_ods_user_profit VALUES
('2026-03-12', 1001, 3),
('2026-03-13', 1002, 5),
('2026-03-13', 1001, -5),
('2026-03-14', 1003, 7),
('2026-03-14', 1004, 7),
('2026-03-14', 1002, 7),
('2026-03-15', 1002, -7),
('2026-03-15', 1001, -7); 

需求内容: 希望能够查询下面数据

用户盈利记录表中,记录了每日有交易过程的用户当期盈利金额,如果亏损则记为负数。现希望查询出截止到每日的累积盈利的用户数;

分析: 因为盈利记录表中仅存在当日有交易记录,这样我们进行累积求和的结果是不能满足要求的,这是该题目困难的原因。

SQL 解法

我们先使用SQL完成需求内容。

方法一:

既然题目中没有,那么我们将其补全,即为每个用户补全所有日期的交易记录,如果用户不存在交易记录,则补当日盈亏金额为0;

  • 1.先做笛卡尔积,算出一个全量用户全日期,为0的交易记录
代码语言:javascript
复制
select t_date.trade_date,
       t_user.user_id,
       0 as profit
from (select trade_date
      from t_case1_ods_user_profit
      group by trade_date) t_date
         join (select user_id
               from t_case1_ods_user_profit
               group by user_id) t_user;

执行结果

代码语言:javascript
复制
+-------------+----------+---------+
| trade_date  | user_id  | profit  |
+-------------+----------+---------+
| 2026-03-12  | 1002     | 0       |
| 2026-03-12  | 1001     | 0       |
| 2026-03-12  | 1003     | 0       |
| 2026-03-12  | 1004     | 0       |
| 2026-03-14  | 1002     | 0       |
| 2026-03-14  | 1001     | 0       |
| 2026-03-14  | 1003     | 0       |
| 2026-03-14  | 1004     | 0       |
| 2026-03-13  | 1002     | 0       |
| 2026-03-13  | 1001     | 0       |
| 2026-03-13  | 1003     | 0       |
| 2026-03-13  | 1004     | 0       |
| 2026-03-15  | 1002     | 0       |
| 2026-03-15  | 1001     | 0       |
| 2026-03-15  | 1003     | 0       |
| 2026-03-15  | 1004     | 0       |
+-------------+----------+---------+
16 rows selected (10.268 seconds)(dwsql.com)
  • 2.全量交易记录与原始交易记录进行合并,如果用户有交易记录就取实际交易记录,否则使用补充的当日盈亏为0的交易记录;
代码语言:javascript
复制
select trade_date,
       user_id,
       sum(profit) as profit
from (select t_date.trade_date,
             t_user.user_id,
             0 as profit
      from (select trade_date
            from t_case1_ods_user_profit
            group by trade_date) t_date
               join (select user_id
                     from t_case1_ods_user_profit
                     group by user_id) t_user
      union all
      select trade_date, user_id, profit
      from t_case1_ods_user_profit) t
group by trade_date,
         user_id
order by trade_date,
         user_id --为方便查看数据添加

执行结果

代码语言:javascript
复制
+-------------+----------+---------+
| trade_date  | user_id  | profit  |
+-------------+----------+---------+
| 2026-03-12  | 1001     | 3.00    |
| 2026-03-12  | 1002     | 0.00    |
| 2026-03-12  | 1003     | 0.00    |
| 2026-03-12  | 1004     | 0.00    |
| 2026-03-13  | 1001     | -5.00   |
| 2026-03-13  | 1002     | 5.00    |
| 2026-03-13  | 1003     | 0.00    |
| 2026-03-13  | 1004     | 0.00    |
| 2026-03-14  | 1001     | 0.00    |
| 2026-03-14  | 1002     | 7.00    |
| 2026-03-14  | 1003     | 7.00    |
| 2026-03-14  | 1004     | 7.00    |
| 2026-03-15  | 1001     | -7.00   |
| 2026-03-15  | 1002     | -7.00   |
| 2026-03-15  | 1003     | 0.00    |
| 2026-03-15  | 1004     | 0.00    |
+-------------+----------+---------+
16 rows selected (1.249 seconds)(dwsql.com)
  • 3.累积求和得到每个用户截止到每天的的累积盈利金额;
代码语言:javascript
复制
select trade_date,
       user_id,
       profit,
       sum(profit) over (partition by user_id order by trade_date asc) as total_profit --关键开窗函数
from (select trade_date,
             user_id,
             sum(profit) as profit
      from (select t_date.trade_date,
                   t_user.user_id,
                   0 as profit
            from (select trade_date
                  from t_case1_ods_user_profit
                  group by trade_date) t_date
                     join (select user_id
                           from t_case1_ods_user_profit
                           group by user_id) t_user
            union all
            select trade_date, user_id, profit
            from t_case1_ods_user_profit) t
      group by trade_date,
               user_id) tt

执行结果

代码语言:javascript
复制
+-------------+----------+---------+---------------+
| trade_date  | user_id  | profit  | total_profit  |
+-------------+----------+---------+---------------+
| 2026-03-12  | 1001     | 3.00    | 3.00          |
| 2026-03-13  | 1001     | -5.00   | -2.00         |
| 2026-03-14  | 1001     | 0.00    | -2.00         |
| 2026-03-15  | 1001     | -7.00   | -9.00         |
| 2026-03-12  | 1002     | 0.00    | 0.00          |
| 2026-03-13  | 1002     | 5.00    | 5.00          |
| 2026-03-14  | 1002     | 7.00    | 12.00         |
| 2026-03-15  | 1002     | -7.00   | 5.00          |
| 2026-03-12  | 1003     | 0.00    | 0.00          |
| 2026-03-13  | 1003     | 0.00    | 0.00          |
| 2026-03-14  | 1003     | 7.00    | 7.00          |
| 2026-03-15  | 1003     | 0.00    | 7.00          |
| 2026-03-12  | 1004     | 0.00    | 0.00          |
| 2026-03-13  | 1004     | 0.00    | 0.00          |
| 2026-03-14  | 1004     | 7.00    | 7.00          |
| 2026-03-15  | 1004     | 0.00    | 7.00          |
+-------------+----------+---------+---------------+
16 rows selected (0.952 seconds)(dwsql.com)
  • 4.统计每天累积盈利金额>0的用户数;
代码语言:javascript
复制
select trade_date, count(case when total_profit > 0 then user_id end) as user_cnt
from (select trade_date,
             user_id,
             profit,
             sum(profit) over (partition by user_id order by trade_date asc) as total_profit
      from (select trade_date,
                   user_id,
                   sum(profit) as profit
            from (select t_date.trade_date,
                         t_user.user_id,
                         0 as profit
                  from (select trade_date
                        from t_case1_ods_user_profit
                        group by trade_date) t_date
                           join (select user_id
                                 from t_case1_ods_user_profit
                                 group by user_id) t_user
                  union all
                  select trade_date, user_id, profit
                  from t_case1_ods_user_profit) t
            group by trade_date,
                     user_id) tt) ttt
group by trade_date
order by trade_date asc;

执行结果

代码语言:javascript
复制
+-------------+-----------+
| trade_date  | user_cnt  |
+-------------+-----------+
| 2026-03-12  | 1         |
| 2026-03-13  | 1         |
| 2026-03-14  | 3         |
| 2026-03-15  | 3         |
+-------------+-----------+
4 rows selected (1.252 seconds)(dwsql.com)

方法二:

前面先做笛卡尔积,太过“暴力” ,会产生大量的数据,产生大量计算;我们稍微优化一下;

  • 1.先对所有用户进行累积求和;
代码语言:javascript
复制
select trade_date,
       user_id,
       profit,
       sum(profit) over (partition by user_id order by trade_date asc) as total_profit --累积求和
from t_case1_ods_user_profit

执行结果

代码语言:javascript
复制
+-------------+----------+---------+---------------+
| trade_date  | user_id  | profit  | total_profit  |
+-------------+----------+---------+---------------+
| 2026-03-12  | 1001     | 3.00    | 3.00          |
| 2026-03-13  | 1001     | -5.00   | -2.00         |
| 2026-03-15  | 1001     | -7.00   | -9.00         |
| 2026-03-13  | 1002     | 5.00    | 5.00          |
| 2026-03-14  | 1002     | 7.00    | 12.00         |
| 2026-03-15  | 1002     | -7.00   | 5.00          |
| 2026-03-14  | 1003     | 7.00    | 7.00          |
| 2026-03-14  | 1004     | 7.00    | 7.00          |
+-------------+----------+---------+---------------+
8 rows selected (0.352 seconds)(dwsql.com)

**说明:**我们没有办法对该数据进行直接统计,因为如果用户在某天不存在交易,则当日不会有其记录,e.g. 1001 用户在3月14日没有记录,所以接下来我们要使用笛卡尔积来完成缺失数据的补足;

  • 2.通过查询记录表,查到所有的日期,日期与累积求和结果进行笛卡尔积计算,限定条件为累积求和日期 <= 维表日期,并根据日期和用户ID进行分组,按照日期倒排进行排序开窗,得到行号
代码语言:javascript
复制
select t_date.trade_date,
       t_profit.trade_date,
       t_profit.user_id,
       t_profit.profit,
       t_profit.total_profit,
       row_number() over (partition by t_date.trade_date, t_profit.user_id order by t_profit.trade_date desc) as rn --注意倒排序
from (select trade_date
      from t_case1_ods_user_profit
      group by trade_date) t_date
         join (select trade_date,
                      user_id,
                      profit,
                      sum(profit) over (partition by user_id order by trade_date asc) as total_profit
               from t_case1_ods_user_profit) t_profit
--注意没有on条件,所以是笛卡尔积
where t_profit.trade_date <= t_date.trade_date

执行结果

代码语言:javascript
复制
+-------------+-------------+----------+---------+---------------+-----+
| trade_date  | trade_date  | user_id  | profit  | total_profit  | rn  |
+-------------+-------------+----------+---------+---------------+-----+
| 2026-03-12  | 2026-03-12  | 1001     | 3.00    | 3.00          | 1   |
| 2026-03-13  | 2026-03-13  | 1001     | -5.00   | -2.00         | 1   |
| 2026-03-13  | 2026-03-12  | 1001     | 3.00    | 3.00          | 2   |
| 2026-03-13  | 2026-03-13  | 1002     | 5.00    | 5.00          | 1   |
| 2026-03-14  | 2026-03-13  | 1001     | -5.00   | -2.00         | 1   |
| 2026-03-14  | 2026-03-12  | 1001     | 3.00    | 3.00          | 2   |
| 2026-03-14  | 2026-03-14  | 1002     | 7.00    | 12.00         | 1   |
| 2026-03-14  | 2026-03-13  | 1002     | 5.00    | 5.00          | 2   |
| 2026-03-14  | 2026-03-14  | 1003     | 7.00    | 7.00          | 1   |
| 2026-03-14  | 2026-03-14  | 1004     | 7.00    | 7.00          | 1   |
| 2026-03-15  | 2026-03-15  | 1001     | -7.00   | -9.00         | 1   |
| 2026-03-15  | 2026-03-13  | 1001     | -5.00   | -2.00         | 2   |
| 2026-03-15  | 2026-03-12  | 1001     | 3.00    | 3.00          | 3   |
| 2026-03-15  | 2026-03-15  | 1002     | -7.00   | 5.00          | 1   |
| 2026-03-15  | 2026-03-14  | 1002     | 7.00    | 12.00         | 2   |
| 2026-03-15  | 2026-03-13  | 1002     | 5.00    | 5.00          | 3   |
| 2026-03-15  | 2026-03-14  | 1003     | 7.00    | 7.00          | 1   |
| 2026-03-15  | 2026-03-14  | 1004     | 7.00    | 7.00          | 1   |
+-------------+-------------+----------+---------+---------------+-----+
18 rows selected (0.809 seconds)
  • 3.取行号= 1 的记录,统计累积盈利金额>0 的用户数据
代码语言:javascript
复制
select new_trade_date,
       count(case when total_profit > 0 then user_id end) as user_cnt
from (select t_date.trade_date                                                                                      as new_trade_date,
             t_profit.trade_date,
             t_profit.user_id,
             t_profit.profit,
             t_profit.total_profit,
             row_number() over (partition by t_date.trade_date, t_profit.user_id order by t_profit.trade_date desc) as rn --注意倒排序
      from (select trade_date
            from t_case1_ods_user_profit
            group by trade_date) t_date
               join (select trade_date,
                            user_id,
                            profit,
                            sum(profit) over (partition by user_id order by trade_date asc) as total_profit
                     from t_case1_ods_user_profit) t_profit
--注意没有on条件,所以是笛卡尔积
      where t_profit.trade_date <= t_date.trade_date) t
where t.rn = 1
group by new_trade_date
order by new_trade_date;

执行结果

代码语言:javascript
复制
+-----------------+-----------+
| new_trade_date  | user_cnt  |
+-----------------+-----------+
| 2026-03-12      | 1         |
| 2026-03-13      | 1         |
| 2026-03-14      | 3         |
| 2026-03-15      | 3         |
+-----------------+-----------+
4 rows selected (11.858 seconds)

方法三:

方法二还是笛卡尔积,数据还是很大,我们换一种思路来计算,避免笛卡尔积。因为只是想要盈利用户数量,并不需要找出具体是哪个用户,所以可以使用累加的方式。

  • 1.先对所有用户进行累积求和;
代码语言:javascript
复制
select trade_date,
       user_id,
       profit,
       sum(profit) over (partition by user_id order by trade_date asc) as total_profit --累积求和
from t_case1_ods_user_profit

执行结果

代码语言:javascript
复制
+-------------+----------+---------+---------------+
| trade_date  | user_id  | profit  | total_profit  |
+-------------+----------+---------+---------------+
| 2026-03-12  | 1001     | 3.00    | 3.00          |
| 2026-03-13  | 1001     | -5.00   | -2.00         |
| 2026-03-15  | 1001     | -7.00   | -9.00         |
| 2026-03-13  | 1002     | 5.00    | 5.00          |
| 2026-03-14  | 1002     | 7.00    | 12.00         |
| 2026-03-15  | 1002     | -7.00   | 5.00          |
| 2026-03-14  | 1003     | 7.00    | 7.00          |
| 2026-03-14  | 1004     | 7.00    | 7.00          |
+-------------+----------+---------+---------------+
8 rows selected (0.352 seconds)(dwsql.com)
  • 2.使用lag函数获得每个用户前一行数据的累计盈利金额
代码语言:javascript
复制
select trade_date,
       user_id,
       profit,
       total_profit,
       lag(total_profit) over (partition by user_id order by trade_date) as last_total_profit
from (select trade_date,
             user_id,
             profit,
             sum(profit) over (partition by user_id order by trade_date asc) as total_profit --累积求和
      from t_case1_ods_user_profit) t

执行结果

代码语言:javascript
复制
+-------------+----------+---------+---------------+--------------------+
| trade_date  | user_id  | profit  | total_profit  | last_total_profit  |
+-------------+----------+---------+---------------+--------------------+
| 2026-03-12  | 1001     | 3.00    | 3.00          | NULL               |
| 2026-03-13  | 1001     | -5.00   | -2.00         | 3.00               |
| 2026-03-15  | 1001     | -7.00   | -9.00         | -2.00              |
| 2026-03-13  | 1002     | 5.00    | 5.00          | NULL               |
| 2026-03-14  | 1002     | 7.00    | 12.00         | 5.00               |
| 2026-03-15  | 1002     | -7.00   | 5.00          | 12.00              |
| 2026-03-14  | 1003     | 7.00    | 7.00          | NULL               |
| 2026-03-14  | 1004     | 7.00    | 7.00          | NULL               |
+-------------+----------+---------+---------------+--------------------+
8 rows selected (8.925 seconds)
  • 3.对比total_profit 和 last_total_profit,根据两者关系做如下判断,添加人数变化列 cnt_change
    • last_total_profit 为空,total_profit >0,则 cnt_change =1
    • last_total_profit 为空,total_profit <0,则 cnt_change = 0
    • last_total_profit >0,total_profit >0,则 cnt_change =0
    • last_total_profit <=0,total_profit <=0, 则 cnt_change =0
    • last_total_profit >0,total_profit <=0 则 cnt_change = - 1
    • last_total_profit <=0,total_profit >0,则 cnt_change = 1
代码语言:javascript
复制
select trade_date,
       user_id,
       profit,
       total_profit,
       last_total_profit,
       case
           when last_total_profit is null and total_profit > 0 then 1
           when last_total_profit is null and total_profit < 0 then 0
           when last_total_profit > 0 and total_profit > 0 then 0
           when last_total_profit <= 0 and total_profit <= 0 then 0
           when last_total_profit > 0 and total_profit <= 0 then -1
           when last_total_profit <= 0 and total_profit > 0 then 1
           end as cnt_change
from (select trade_date,
             user_id,
             profit,
             total_profit,
             lag(total_profit) over (partition by user_id order by trade_date) as last_total_profit
      from (select trade_date,
                   user_id,
                   profit,
                   sum(profit) over (partition by user_id order by trade_date asc) as total_profit --累积求和
            from t_case1_ods_user_profit) t) tt

执行结果

代码语言:javascript
复制
+-------------+----------+---------+---------------+--------------------+-------------+
| trade_date  | user_id  | profit  | total_profit  | last_total_profit  | cnt_change  |
+-------------+----------+---------+---------------+--------------------+-------------+
| 2026-03-12  | 1001     | 3.00    | 3.00          | NULL               | 1           |
| 2026-03-13  | 1001     | -5.00   | -2.00         | 3.00               | -1          |
| 2026-03-15  | 1001     | -7.00   | -9.00         | -2.00              | 0           |
| 2026-03-13  | 1002     | 5.00    | 5.00          | NULL               | 1           |
| 2026-03-14  | 1002     | 7.00    | 12.00         | 5.00               | 0           |
| 2026-03-15  | 1002     | -7.00   | 5.00          | 12.00              | 0           |
| 2026-03-14  | 1003     | 7.00    | 7.00          | NULL               | 1           |
| 2026-03-14  | 1004     | 7.00    | 7.00          | NULL               | 1           |
+-------------+----------+---------+---------------+--------------------+-------------+
8 rows selected (0.52 seconds)(dwsql.com)
  • 4.每日人数变化求和后再对结果累计求和得到结果
代码语言:javascript
复制
select trade_date,
       sum(day_cnt_change) over (order by trade_date) as user_cnt
from (select trade_date,
             sum(cnt_change) as day_cnt_change
      from (select trade_date,
                   user_id,
                   profit,
                   total_profit,
                   last_total_profit,
                   case
                       when last_total_profit is null and total_profit > 0 then 1
                       when last_total_profit is null and total_profit < 0 then 0
                       when last_total_profit > 0 and total_profit > 0 then 0
                       when last_total_profit <= 0 and total_profit <= 0 then 0
                       when last_total_profit > 0 and total_profit <= 0 then -1
                       when last_total_profit <= 0 and total_profit > 0 then 1
                       end as cnt_change
            from (select trade_date,
                         user_id,
                         profit,
                         total_profit,
                         lag(total_profit) over (partition by user_id order by trade_date) as last_total_profit
                  from (select trade_date,
                               user_id,
                               profit,
                               sum(profit) over (partition by user_id order by trade_date asc) as total_profit --累积求和
                        from t_case1_ods_user_profit) t) tt) ttt
      group by trade_date) tttt

执行结果

代码语言:javascript
复制
+-------------+-----------+
| trade_date  | user_cnt  |
+-------------+-----------+
| 2026-03-12  | 1         |
| 2026-03-13  | 1         |
| 2026-03-14  | 3         |
| 2026-03-15  | 3         |
+-------------+-----------+
4 rows selected (1.883 seconds)

分析

SQL我们写完了,即便按照方法三,没有笛卡尔积的方式,每次查询我们都需要不断的查询所有历史数据,进行一遍遍的开窗和聚合,这在生产过程是不可接受的。所以如果日常需要查看该数据、或者经常性的分析,则需要通过建模的方式减少类似的查询计算。

解决思路

如果我们一张表记录了截止到前天所有用户的用户累计盈利金额,叠加昨天用户盈利金额,就可以算出截止到昨天的用户累计盈利金额了。 这种方式叫做滚表的方式,这样加工时每天仅需要处理当天的日期和全量用户前一天的状态,查询是直接根据日期进行统计即可拿到统计结果。 这也是通过存储换计算的方式,也可以理解为空间换时间(存储空间换查询时的查询时间)

创建表

为了方便加工,我们创建两张表 用户每日盈亏表(分区表)(注:该表在实际生产环境,用户每日盈亏表应该就已经是分区表了,无需我们额外处理),用户每日累积盈利金额状态表(分区表)。

代码语言:javascript
复制
CREATE TABLE t_case1_dwd_user_profit (
    user_id     BIGINT        COMMENT '用户ID',
    profit      DECIMAL(10,2) COMMENT '盈利金额'
) 
COMMENT '用户每日盈亏表'
PARTITIONED BY (trade_date STRING COMMENT '交易日期 yyyy-mm-dd');

CREATE TABLE IF NOT EXISTS t_case1_dwd_user_profit_status (
    user_id       BIGINT        COMMENT '用户ID',
    profit        DECIMAL(10,2) COMMENT '当日盈利金额',
    total_profit  DECIMAL(18,2) COMMENT '累计盈利金额'
) 
COMMENT '用户盈利状态明细表'
PARTITIONED BY (stat_date STRING COMMENT '统计日期 yyyy-mm-dd');

生产数据恢复

默认原始日志数据是分区表内数据,我们将数据恢复到分区表中,方便进行下一步处理。(当然对于写SQL来讲,是否做分区无所谓,但是在工作中,一定会有)

代码语言:javascript
复制
insert overwrite table t_case1_dwd_user_profit partition(trade_date)
select 
 user_id,
 profit,
 trade_date
from t_case1_ods_user_profit;
建模

滚表的初始数据、初始加工过程和后面的加工过程不相同,所以需要写两个逻辑,初始逻辑和滚动逻辑。

  • 1.初始数据 因为最早日期是3月12日,把3月12日作为初始数据。
代码语言:javascript
复制
select 
insert overwrite table t_case1_dwd_user_profit_status partition (stat_date)
select user_id,
       profit,
       profit as total_profit
    trade_date as stat_date
from t_case1_dwd_user_profit
where trade_date = '2026-03-12'
  • 2.滚表加工(核心步骤) 13日数据加工
代码语言:javascript
复制
insert overwrite table t_case1_dwd_user_profit_status partition (stat_date = '2026-03-13')
select coalesce(t1.user_id, t2.user_id)                   as user_id,
       t2.profit,
       coalesce(t1.total_profit,0) + coalesce(t2.profit, 0) as total_profit
from (
         --昨日状态数据
         select user_id,
                profit,
                total_profit,
                stat_date
         from t_case1_dwd_user_profit_status
         where stat_date = '2026-03-12') t1
         full outer join
     (select user_id,
             profit,
             trade_date as stat_date
      from t_case1_dwd_user_profit
      where trade_date = '2026-03-13'
) t2
on t1.user_id = t2.user_id

14日加工

代码语言:javascript
复制
insert overwrite table t_case1_dwd_user_profit_status partition (stat_date = '2026-03-14')
select coalesce(t1.user_id, t2.user_id)                   as user_id,
       t2.profit,
       coalesce(t1.total_profit,0) + coalesce(t2.profit, 0) as total_profit
from (
         --昨日状态数据
         select user_id,
                profit,
                total_profit,
                stat_date
         from t_case1_dwd_user_profit_status
         where stat_date = '2026-03-13') t1
         full outer join
     (select user_id,
             profit,
             trade_date as stat_date
      from t_case1_dwd_user_profit
      where trade_date = '2026-03-14'
) t2
on t1.user_id = t2.user_id

15日加工

代码语言:javascript
复制
insert overwrite table t_case1_dwd_user_profit_status partition (stat_date = '2026-03-15')
select coalesce(t1.user_id, t2.user_id)                   as user_id,
       t2.profit,
       coalesce(t1.total_profit,0) + coalesce(t2.profit, 0) as total_profit
from (
         --昨日状态数据
         select user_id,
                profit,
                total_profit,
                stat_date
         from t_case1_dwd_user_profit_status
         where stat_date = '2026-03-14') t1
         full outer join
     (select user_id,
             profit,
             trade_date as stat_date
      from t_case1_dwd_user_profit
      where trade_date = '2026-03-15'
) t2
on t1.user_id = t2.user_id

后续日期根据实际情况加工,生产环境使用参数每日处理即可,最终分区表内的数据如下

结果数据

代码语言:javascript
复制
+----------+---------+---------------+-------------+
| user_id  | profit  | total_profit  |  stat_date  |
+----------+---------+---------------+-------------+
| 1001     | 3.00    | 3.00          | 2026-03-12  |
| 1001     | -5.00   | -2.00         | 2026-03-13  |
| 1002     | 5.00    | 5.00          | 2026-03-13  |
| 1001     | NULL    | -2.00         | 2026-03-14  |
| 1002     | 7.00    | 12.00         | 2026-03-14  |
| 1003     | 7.00    | 7.00          | 2026-03-14  |
| 1004     | 7.00    | 7.00          | 2026-03-14  |
| 1001     | -7.00   | -9.00         | 2026-03-15  |
| 1002     | -7.00   | 5.00          | 2026-03-15  |
| 1003     | NULL    | 7.00          | 2026-03-15  |
| 1004     | NULL    | 7.00          | 2026-03-15  |
+----------+---------+---------------+-------------+
11 rows selected (0.525 seconds)
  • 3.查询使用
代码语言:javascript
复制
select stat_date,
       count(case
                 when total_profit > 0 then user_id end) as user_cnt
from t_case1_dwd_user_profit_status
group by stat_date

查询结果

代码语言:javascript
复制
+-------------+-----------+
|  stat_date  | user_cnt  |
+-------------+-----------+
| 2026-03-12  | 1         |
| 2026-03-13  | 1         |
| 2026-03-14  | 3         |
| 2026-03-15  | 3         |
+-------------+-----------+
4 rows selected (0.725 seconds)
本文参与 腾讯云自媒体同步曝光计划,分享自微信公众号。
原始发表:2026-05-19,如有侵权请联系 cloudcommunity@tencent.com 删除

本文分享自 数据仓库技术 微信公众号,前往查看

如有侵权,请联系 cloudcommunity@tencent.com 删除。

本文参与 腾讯云自媒体同步曝光计划  ,欢迎热爱写作的你一起参与!

评论
登录后参与评论
0 条评论
热度
最新
推荐阅读
目录
  • 背景
    • 1.需求内容
  • SQL 解法
    • 方法一:
    • 方法二:
    • 方法三:
  • 分析
    • 解决思路
      • 创建表
      • 生产数据恢复
      • 建模
领券
问题归档专栏文章快讯文章归档关键词归档开发者手册归档开发者手册 Section 归档