前往小程序,Get更优阅读体验!
立即前往
首页
学习
活动
专区
工具
TVP
发布
社区首页 >专栏 >LeetCode面试SQL-用户购买平台

LeetCode面试SQL-用户购买平台

作者头像
数据仓库晨曦
发布2024-09-24 18:46:03
900
发布2024-09-24 18:46:03
举报
文章被收录于专栏:数据仓库技术

一、题目

支出表: t2_spending

代码语言:javascript
复制
+-------------+---------+
| Column Name | Type    |
+-------------+---------+
| user_id     | int     |
| spend_date  | string  |
| platform    | string  | 
| amount      | int     |
+-------------+---------+
  • 这张表记录了用户在一个在线购物网站的支出历史,该在线购物平台同时拥有桌面端('desktop')和手机端('mobile')的应用程序。
  • 这张表的主键是 (user_id, spend_date, platform)。

写一段 SQL 来查找每天 使用手机端用户、 使用桌面端用户和 同时 使用桌面端和手机端的用户人数和总支出金额。

查询结果格式如下例所示: t2_spending table:

代码语言:javascript
复制
+---------+------------+----------+--------+
| user_id | spend_date | platform | amount |
+---------+------------+----------+--------+
| 1       | 2019-07-01 | mobile   | 100    |
| 1       | 2019-07-01 | desktop  | 100    |
| 2       | 2019-07-01 | mobile   | 100    |
| 2       | 2019-07-02 | mobile   | 100    |
| 3       | 2019-07-01 | desktop  | 100    |
| 3       | 2019-07-02 | desktop  | 100    |
+---------+------------+----------+--------+

Result table:

代码语言:javascript
复制
+------------+----------+--------------+-------------+
| spend_date | platform | total_amount | total_users |
+------------+----------+--------------+-------------+
| 2019-07-01 | desktop  | 100          | 1           |
| 2019-07-01 | mobile   | 100          | 1           |
| 2019-07-01 | both     | 200          | 1           |
| 2019-07-02 | desktop  | 100          | 1           |
| 2019-07-02 | mobile   | 100          | 1           |
| 2019-07-02 | both     | 0            | 0           |
+------------+----------+--------------+-------------+ 

在 2019-07-01, 用户1 同时 使用桌面端和手机端购买, 用户2 仅 使用了手机端购买,而用户3 仅 使用了桌面端购买。 在 2019-07-02, 用户2 仅 使用了手机端购买, 用户3 仅 使用了桌面端购买,且没有用户 同时 使用桌面端和手机端购买。

二、分析

本题难点在于:1。对于仅使用一个平台的列出其平台,而对于使用两个平台的需要标记为both;2.对于07-02没有both的,需要进行显示为0的处理。

维度

评分

题目难度

⭐️⭐️⭐️

题目清晰度

⭐️⭐️⭐️⭐️⭐️

业务常见度

⭐️⭐️⭐️

三、SQL

1.区分desktop、mobile、both

由于spark不支持count(distinct )开窗,所以我们先计算一下每个用户每天使用的平台个数。 然后原表与新表关联,计算出每个用户每天使用的new_platform平台类型。

执行SQL

代码语言:javascript
复制
select t1.spend_date,
       t1.user_id,
       t1.platform,
       t1.amount,
       t2.plat_cnt,
       if(t2.plat_cnt = 1, t1.platform, 'both') as new_platform
from t2_spending t1
         left join
     (select spend_date,
             user_id,
             count(distinct platform) as plat_cnt
      from t2_spending
      group by spend_date,
               user_id) t2
     on t1.spend_date = t2.spend_date
         and t1.user_id = t2.user_id

SQL结果

代码语言:javascript
复制
+-------------+----------+-----------+---------+-----------+---------------+
| spend_date  | user_id  | platform  | amount  | plat_cnt  | new_platform  |
+-------------+----------+-----------+---------+-----------+---------------+
| 2019-01-01  | 1        | mobile    | 100     | 2         | both          |
| 2019-01-01  | 1        | desktop   | 100     | 2         | both          |
| 2019-01-01  | 2        | mobile    | 100     | 1         | mobile        |
| 2019-01-02  | 2        | mobile    | 100     | 1         | mobile        |
| 2019-01-01  | 3        | desktop   | 100     | 1         | desktop       |
| 2019-01-02  | 3        | desktop   | 100     | 1         | desktop       |
+-------------+----------+-----------+---------+-----------+---------------+

2.使用new_plateform 计算各项指标

我们使用新的new_platform来计算各项结果

执行SQL

代码语言:javascript
复制
select spend_date,
       new_platform,
       sum(amount)             as total_amount,
       count(distinct user_id) as total_users
from (select t1.spend_date,
             t1.user_id,
             t1.platform,
             t1.amount,
             t2.plat_cnt,
             if(t2.plat_cnt = 1, t1.platform, 'both') as new_platform
      from t2_spending t1
               left join
           (select spend_date,
                   user_id,
                   count(distinct platform) as plat_cnt
            from t2_spending
            group by spend_date,
                     user_id) t2
           on t1.spend_date = t2.spend_date
               and t1.user_id = t2.user_id) tt
group by spend_date,
         new_platform
order by 1 asc

SQL结果

代码语言:javascript
复制
+-------------+---------------+---------------+--------------+
| spend_date  | new_platform  | total_amount  | total_users  |
+-------------+---------------+---------------+--------------+
| 2019-01-01  | mobile        | 100           | 1            |
| 2019-01-01  | desktop       | 100           | 1            |
| 2019-01-01  | both          | 200           | 1            |
| 2019-01-02  | desktop       | 100           | 1            |
| 2019-01-02  | mobile        | 100           | 1            |
+-------------+---------------+---------------+--------------+

3.补充维表得到最终结果

需要每天展示不同平台的全部数据,所以我们先创建一张包含mobile,desktop,both类型与全量日期组合的数据表

执行SQL

代码语言:javascript
复制
select ttt1.spend_date,
       ttt1.platform,
       coalesce(total_amount, 0) as total_amount,
       coalesce(total_users, 0)  as total_users
from (
--生成日期+平台类型的全量的数据
         select spend_date, 'desktop' as platform
         from t2_spending
         group by spend_date
         union all
         select spend_date, 'mobile' as platform
         from t2_spending
         group by spend_date
         union all
         select spend_date, 'both' as platform
         from t2_spending
         group by spend_date) ttt1
         left join
     (select spend_date,
             new_platform,
             sum(amount)             as total_amount,
             count(distinct user_id) as total_users
      from (select t1.spend_date,
                   t1.user_id,
                   t1.platform,
                   t1.amount,
                   t2.plat_cnt,
                   if(t2.plat_cnt = 1, t1.platform, 'both') as new_platform
            from t2_spending t1
                     left join
                 (select spend_date,
                         user_id,
                         count(distinct platform) as plat_cnt
                  from t2_spending
                  group by spend_date,
                           user_id) t2
                 on t1.spend_date = t2.spend_date
                     and t1.user_id = t2.user_id) tt
      group by spend_date,
               new_platform) ttt2
     on ttt1.platform = ttt2.new_platform
         and ttt1.spend_date = ttt2.spend_date
order by 1 asc

SQL结果

代码语言:javascript
复制
+-------------+-----------+---------------+--------------+
| spend_date  | platform  | total_amount  | total_users  |
+-------------+-----------+---------------+--------------+
| 2019-01-01  | desktop   | 100           | 1            |
| 2019-01-01  | mobile    | 100           | 1            |
| 2019-01-01  | both      | 200           | 1            |
| 2019-01-02  | desktop   | 100           | 1            |
| 2019-01-02  | mobile    | 100           | 1            |
| 2019-01-02  | both      | 0             | 0            |
+-------------+-----------+---------------+--------------+

四、建表语句和数据插入

代码语言:javascript
复制
--建表语句
CREATE TABLE t2_spending(
user_id int,
spend_date string,
platform string,
amount int
) COMMENT '支出表'
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
;
-- 插入数据
insert into t2_spending(user_id,spend_date,platform,amount)
values
(1,'2019-01-01','mobile',100),
(1,'2019-01-01','desktop',100),
(2,'2019-01-01','mobile',100),
(2,'2019-01-02','mobile',100),
(3,'2019-01-01','desktop',100),
(3,'2019-01-02','desktop',100);
本文参与 腾讯云自媒体同步曝光计划,分享自微信公众号。
原始发表:2024-09-24,如有侵权请联系 cloudcommunity@tencent.com 删除

本文分享自 数据仓库技术 微信公众号,前往查看

如有侵权,请联系 cloudcommunity@tencent.com 删除。

本文参与 腾讯云自媒体同步曝光计划  ,欢迎热爱写作的你一起参与!

评论
登录后参与评论
0 条评论
热度
最新
推荐阅读
目录
  • 一、题目
  • 二、分析
  • 三、SQL
    • 1.区分desktop、mobile、both
      • 2.使用new_plateform 计算各项指标
        • 3.补充维表得到最终结果
        • 四、建表语句和数据插入
        领券
        问题归档专栏文章快讯文章归档关键词归档开发者手册归档开发者手册 Section 归档