现有一份数据记录了用户进入和离开游泳池的时间,请找出一天中泳池最多时有几个人
--样例数据
1 enter 2024-04-07 08:01:00
2 enter 2024-04-07 08:20:00
1 leave 2024-04-07 09:30:00
3 enter 2024-04-07 10:01:00
4 enter 2024-04-07 10:25:00
2 leave 2024-04-07 11:22:00
4 leave 2024-04-07 12:20:00
5 enter 2024-04-07 12:30:00
6 enter 2024-04-07 13:05:00
3 leave 2024-04-07 14:20:00
5 leave 2024-04-07 14:30:00
7 enter 2024-04-07 14:45:00
8 enter 2024-04-07 15:18:00
7 leave 2024-04-07 15:32:00
6 leave 2024-04-07 15:45:00
9 enter 2024-04-07 16:01:00
10 enter 2024-04-07 16:10:00
9 leave 2024-04-07 16:25:00
8 leave 2024-04-07 16:35:00
10 leave 2024-04-07 16:55:00
这个题目是根据日志内容,计算时点状态数据,类似的还有根据日志计算直播间用户人数、游戏同时在线人数等。考察内容为累积求和及其使用技巧,属于知识面和经验的考察,如果知道则十分简单。
维度 | 评分 |
---|---|
题目难度 | ⭐️⭐️⭐️⭐️ |
题目清晰度 | ⭐️⭐️⭐️⭐️⭐ |
业务常见度 | ⭐️⭐️⭐️⭐️ |
1)根据进入和离开的状态,计算泳池人数的变化,进入记为1,离开记为-1
select
user_id,
log_type,
log_time,
if(log_type ='enter',1,-1) as enter_cnt
from t_user_pool_log
查询结果
2)对enter_cnt进行累积求和,即对sum()函数进行开窗,然后开窗内根据时间进行排序。
select
user_id,
log_type,
log_time,
enter_cnt,
sum(enter_cnt)over(order by log_time asc) as pool_user_cnt
from
(
select
user_id,
log_type,
log_time,
if(log_type ='enter',1,-1) as enter_cnt
from t_user_pool_log
) t
查询结果
3)对泳池内的用户数计算最大值,则得出当天泳池内的最大人数。
select
max(pool_user_cnt)
from
(
select
user_id,
log_type,
log_time,
enter_cnt,
sum(enter_cnt)over(order by log_time asc) as pool_user_cnt
from
(
select
user_id,
log_type,
log_time,
if(log_type ='enter',1,-1) as enter_cnt
from t_user_pool_log
) t
)tt
查询结果
--建表语句
CREATE TABLE t_user_pool_log (
user_id INT,
log_type STRING,
log_time STRING
)
COMMENT '泳池进出日志表'
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
STORED AS TEXTFILE;
-- 样例数据插入
insert into t_user_pool_log values
(1,'enter','2024-04-07 08:01:00'),
(2,'enter','2024-04-07 08:20:00'),
(1,'leave','2024-04-07 09:30:00'),
(3,'enter','2024-04-07 10:01:00'),
(4,'enter','2024-04-07 10:25:00'),
(2,'leave','2024-04-07 11:22:00'),
(4,'leave','2024-04-07 12:20:00'),
(5,'enter','2024-04-07 12:30:00'),
(6,'enter','2024-04-07 13:05:00'),
(3,'leave','2024-04-07 14:20:00'),
(5,'leave','2024-04-07 14:30:00'),
(7,'enter','2024-04-07 14:45:00'),
(8,'enter','2024-04-07 15:18:00'),
(7,'leave','2024-04-07 15:32:00'),
(6,'leave','2024-04-07 15:45:00'),
(9,'enter','2024-04-07 16:01:00'),
(10,'enter','2024-04-07 16:10:00'),
(9,'leave','2024-04-07 16:25:00'),
(8,'leave','2024-04-07 16:35:00'),
(10,'leave','2024-04-07 16:55:00');