(1).前置条件:
有以下数据集c和e,表查询结果如下图所示。
数据集c表为:test_medicine_buy
数据集e表为:test_medicine_sold
test_medicine_buy表数据:
test_medicine_sold表数据:
(2).题目要求:
如上图的数据集c和e,某医院系统卖给患者阿莫西林X数量后去下购进记录的库存,购进记录可能是零散的。
现在按 iilszh 排序,优先 iilszh 小的,依次下库存如何得到类似下面预期结果的结果集。
(1). 子查询a, 按照 spmc 药品名称进行分组,按 iilszh 正序排列,求得至当前记录的累计 gjsl
查询结果如下:
iilszh spmc gjsl cum_gjsl
1 阿莫西林胶囊 20 20
2 阿莫西林胶囊 50 70
3 阿莫西林胶囊 40 110
4 阿莫西林胶囊 30 140
5 人血白蛋白 120 120
6 人血白蛋白 80 200
7 人血白蛋白 100 300
(2). 子查询c 为子查询a 与数据集e 在 join 后的结果
查询结果如下:
iilszh spmc gjsl cum_gjsl fysl
1 阿莫西林胶囊 20 20 80
2 阿莫西林胶囊 50 70 80
3 阿莫西林胶囊 40 110 80
4 阿莫西林胶囊 30 140 80
5 人血白蛋白 120 120 100
6 人血白蛋白 80 200 100
7 人血白蛋白 100 300 100
(3). 子查询d,每行累计购进数量总和减去总发药数量,求得diff_sl
查询结果如下:
iilszh spmc gjsl cum_gjsl fysl diff_sl
1 阿莫西林胶囊 20 20 80 -60
2 阿莫西林胶囊 50 70 80 -10
3 阿莫西林胶囊 40 110 80 30
4 阿莫西林胶囊 30 140 80 60
5 人血白蛋白 120 120 100 20
6 人血白蛋白 80 200 100 100
7 人血白蛋白 100 300 100 200
(4). 针对子查询d 的查询结果作 case when 判断,最后得出满足需求的结果集,如预期结果栏数据
适用于无需创建物理表的情况下
使用数据集c、e替代物理表:test_medicine_buy和test_medicine_sold,直接复制以下 hive sql 语句,可以在 Apache Hive 环境直接运行,得到以上预期结果数据。
SELECT spmc AS medicine_name,
gjsl AS buy_amount,
(gjsl - (CASE WHEN diff_sl < 0 THEN 0
WHEN diff_sl > gjsl THEN gjsl
ELSE diff_sl END)) AS this_cut_amount,
CASE WHEN diff_sl < 0 THEN 0
WHEN diff_sl > gjsl THEN gjsl
ELSE diff_sl END left_amount
FROM(
SELECT *,(cum_gjsl - fysl) AS diff_sl
FROM (
SELECT a.*,b.fysl
FROM (SELECT *,SUM(gjsl) OVER (PARTITION BY spmc ORDER BY iilszh) AS cum_gjsl
FROM ( select 1 iilszh,'阿莫西林胶囊' spmc,20 gjsl
union all
select 2 iilszh,'阿莫西林胶囊' spmc,50 gjsl
union all
select 3 iilszh,'阿莫西林胶囊' spmc,40 gjsl
union all
select 4 iilszh,'阿莫西林胶囊' spmc,30 gjsl
union all
select 5 iilszh,'人血白蛋白' spmc,120 gjsl
union all
select 6 iilszh,'人血白蛋白' spmc,80 gjsl
union all
select 7 iilszh,'人血白蛋白' spmc,100 gjsl
)c ORDER BY iilszh
) a
JOIN (SELECT spmc, fysl
FROM ( select 1 iilszh,'阿莫西林胶囊' spmc,80 fysl
union all
select 2 iilszh, '人血白蛋白' spmc,100 fysl
)e
) b ON a.spmc = b.spmc
) c ORDER BY iilszh, diff_sl) d
适用于需先创建物理表:test_medicine_buy 和 test_medicine_sold 的情况下
将测试数据 insert 至 test_medicine_buy 和 test_medicine_sold 表。
表数据结果同前置条件中的 test_medicine_buy 和 test_medicine_sold 表数据。
将解题方式一中的数据集c,e 替换成表 test_medicine_buy 和test_medicine_sold 即可。
SELECT spmc AS medicine_name,
gjsl AS buy_amount,
(gjsl - (CASE WHEN diff_sl < 0 THEN 0
WHEN diff_sl > gjsl THEN gjsl
ELSE diff_sl END)) AS this_cut_amount,
CASE WHEN diff_sl < 0 THEN 0
WHEN diff_sl > gjsl THEN gjsl
ELSE diff_sl END left_amount
FROM(
SELECT *,(cum_gjsl - fysl) AS diff_sl
FROM (
SELECT a.*,b.fysl
FROM (SELECT *,SUM(gjsl) OVER (PARTITION BY spmc ORDER BY iilszh) AS cum_gjsl
FROM test_medicine_buy c ORDER BY iilszh
) a
JOIN (SELECT spmc, fysl
FROM test_medicine_sold e
) b ON a.spmc = b.spmc
) c ORDER BY iilszh, diff_sl) d
使用 hive sql 完成这道 Sql 题,所使用到的函数或方法如下:
SUM(gjsl) OVER (PARTITION BY spmc ORDER BY iilszh) AS cum_gjsl
PARTITION BY spmc ORDER BY iilszh:
使用spmc分组, iilszh字段正序排列后求gisl的总和;
(cum_gjsl - fysl) AS diff_sl:
通过spmc分组后的购进总量减去已发药数据,求取差值,作为上面case when的判断条件;
计算本次应下的库存
(gjsl - (CASE WHEN diff_sl < 0 THEN 0
WHEN diff_sl > gjsl THEN gjsl
ELSE diff_sl END)) AS this_cut_amount
计算本次剩余库存
CASE WHEN diff_sl < 0 THEN 0
WHEN diff_sl > gjsl THEN gjsl
ELSE diff_sl END left_amount