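I have the following data, with treatment numbers and their respective customers. I want each treatment to have at most 3 customers per batch, and I want to assign a batch ID to each row. For example, for T00001 the first 3 customers would share one batch ID, and the remaining 2 customers would get another batch ID. Similarly, T00002 would get yet another batch ID, which would contain only its 2 customers. In other words, each batch holds at most 3 customers of a single treatment number.
Current table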
Treatment Number Customer ID
T00001 C01
T00001 C02
T00001 C03
T00001 C04
T00001 C05
T00002 C06
T00002 C07
T00004 C09
Required result
Treatment Number Customer ID Batch ID
T00001 C01 1
T00001 C02 1
T00001 C03 1
T00001 C04 2
T00001 C05 2
T00002 C06 3
T00002 C07 3
T00004 C09 4
Posted on 2018-08-03 02:07:53
I would simply do arithmetic on row_number() to assign a batch id within each treatment:
select t.*,
       floor((row_number() over (partition by treatment order by customer) - 1) / 3) as batch_id
from t;
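To make the arithmetic concrete, here is a small Python sketch (not part of the original answer; the function name `batch_within` is made up for illustration). Since row_number() starts at 1, subtracting 1 and dividing by 3 with truncation maps rows 1 to 3 to batch 0, rows 4 to 6 to batch 1, and so on.

```python
def batch_within(rn: int, size: int = 3) -> int:
    """Mirror floor((rn - 1) / size) for a 1-based row number (hypothetical helper)."""
    return (rn - 1) // size


# Row numbers 1..5, as for the five customers of T00001 in the question.
print([batch_within(rn) for rn in range(1, 6)])  # prints [0, 0, 0, 1, 1]
```

Note that this index restarts at 0 for every treatment, so on its own it is not a globally unique batch ID.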
Then I would use dense_rank() to turn that into a global batch id:
select t.*, dense_rank() over (order by treatment, batch_id_within) as batch_id
from (select t.*,
             floor((row_number() over (partition by treatment order by customer) - 1) / 3) as batch_id_within
      from t
     ) t;
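The two-step idea can be sketched in plain Python to check it against the question's data. This is only an illustration under stated assumptions: the function name `assign_batches` and the tuple layout are invented here, and dense_rank() is emulated by numbering the distinct (treatment, within-treatment batch) pairs in sorted order.

```python
from itertools import groupby


def assign_batches(rows, size=3):
    """rows: (treatment, customer) pairs -> list of (treatment, customer, batch_id)."""
    keyed = []
    # Emulates: row_number() over (partition by treatment order by customer)
    for treatment, group in groupby(sorted(rows), key=lambda r: r[0]):
        for rn, (_, customer) in enumerate(group, start=1):
            keyed.append((treatment, customer, (rn - 1) // size))
    # Emulates: dense_rank() over (order by treatment, batch_id_within)
    distinct = sorted({(t, w) for t, _, w in keyed})
    rank = {key: i for i, key in enumerate(distinct, start=1)}
    return [(t, c, rank[(t, w)]) for t, c, w in keyed]


# Sample data copied from the question.
rows = [
    ("T00001", "C01"), ("T00001", "C02"), ("T00001", "C03"),
    ("T00001", "C04"), ("T00001", "C05"),
    ("T00002", "C06"), ("T00002", "C07"),
    ("T00004", "C09"),
]

for row in assign_batches(rows):
    print(*row)
```

On the question's sample data this yields batch IDs 1, 1, 1, 2, 2, 3, 3, 4, which matches the required result.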
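Posted on 2018-08-03 00:35:02
First, use ROW_NUMBER to assign a sequence number (starting from 0) to the rows of each treatment, in the required order (I'm using the customer_id column).
Define BATCH_START as 1 for each row whose row number is 0, or more generally for all rows with mod(rn,3) = 0.
Then simply compute BATCH_ID as the running sum of BATCH_START.
Example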
with cust2 as (
  select
    TREATMENT_NUBMBER, CUSTOMER_ID,
    row_number() over (partition by TREATMENT_NUBMBER order by CUSTOMER_ID) - 1 rn
  from cust),
cust3 as (
  select TREATMENT_NUBMBER, CUSTOMER_ID, RN,
    case when mod(rn,3) = 0 then 1 end BATCH_START
  from cust2)
select TREATMENT_NUBMBER, CUSTOMER_ID, BATCH_START,
  sum(BATCH_START) over (order by TREATMENT_NUBMBER, CUSTOMER_ID) BATCH_ID
from cust3
order by TREATMENT_NUBMBER, CUSTOMER_ID;
TREATM CUS BATCH_START   BATCH_ID
------ --- ----------- ----------
T00001 C01           1          1
T00001 C02                      1
T00001 C03                      1
T00001 C04           1          2
T00001 C04                      2
T00002 C04           1          3
T00002 C05                      3
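The same flag-and-running-sum idea can be sketched in Python to show why it works. This is an illustration only: the function name `batch_ids_by_running_sum` is invented here, the input is the question's sample data rather than the `cust` table used above, and it assumes each (treatment, customer) pair is unique.

```python
from itertools import groupby


def batch_ids_by_running_sum(rows, size=3):
    """rows: (treatment, customer) pairs -> (treatment, customer, batch_start, batch_id)."""
    result = []
    batch_id = 0  # running sum of BATCH_START across all rows
    for _, group in groupby(sorted(rows), key=lambda r: r[0]):
        # rn starts at 0 within each treatment, like row_number() - 1
        for rn, (treatment, customer) in enumerate(group):
            batch_start = 1 if rn % size == 0 else None  # NULL when not a batch start
            batch_id += batch_start or 0
            result.append((treatment, customer, batch_start, batch_id))
    return result


# Sample data copied from the question.
rows = [
    ("T00001", "C01"), ("T00001", "C02"), ("T00001", "C03"),
    ("T00001", "C04"), ("T00001", "C05"),
    ("T00002", "C06"), ("T00002", "C07"),
    ("T00004", "C09"),
]

for row in batch_ids_by_running_sum(rows):
    print(*row)
```

Every first row of a treatment has rn = 0, so a new batch always starts when the treatment changes, and another one starts after every third customer within a treatment.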
https://stackoverflow.com/questions/51658101