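I have a dataset where I need to keep the same granularity, but I need to fix some rows based on a set of conditions.
When a customer transitions from "Active" to "Cancelled", I need to carry the first cancellation date I see in that transition onto every following row; this is the "DATE_NEW" column in my desired output. You can see that the date is the same as the first date seen when the status changes from A -> C.
Example: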
row_number,Customer,Status,Date,DATE_NEW
1,John,"A","3000-12-31","3000-12-31"
2,John,"C","2019-01-01","2019-01-01"
3,John,"A","3000-12-31","3000-12-31"
4,John,"C","2019-05-01","2019-05-01"
5,John,"C","2019-07-31","2019-05-01"
6,Eve,"A","3000-12-31","3000-12-31"
7,Eve,"C","2019-06-01","2019-06-01"
8,Eve,"C","2019-03-01","2019-06-01"
9,Eve,"C","2019-03-02","2019-06-01"
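Posted on 2019-08-07 12:34:02
If your database is one of Postgres, Oracle, Db2, Firebird, or MySQL, you can use FIRST_VALUE(), which returns the first value in an ordered set of values. This link, from the SQL Server documentation, will help you: https://docs.microsoft.com/en-us/sql/t-sql/functions/first-value-transact-sql?view=sql-server-2017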
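As a rough illustration of what FIRST_VALUE() returns, here is a minimal sketch using Python's built-in sqlite3 module. SQLite is an assumption made only so the example is runnable (the question does not name a database, and window functions need SQLite 3.25 or later); the table name t and the three rows are Eve's cancelled rows from the sample.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE t ("row_number" INTEGER, Customer TEXT, Status TEXT, "Date" TEXT)')
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?)",
    [
        (7, "Eve", "C", "2019-06-01"),
        (8, "Eve", "C", "2019-03-01"),
        (9, "Eve", "C", "2019-03-02"),
    ],
)

# FIRST_VALUE picks the value from the first row of the window,
# where "first" is defined by the ORDER BY inside OVER (...).
first_dates = conn.execute(
    """
    SELECT "row_number",
           FIRST_VALUE("Date") OVER (PARTITION BY Customer ORDER BY "row_number") AS first_date
    FROM t
    ORDER BY "row_number"
    """
).fetchall()

for row in first_dates:
    print(row)
```

Every row gets the date of row 7, because row 7 comes first in the window's ordering. On its own this does not separate one A -> C transition from the next; the partition still has to identify each transition.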
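Posted on 2019-08-07 13:13:47
My inclination is to treat this as a gaps-and-islands problem. The islands are determined by counting the preceding A rows, so that every C row can be associated with one transition. Once the rows are numbered this way, you just find the earliest date.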
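The counting step described above can be sketched in isolation. This is only an illustration, assuming SQLite (via Python's sqlite3 module, version 3.25 or later for window functions) and a table named t that holds the sample's row_number, Customer, and Status columns; the dates are left out because they play no part in the labeling.

```python
import sqlite3

# (row_number, Customer, Status) taken from the question's sample
ROWS = [
    (1, "John", "A"), (2, "John", "C"), (3, "John", "A"),
    (4, "John", "C"), (5, "John", "C"),
    (6, "Eve", "A"), (7, "Eve", "C"), (8, "Eve", "C"), (9, "Eve", "C"),
]

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE t ("row_number" INTEGER, Customer TEXT, Status TEXT)')
conn.executemany("INSERT INTO t VALUES (?, ?, ?)", ROWS)

# Running count of 'A' rows per customer: each 'A' row opens a new island,
# and the 'C' rows that follow inherit its number.
islands = conn.execute(
    """
    SELECT "row_number", Customer, Status,
           COUNT(CASE WHEN Status = 'A' THEN 1 END)
               OVER (PARTITION BY Customer ORDER BY "row_number") AS acnt
    FROM t
    ORDER BY "row_number"
    """
).fetchall()

for row in islands:
    print(row)
```

On the sample, John's row 2 gets acnt = 1 while rows 4 and 5 share acnt = 2, so his two cancellations land in separate islands. The query from the answer, which applies the same count and then takes the minimum date per island, follows.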
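with data as (
    select *,
           -- running count of 'A' rows per customer; each new 'A' starts a new island
           count(case when Status = 'A' then 1 end)
               over (partition by Customer order by "row_number") as acnt
    from T
)
select *,
       -- earliest date among the rows that share customer, status, and island
       min("Date") over (partition by Customer, Status, acnt) as date_new
from data
order by "row_number";
Is "row_number" an actual column in your data? If not, you would substitute another ranking column.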
Posted on 2019-08-07 20:10:34
Shawn's approach is correct, but the logic isn't quite right. It will set the wrong date for the "A" rows.
So:
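with cte as (
    select t.*,
           -- order by the row sequence rather than Date: 'A' rows carry the
           -- placeholder date 3000-12-31 and would otherwise sort last
           sum(case when Status = 'A' then 1 else 0 end
              ) over (partition by Customer order by "row_number") as acnt
    from t
)
select cte.*,
       -- 'A' rows keep their own date; 'C' rows take the earliest date in their island
       (case when Status = 'A' then Date
             else min(Date) over (partition by Customer, Status, acnt)
        end) as date_new
from cte;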
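As a check against the question's sample, the sketch below runs a CASE-based query of this shape with Python's sqlite3 module. Several things here are assumptions rather than part of the answer: SQLite as the engine (3.25 or later), the island count ordered by "row_number" (the A rows' placeholder date 3000-12-31 would sort them last if the count were ordered by Date), and a FIRST_VALUE column computed next to MIN so that both can be compared with DATE_NEW.

```python
import sqlite3

# (row_number, Customer, Status, Date, DATE_NEW) copied from the question
SAMPLE = [
    (1, "John", "A", "3000-12-31", "3000-12-31"),
    (2, "John", "C", "2019-01-01", "2019-01-01"),
    (3, "John", "A", "3000-12-31", "3000-12-31"),
    (4, "John", "C", "2019-05-01", "2019-05-01"),
    (5, "John", "C", "2019-07-31", "2019-05-01"),
    (6, "Eve", "A", "3000-12-31", "3000-12-31"),
    (7, "Eve", "C", "2019-06-01", "2019-06-01"),
    (8, "Eve", "C", "2019-03-01", "2019-06-01"),
    (9, "Eve", "C", "2019-03-02", "2019-06-01"),
]

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE t ("row_number" INTEGER, Customer TEXT, Status TEXT, "Date" TEXT)')
conn.executemany("INSERT INTO t VALUES (?, ?, ?, ?)", [r[:4] for r in SAMPLE])

rows = conn.execute(
    """
    WITH cte AS (
        SELECT t.*,
               SUM(CASE WHEN Status = 'A' THEN 1 ELSE 0 END)
                   OVER (PARTITION BY Customer ORDER BY "row_number") AS acnt
        FROM t
    )
    SELECT CASE WHEN Status = 'A' THEN "Date"
                ELSE MIN("Date") OVER (PARTITION BY Customer, Status, acnt)
           END AS by_min,
           CASE WHEN Status = 'A' THEN "Date"
                ELSE FIRST_VALUE("Date") OVER (
                         PARTITION BY Customer, Status, acnt ORDER BY "row_number")
           END AS by_first
    FROM cte
    ORDER BY "row_number"
    """
).fetchall()

expected = [r[4] for r in SAMPLE]
by_min = [r[0] for r in rows]
by_first = [r[1] for r in rows]

# Row numbers where the MIN-based column differs from the desired DATE_NEW
mismatched = [i + 1 for i, (m, e) in enumerate(zip(by_min, expected)) if m != e]
print("FIRST_VALUE matches DATE_NEW:", by_first == expected)
print("rows where MIN differs:", mismatched)
```

On this sample the two columns agree for John but not for Eve: her cancelled rows are not in ascending date order, so MIN returns 2019-03-01 where the desired output shows 2019-06-01. If "first" means first by row order rather than earliest by calendar, the FIRST_VALUE variant may be the closer fit.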
https://stackoverflow.com/questions/57386762