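I'm brainstorming how to find trends in a dataset that contains a year's worth of transactions.
I want to take the average of the first 25% of the observations and of the remaining 75%, and vice versa.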
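If the whole dataset contained 1,000 observations, I would want the average of the first 250 observations and of the remaining 750, and then of the first 750 and the remaining 250.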
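For the overall average I have: avg(transaction_amount)
I know that for the segmented averages to be useful, I have to order the data by date, as I have already done in my SQL code: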
select avg(transaction_amount)
from example.table
order by transaction_date
I'm now struggling to find a way to split the data into 25% and 75% segments based on the number of observations.
Thanks.
Posted on 2022-04-13 17:54:00
If you are using MSSQL, this is fairly simple, depending on the output you are looking for.
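-- Average over the first 25% of rows, ordered by date
SELECT
    AVG(sub.transaction_amount) AS avg_amt
FROM (
    -- TOP ... PERCENT needs the ORDER BY to define which rows come first
    SELECT TOP 25 PERCENT
        transaction_amount
    FROM example.table
    ORDER BY transaction_date
) AS sub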
Posted on 2022-04-13 18:12:19
Use PERCENT_RANK to see which percentage block a row belongs to. Then use it to group the data:
with data as
(
select t.*, percent_rank() over (order by transaction_amount) as pr
from example.table t
)
select
case when pr <= 0.75 then '0-75%' else '75-100%' end as percent,
avg(transaction_amount) as avg,
avg(avg(transaction_amount)) over () as avg_of_avg
from data
group by case when pr <= 0.75 then '0-75%' else '75-100%' end
union all
select
case when pr <= 0.25 then '0-25%' else '25-100%' end as percent,
avg(transaction_amount) as avg,
avg(avg(transaction_amount)) over () as avg_of_avg
from data
group by case when pr <= 0.25 then '0-25%' else '25-100%' end;
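If the split should follow the date order described in the question rather than the amount, a variation using ROW_NUMBER and a total row count might look like the sketch below. The sample_data rows are made up for illustration and stand in for example.table. The inline VALUES list in the CTE is written with PostgreSQL or SQLite in mind and would likely need adjusting for other engines.

```sql
-- Hypothetical sample data standing in for example.table (8 rows).
with sample_data (transaction_date, transaction_amount) as (
  values
    ('2022-01-01', 10),
    ('2022-02-01', 20),
    ('2022-03-01', 30),
    ('2022-04-01', 40),
    ('2022-05-01', 50),
    ('2022-06-01', 60),
    ('2022-07-01', 70),
    ('2022-08-01', 80)
),
numbered as (
  select
    transaction_amount,
    row_number() over (order by transaction_date) as rn,  -- position by date
    count(*) over () as total_rows                        -- same value on every row
  from sample_data
)
select
  -- rn * 4 <= total_rows      means the row is in the first 25%
  -- rn * 4 <= total_rows * 3  means the row is in the first 75%
  avg(case when rn * 4 <= total_rows     then transaction_amount end) as avg_first_25,
  avg(case when rn * 4 >  total_rows     then transaction_amount end) as avg_last_75,
  avg(case when rn * 4 <= total_rows * 3 then transaction_amount end) as avg_first_75,
  avg(case when rn * 4 >  total_rows * 3 then transaction_amount end) as avg_last_25
from numbered;
```

With the eight sample rows, the four averages come out as 15, 55, 35 and 75. Comparing rn * 4 against total_rows keeps the cutoff in integer arithmetic, so the 25% boundary does not need rounding when the row count is not a multiple of four.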
https://stackoverflow.com/questions/71861813