第一组查询:
select count(distinct device_ifa) from table_1 where distance_diff <= 50 -- 41504
select count(distinct device_ifa) from table_1 where distance_diff <= 100 -- 69469
select count(distinct device_ifa) from table_1 where distance_diff <= 200 -- 100859
应该与查询的计数完全匹配:
select case when distance_diff <= 50 then '50m'
when distance_diff <= 100 then '100m'
when distance_diff <= 200 then '200m'
end as radius,
count(distinct device_ifa) from table_1
group by 1 order by 1
radius _col1
100m - 62512
200m - 92547
50m - 41504
,但计数不匹配-谁能帮上忙吗?
发布于 2020-03-18 21:33:27
是的,这会给出不同的结果,当你单独运行和案例陈述。个人将给出准确的计数,而案例陈述则会以这种方式给出数据。
选择distance_diff <= 50然后'50m‘时的情况--给出小于50的数据
when distance\_diff <= 100 then '100m' -- will give the data between 50 and 100 as the results were already included as part of 50M when distance\_diff <= 200 then '200m' --this will give between 100 and 200 as < 100 is already part of above statement.
为了获得精确的计数,您可以用这种方式编写case语句。
select
count(distinct case when distance_diff <= 50 then device_ifa end )
Less_than_50, count(distinct case when distance_diff <= 100 then device_ifa end )
Less_than_100 , count(distinct case when distance_diff <= 200 then device_ifa end )
Less_than_200 from table_1
group by 1 order by 1
发布于 2020-03-18 22:14:02
您获得不同计数的原因是由于CASE语句功能。
case when distance_diff <= 50 then '50m'
when distance_diff <= 100 then '100m'
when distance_diff <= 200 then '200m'
end as radius
当您运行上述查询时,数据中存在重叠(例如,<=100条件也包括<=50,<=200类似)。
上述查询的行为如何?:
1.distance_Diff <= 50 :这将将从0到50的所有distance_diff值标记为'50m‘
2.distance_Diff <= 100 :这将将从51到100的所有distance_diff值标记为“100 m”
3.distance_Diff <= 200 :这将将所有distance_diff值从101-200标记为'200m‘。
所以条件1将首先执行。然后,它将检查第二个条件的剩余无标记值,然后执行最后第三个条件。
就像在第一个查询中进行单独计数一样,其中也包括重叠值。
select count(distinct device_ifa) from table_1 where distance_diff <= 50 -- 41504
以上查询计数为distance_diff,从0到50。
select count(distinct device_ifa) from table_1 where distance_diff <= 100 -- 69469
以上查询计数为distance_diff,从0到100。
select count(distinct device_ifa) from table_1 where distance_diff <= 200 -- 100859
上面的查询计数为distance_diff,从0到200。
现在我明白了,为什么结果在这两个查询中是不同的。
发布于 2020-03-19 04:11:39
你似乎在问为什么这些结果会不同。
原因很简单。在第一个查询中,在所有三个组中计算一个25的distance_diff
。它只在第二个< 50
组中计算。
第一个返回一个累积的不同计数。第二个只返回一个范围的不同计数。
如果更改特定范围的单个查询筛选器,则结果应该匹配:
where distance_diff <= 50
where distance_diff > 50 distance_diff <= 100
where distance_diff > 100 distance_diff <= 200
https://stackoverflow.com/questions/60751309
复制相似问题