文章/答案/技术大牛

发布

社区首页 >问答首页 >在Postgres中复制带条件求和的MongoDB $bucket

问在Postgres中复制带条件求和的MongoDB $bucket
EN

Stack Overflow用户

提问于 2021-03-13 23:03:26

回答 1查看 21关注 0票数 0

我有一个包含数十万行的数据库，其模式如下：

+----+----------+---------+
| id | duration | type    |
+----+----------+---------+
| 1  | 41       | cycling |
+----+----------+---------+
| 2  | 15       | walking |
+----+----------+---------+
| 3  | 6        | walking |
+----+----------+---------+
| 4  | 26       | running |
+----+----------+---------+
| 5  | 30       | cycling |
+----+----------+---------+
| 6  | 13       | running |
+----+----------+---------+
| 7  | 10       | running |
+----+----------+---------+

我之前使用MongoDB聚合来完成此操作，并获得按类型和总计数的活动分布：

{
  $bucket: {
    groupBy: '$duration',
    boundaries: [0, 16, 31, 61, 91, 121],
    default: 121,
    output: {
     total: { $sum: 1 },
     walking: {
       $sum: { $cond: [{ $eq: ['$type', 'walking'] }, 1, 0] },
     },
     running: {
       $sum: { $cond: [{ $eq: ['$type', 'running'] }, 1, 0] },
     },
     cycling: {
       $sum: { $cond: [{ $eq: ['$type', 'cycling'] }, 1, 0] },
     },
   },
  },
}

我刚过渡到使用Postgres，不知道如何在那里做条件和。要获得这样的结果表，查询应该是什么？

+---------------+---------+---------+---------+-------+
| duration_band | walking | running | cycling | total |
+---------------+---------+---------+---------+-------+
| 0-15          | 41      | 21      | 12      | 74    |
+---------------+---------+---------+---------+-------+
| 15-30         | 15      | 1       | 44      | 60    |
+---------------+---------+---------+---------+-------+
| 30-60         | 6       | 56      | 7       | 69    |
+---------------+---------+---------+---------+-------+
| 60-90         | 26      | 89      | 32      | 150   |
+---------------+---------+---------+---------+-------+
| 90-120        | 30      | 0       | 6       | 36    |
+---------------+---------+---------+---------+-------+
| 120+          | 13      | 90      | 0       | 103   |
+---------------+---------+---------+---------+-------+
| Total         | 131     | 257     | 101     | 492   |
+---------------+---------+---------+---------+-------+

mongodb

postgresql

回答 1

Stack Overflow用户

发布于 2021-03-15 03:35:50

SQL非常擅长检索和计算数据，并提供数据，因此获取您想要的值是一项简单的任务。它不擅长格式化结果，这就是为什么这项任务通常留给表示层。然而，这并不意味着它不能完成-它可以在single query中完成。难点在于pivot过程-将行转换为列。但首先要做一些设置。您应该将持续时间数据放在自己的表中(如果还没有)。添加了一个标识符，该标识符随后允许多个标准集(稍后将详细介绍)。我将继续这样做。

create table bands( name text, period int4range, title text );  
insert into bands(name, period, title)                         
     values ('Standard',  '[  0, 15)'::int4range ,  '0 - 15')  
          , ('Standard',  '[ 15, 30)'::int4range , '15 - 30')  
          , ('Standard',  '[ 30, 60)'::int4range , '30 - 60')  
          , ('Standard',  '[ 60, 90)'::int4range , '60 - 00')  
          , ('Standard',  '[ 90,120)'::int4range , '90 - 120') 
          , ('Standard',  '[120,)'::int4range    , '120+');

这将设置您当前的标准。name列是前面提到的标识符，其中title列成为输出上的持续时间段。有趣的列是句点；定义为integer range。在这种情况下，[闭合的，开放的]范围包括第一个数字，但不包括第二年，括号有意义。该定义成为结果查询的核心。查询按如下方式构建：

检索所需的间隔集( [0-5) ... )设置并附加“合计”项。
定义活动列表(循环等)。
将这些集合组合在一起，为每个活动创建间隔集列表。

上面给出了活动间隔，它成为旋转时生成的矩阵。

将“测试”表值合并到上面的列表中，计算每个间隔内每个活动的总时间。这是查询的主要任务。它可以完成所有的calculations.

上面的内容现在包含矩阵中每个单元格的间隔和总活动。然而，它仍然存在于行方向上。

将计算出的结果从行方向旋转到列orientation.

Finally，将旋转后的结果压缩为每个间隔的一行，并设置最终的间隔排序。

结果是：

with buckets ( period , title, ord) as                                                   
     ( select period , title, row_number() over (order by lower(b.period)) ord             ---- 1  
         from bands b 
        where name = 'Standard'
       union all
       select  '[0,)','Total',count(*) + 1 
         from bands b 
        where name = 'Standard'
     )
   , activities (activity) as  ( values ('running'),('walking'),('cycling'), ('Total'))   ---- 2  
   , activity_buckets (period, title, ord, activity) as 
     (select * from buckets cross join activities)                                        ---- 3 

select s2.title     "Duration Band"                                                       ---- 6 
     , max(cycling) "Cycling"
     , max(running) "Running"
     , max(walking) "Walking"
     , max(Total)   "Total "
  from ( select s1.title, s1.ord 
              , case when s1.activity = 'cycling' then duration else null end cycling     ---- 5 
              , case when s1.activity = 'running' then duration else null end running     
              , case when s1.activity = 'walking' then duration else null end walking 
              , case when s1.activity = 'Total'   then duration else null end total     
           from ( select ab.ord, ab.title, ab.activity
                       , sum(coalesce(t.duration,0)) duration                             ---- 4 
                    from activity_buckets ab
                    left join test        t
                      on (    (t.type = ab.activity or ab.activity = 'Total')
                          and  t.duration <@ ab.period                         --** determines which time interval(s) the value belongs
                         )  
                  group by ab.ord, ab.title, ab.activity
                ) s1
       ) s2
 group by s2.ord,s2.title      
 order by s2.ord;

参见demo。它包含了整个过程中的每个主要步骤。此外，它还展示了如何使用为间隔创建表。因为我不喜欢长查询，所以我通常将它们隐藏在SQL函数后面，然后直接使用该函数。Demo也包含这个。

票数 0

页面原文内容由Stack Overflow提供。腾讯云小微IT领域专用引擎提供翻译支持

原文链接：

https://stackoverflow.com/questions/66614951

复制

相似问题

问在Postgres中复制带条件求和的MongoDB $bucket
EN

回答 1

Stack Overflow用户

社区

活动

圈层

关于

腾讯云开发者

热门产品

热门推荐

更多推荐

问在Postgres中复制带条件求和的MongoDB $bucketEN

回答 1

Stack Overflow用户

社区

活动

圈层

关于

腾讯云开发者

热门产品

热门推荐

更多推荐

问在Postgres中复制带条件求和的MongoDB $bucket
EN