
How to fill missing dates in Google BigQuery?

Stack Overflow user
Asked 2019-06-22 23:35:23
Answers: 2, Views: 818, Followers: 0, Votes: 0

I want to build a chart that shows active users in Firebase.

I wrote this query:

SELECT event_date, COUNT(DISTINCT user_pseudo_id) AS user_count
FROM `mark-3314e.analytics_197261162.events_*`  
WHERE _TABLE_SUFFIX BETWEEN FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 7 DAY)) AND FORMAT_DATE('%Y%m%d', CURRENT_DATE())
AND event_name = 'session_start'
GROUP BY event_date
ORDER BY event_date ASC

This is the result:

Row event_date  user_count  
1   20190617        1
2   20190621        3

Is there a way to fill the missing dates between the 17th and the 21st with the previous value? Like this:

event_date  user_count  
20190617        1
20190618        1
20190619        1
20190620        1
20190621        3
2 Answers

Stack Overflow user

Answered 2019-06-22 23:39:22

You can join against a calendar table that contains the full date range of interest:

WITH dates AS (
    SELECT '20190617' AS dt UNION ALL
    SELECT '20190618' UNION ALL
    SELECT '20190619' UNION ALL
    SELECT '20190620' UNION ALL
    SELECT '20190621'
)

SELECT
    t1.dt AS event_date,
    COUNT(DISTINCT t2.user_pseudo_id) AS user_count
FROM dates t1
LEFT JOIN `mark-3314e.analytics_197261162.events_*` t2
    ON t1.dt = t2.event_date AND
       t2._TABLE_SUFFIX BETWEEN FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 7 DAY)) AND FORMAT_DATE('%Y%m%d', CURRENT_DATE())
       AND t2.event_name = 'session_start'
GROUP BY
    t1.dt
ORDER BY
    t1.dt;

For a more general way to generate a date range in BigQuery, see this SO question.
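As a rough sketch of that more general approach, the hard-coded calendar can be built with GENERATE_DATE_ARRAY instead of a chain of UNION ALL. Note that the join above returns 0 for days without events, whereas the question asks for the previous value to be carried forward; one way to get that is LAST_VALUE with IGNORE NULLS. The CTE names `data` and `calendar` are illustrative, and the two inline rows are copied from the question's result so the query runs without access to the Firebase export.

```sql
-- Sample rows copied from the question's result; in practice this CTE
-- would be the aggregate over the events_* tables.
WITH data AS (
  SELECT DATE '2019-06-17' AS event_date, 1 AS user_count UNION ALL
  SELECT DATE '2019-06-21', 3
),
-- One row per day across the range of interest.
calendar AS (
  SELECT dt
  FROM UNNEST(GENERATE_DATE_ARRAY(DATE '2019-06-17', DATE '2019-06-21', INTERVAL 1 DAY)) AS dt
)
SELECT
  FORMAT_DATE('%Y%m%d', c.dt) AS event_date,
  -- Days with no match have a NULL count after the LEFT JOIN;
  -- carry the most recent non-NULL count forward into them.
  LAST_VALUE(d.user_count IGNORE NULLS) OVER (
    ORDER BY c.dt
    ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
  ) AS user_count
FROM calendar AS c
LEFT JOIN data AS d
  ON d.event_date = c.dt
ORDER BY c.dt;
```

With the two sample rows, 20190618 through 20190620 should each show a count of 1, which matches the output requested in the question. Days before the first observed date would stay NULL, because there is no earlier value to carry forward.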

Votes: 1

Stack Overflow user

Answered 2019-06-23 09:55:57

Here is one possible solution using BigQuery's GENERATE_DATE_ARRAY function.

with data as (
   -- Daily distinct users; event_date is parsed from its 'YYYYMMDD' string into a DATE
   SELECT parse_date('%Y%m%d', event_date) AS event_date, COUNT(DISTINCT user_pseudo_id) AS user_count
   FROM `mark-3314e.analytics_197261162.events_*`
   WHERE _TABLE_SUFFIX BETWEEN FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 7 DAY)) AND FORMAT_DATE('%Y%m%d', CURRENT_DATE())
   AND event_name = 'session_start'
   -- Group by select column 1 so the alias and the source column of the same name cannot be confused
   GROUP BY 1
)

select dt as event_date, user_count from (
  select user_count,
      -- Expand each observed date into the run of days ending the day before the next observed date.
      -- The last observed date has no successor, so it expands to itself only.
      generate_date_array(
        date,
        if(nextdate is null, date, date_sub(nextdate, interval 1 day)),
        interval 1 day
      ) as dates
  from (
          select
            event_date as date,
            lead(event_date) over(order by event_date) as nextdate,
            user_count
          from data
      )
), unnest(dates) dt
order by dt
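
To see the expansion on the question's own numbers, here is a minimal self-contained sketch of the same idea: each observed date is stretched up to the day before the next observed date. The inline rows are copied from the question, and the names `spans` and `next_date` are illustrative rather than part of the answer.

```sql
-- Sample rows copied from the question's result.
WITH data AS (
  SELECT DATE '2019-06-17' AS event_date, 1 AS user_count UNION ALL
  SELECT DATE '2019-06-21', 3
),
-- Pair every observed date with the next observed date (NULL for the last one).
spans AS (
  SELECT
    event_date,
    user_count,
    LEAD(event_date) OVER (ORDER BY event_date) AS next_date
  FROM data
)
SELECT dt AS event_date, user_count
FROM spans,
  -- Expand each span into one row per day; the last span covers only its own date.
  UNNEST(GENERATE_DATE_ARRAY(
    event_date,
    IFNULL(DATE_SUB(next_date, INTERVAL 1 DAY), event_date),
    INTERVAL 1 DAY)) AS dt
ORDER BY dt;
```

For the sample rows, 2019-06-17 expands to the four days through 2019-06-20 with a count of 1, and 2019-06-21 stays a single row with a count of 3. This approach only fills gaps between observed dates; it does not extend past the last date that has data.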
Votes: 0
Original page content provided by Stack Overflow.
Original link: https://stackoverflow.com/questions/56716782
