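I have a BigQuery table with a date partition key. The table receives records daily, and I am trying to find out whether any days are missing from its three years of history. So I tried the following query: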
SELECT KeyPartitionDate
FROM (
SELECT KeyPartitionDate, DATE(KeyPartitionDate) as day, DATE_ADD(date(KeyPartitionDate), INTERVAL 1 DAY) AS dayplusone
FROM `project.dataset.table`
)
WHERE DATE_DIFF(day, dayplusone , DAY) > 1
GROUP BY KeyPartitionDate
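ORDER BY KeyPartitionDate
The query is valid but returns no results, even though I know some days are missing. My guess is that I messed up the DATE_ADD function, but I can't figure out how.
Posted on 2020-04-15 23:46:11
The following is for BigQuery Standard SQL and returns just the list of missing days: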
#standardSQL
SELECT day AS missing_days
FROM (
SELECT MIN(KeyPartitionDate) min_day, MAX(KeyPartitionDate) max_day
FROM `project.dataset.table`
), UNNEST(GENERATE_DATE_ARRAY(min_day, max_day)) day
LEFT JOIN (
SELECT DISTINCT KeyPartitionDate AS day
FROM `project.dataset.table`
) t
USING(day)
WHERE t.day IS NULL
Posted on 2020-04-15 23:21:20
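Here is where you went wrong. You set:
day = DATE(KeyPartitionDate)
and then:
dayplusone = DATE_ADD(date(KeyPartitionDate), INTERVAL 1 DAY)
which is basically saying dayplusone = day + (1 day).
Then you filter with:
WHERE DATE_DIFF(day, dayplusone , DAY) > 1
DATE_DIFF(a, b, DAY) returns a - b, so this is like saying day - dayplusone > (1 day), which means:
day - (day + (1 day)) > (1 day)
The left side is always -1 day, so no row can ever pass the filter.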
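To see this concretely, here is a minimal sketch that evaluates the same expressions over a few literal dates instead of the real table. The `sample_days` CTE and its dates are made up for illustration, and the sketch assumes the key is already a DATE, so the DATE() casts are dropped.

```sql
#standardSQL
-- Hypothetical sample data; 2020-01-03 is deliberately missing.
WITH sample_days AS (
  SELECT d AS KeyPartitionDate
  FROM UNNEST([DATE '2020-01-01', DATE '2020-01-02', DATE '2020-01-04']) AS d
)
SELECT
  KeyPartitionDate AS day,
  DATE_ADD(KeyPartitionDate, INTERVAL 1 DAY) AS dayplusone,
  -- DATE_DIFF(a, b, DAY) is a - b, so this is day - (day + 1) on every row.
  DATE_DIFF(KeyPartitionDate,
            DATE_ADD(KeyPartitionDate, INTERVAL 1 DAY),
            DAY) AS diff
FROM sample_days
ORDER BY day
```

Every row comes back with diff = -1, whether or not the following day exists in the data, because both operands are computed from the same row. A filter of diff > 1 therefore matches nothing.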
Instead, you need to compare each row's date with the previous row's date. This is done with a window function:
SELECT KeyPartitionDate FROM (
SELECT DISTINCT KeyPartitionDate,
LAG(KeyPartitionDate)
OVER (ORDER BY KeyPartitionDate ASC) AS PreviousKeyPartitionDate
FROM `project.dataset.table`)
-- DATE_DIFF(a, b, DAY) returns a - b, so the later (current) date goes first
WHERE DATE_DIFF(DATE(KeyPartitionDate), DATE(PreviousKeyPartitionDate), DAY) > 1
ORDER BY KeyPartitionDate
https://stackoverflow.com/questions/61232058
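As a rough, self-contained illustration of the window-function approach, the sketch below runs the LAG comparison on inline data rather than on `project.dataset.table`. The `sample_days`, `distinct_days`, and `with_previous` CTEs, the sample dates, and the output column names are all hypothetical. It also assumes the key is a DATE, and it deduplicates before applying LAG so that several records on the same day do not produce zero-day differences.

```sql
#standardSQL
-- Hypothetical sample data: 2020-01-02 appears twice,
-- and 2020-01-03 and 2020-01-04 are missing.
WITH sample_days AS (
  SELECT d AS KeyPartitionDate
  FROM UNNEST([DATE '2020-01-01', DATE '2020-01-02', DATE '2020-01-02',
               DATE '2020-01-05', DATE '2020-01-06']) AS d
),
distinct_days AS (
  -- One row per day, so LAG always sees a strictly earlier date.
  SELECT DISTINCT KeyPartitionDate
  FROM sample_days
),
with_previous AS (
  SELECT
    KeyPartitionDate,
    LAG(KeyPartitionDate) OVER (ORDER BY KeyPartitionDate) AS PreviousKeyPartitionDate
  FROM distinct_days
)
SELECT
  PreviousKeyPartitionDate AS last_day_before_gap,
  KeyPartitionDate AS first_day_after_gap,
  -- Days strictly between the two dates.
  DATE_DIFF(KeyPartitionDate, PreviousKeyPartitionDate, DAY) - 1 AS missing_day_count
FROM with_previous
WHERE DATE_DIFF(KeyPartitionDate, PreviousKeyPartitionDate, DAY) > 1
ORDER BY first_day_after_gap
```

On this sample data the query returns a single row: the gap between 2020-01-02 and 2020-01-05, with 2 missing days. The earliest date has a NULL previous value, so its DATE_DIFF is NULL and the WHERE clause drops it.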