首页
学习
活动
专区
圈层
工具
发布
首页
学习
活动
专区
圈层
工具
MCP广场
社区首页 >问答首页 >AggregatingMergeTree未正确聚合插入

AggregatingMergeTree未正确聚合插入
EN

Stack Overflow用户
提问于 2021-10-14 02:05:31
回答 1查看 70关注 0票数 1

我有一个表,按分钟/小时/天汇总各种产品的销售额,并计算各种指标。

下表列出了根据core_product_tbl计算的1分钟增量计算。在product_agg_tbl中完成计算后,其他表在product_agg_tbl中按小时、天、周等进行计算。

代码语言:javascript
运行
复制
CREATE TABLE product_agg_tbl (
  product String,
  minute DateTime,
  high Nullable(Float32),
  low Nullable(Float32),
  average AggregateFunction(avg, Nullable(Float32)),
  first Nullable(Float32),
  last Nullable(Float32),
  total_sales Nullable(UInt64)
)
ENGINE = AggregatingMergeTree
PARTITION BY toYYYYMM(minute)
ORDER BY (product, minute);

CREATE MATERIALIZED VIEW product_agg_mv TO product_agg_tbl AS
SELECT 
  product,
  minute,
  max(price) AS high,
  min(price) AS low,
  avgState(price) AS average,
  argMin(price, sales_timestamp) AS first,
  argMax(price, sales_timestamp) AS last,
  sum(batch_size) as total_sales
FROM  core_product_tbl
WHERE minute >= today()
GROUP BY product, toStartOfMinute(sales_timestamp) AS minute;

CREATE VIEW product_agg_1w AS
SELECT
    product,
    toStartOfHour(minute) AS minute,
    max(high) AS high,
    min(low) AS low,
    avgMerge(average) AS average_price,
    argMin(first, minute) AS first,
    argMax(last, minute) AS last,
    sum(total_sales) as total_sales
FROM product_agg_tbl
WHERE minute >= date_sub(today(), interval 7 + 7 day)
GROUP BY  product, minute;

我遇到的问题是,当我直接从core_product_tbl运行下面的查询时,我得到的数字与product_agg_1w大不相同。这是怎么回事呢?

代码语言:javascript
运行
复制
SELECT 
  product,
  toStartOfHour(minute) AS minute,
  max(price) AS high,
  min(price) AS low,
  avgState(price) AS average,
  argMin(price, sales_timestamp) AS first,
  argMax(price, sales_timestamp) AS last,
  sum(batch_size) as total_sales
FROM  core_product_tbl
WHERE minute >= today()
GROUP BY product, toStartOfMinute(sales_timestamp) AS minute;
EN

回答 1

Stack Overflow用户

回答已采纳

发布于 2021-10-14 02:32:05

您应该在表AggregatingMergeTree中使用SimpleAggregateFunction或AggregateFunction。

AggregatingMergeTree对实体化视图和实体化视图中的select一无所知。https://den-crane.github.io/Everything_you_should_know_about_materialized_views_commented.pdf

代码语言:javascript
运行
复制
CREATE TABLE product_agg_tbl (
  product String,
  minute DateTime,
  high SimpleAggregateFunction(max, Nullable(Float32)),
  low SimpleAggregateFunction(min, Nullable(Float32)),
  average AggregateFunction(avg, Nullable(Float32), DateTime),
  first AggregateFunction(argMin, Nullable(Float32), DateTime),
  last AggregateFunction(argMax, Nullable(Float32),DateTime),
  total_sales SimpleAggregateFunction(sum,Nullable(UInt64))
)
ENGINE = AggregatingMergeTree
PARTITION BY toYYYYMM(minute)
ORDER BY (product, minute);

CREATE MATERIALIZED VIEW product_agg_mv TO product_agg_tbl AS
SELECT 
  product,
  minute,
  max(price) AS high,
  min(price) AS low,
  avgState(price) AS average,
  argMinState(price, sales_timestamp) AS first,
  argMaxState(price, sales_timestamp) AS last,
  sum(batch_size) as total_sales
FROM  core_product_tbl
WHERE minute >= today()
GROUP BY product, toStartOfMinute(sales_timestamp) AS minute;

CREATE VIEW product_agg_1w AS
SELECT
    product,
    toStartOfHour(minute) AS minute,
    max(high) AS high,
    min(low) AS low,
    avgMerge(average) AS average_price,
    argMinMerge(first, minute) AS first,
    argMaxMerge(last, minute) AS last,
    sum(total_sales) as total_sales
FROM product_agg_tbl
WHERE minute >= date_sub(today(), interval 7 + 7 day)
GROUP BY  product, minute;

不要使用视图(product_agg_1w),因为它会对性能产生反作用。它读取过多的数据。直接使用select to product_agg_tbl。

票数 3
EN
页面原文内容由Stack Overflow提供。腾讯云小微IT领域专用引擎提供翻译支持
原文链接:

https://stackoverflow.com/questions/69564279

复制
相关文章

相似问题

领券
问题归档专栏文章快讯文章归档关键词归档开发者手册归档开发者手册 Section 归档