我有一个表,按分钟/小时/天汇总各种产品的销售额,并计算各种指标。
下表列出了根据core_product_tbl
计算的1分钟增量计算。在product_agg_tbl
中完成计算后,其他表在product_agg_tbl
中按小时、天、周等进行计算。
CREATE TABLE product_agg_tbl (
product String,
minute DateTime,
high Nullable(Float32),
low Nullable(Float32),
average AggregateFunction(avg, Nullable(Float32)),
first Nullable(Float32),
last Nullable(Float32),
total_sales Nullable(UInt64)
)
ENGINE = AggregatingMergeTree
PARTITION BY toYYYYMM(minute)
ORDER BY (product, minute);
CREATE MATERIALIZED VIEW product_agg_mv TO product_agg_tbl AS
SELECT
product,
minute,
max(price) AS high,
min(price) AS low,
avgState(price) AS average,
argMin(price, sales_timestamp) AS first,
argMax(price, sales_timestamp) AS last,
sum(batch_size) as total_sales
FROM core_product_tbl
WHERE minute >= today()
GROUP BY product, toStartOfMinute(sales_timestamp) AS minute;
CREATE VIEW product_agg_1w AS
SELECT
product,
toStartOfHour(minute) AS minute,
max(high) AS high,
min(low) AS low,
avgMerge(average) AS average_price,
argMin(first, minute) AS first,
argMax(last, minute) AS last,
sum(total_sales) as total_sales
FROM product_agg_tbl
WHERE minute >= date_sub(today(), interval 7 + 7 day)
GROUP BY product, minute;
我遇到的问题是,当我直接从core_product_tbl
运行下面的查询时,我得到的数字与product_agg_1w
大不相同。这是怎么回事呢?
SELECT
product,
toStartOfHour(minute) AS minute,
max(price) AS high,
min(price) AS low,
avgState(price) AS average,
argMin(price, sales_timestamp) AS first,
argMax(price, sales_timestamp) AS last,
sum(batch_size) as total_sales
FROM core_product_tbl
WHERE minute >= today()
GROUP BY product, toStartOfMinute(sales_timestamp) AS minute;
发布于 2021-10-14 02:32:05
您应该在表AggregatingMergeTree中使用SimpleAggregateFunction或AggregateFunction。
AggregatingMergeTree对实体化视图和实体化视图中的select一无所知。https://den-crane.github.io/Everything_you_should_know_about_materialized_views_commented.pdf
CREATE TABLE product_agg_tbl (
product String,
minute DateTime,
high SimpleAggregateFunction(max, Nullable(Float32)),
low SimpleAggregateFunction(min, Nullable(Float32)),
average AggregateFunction(avg, Nullable(Float32), DateTime),
first AggregateFunction(argMin, Nullable(Float32), DateTime),
last AggregateFunction(argMax, Nullable(Float32),DateTime),
total_sales SimpleAggregateFunction(sum,Nullable(UInt64))
)
ENGINE = AggregatingMergeTree
PARTITION BY toYYYYMM(minute)
ORDER BY (product, minute);
CREATE MATERIALIZED VIEW product_agg_mv TO product_agg_tbl AS
SELECT
product,
minute,
max(price) AS high,
min(price) AS low,
avgState(price) AS average,
argMinState(price, sales_timestamp) AS first,
argMaxState(price, sales_timestamp) AS last,
sum(batch_size) as total_sales
FROM core_product_tbl
WHERE minute >= today()
GROUP BY product, toStartOfMinute(sales_timestamp) AS minute;
CREATE VIEW product_agg_1w AS
SELECT
product,
toStartOfHour(minute) AS minute,
max(high) AS high,
min(low) AS low,
avgMerge(average) AS average_price,
argMinMerge(first, minute) AS first,
argMaxMerge(last, minute) AS last,
sum(total_sales) as total_sales
FROM product_agg_tbl
WHERE minute >= date_sub(today(), interval 7 + 7 day)
GROUP BY product, minute;
不要使用视图(product_agg_1w),因为它会对性能产生反作用。它读取过多的数据。直接使用select to product_agg_tbl。
https://stackoverflow.com/questions/69564279
复制相似问题