对于大日期过滤器,我有一个非常快速的查询
EXPLAIN ANALYZE
SELECT "advertisings"."id",
"advertisings"."page_id",
"advertisings"."page_name",
"advertisings"."created_at",
"posts"."image_url",
"posts"."thumbnail_url",
"posts"."post_content",
"posts"."like_count"
FROM "advertisings"
INNER JOIN "posts" ON "advertisings"."post_id" = "posts"."id"
WHERE "advertisings"."created_at" >= '2020-01-01T00:00:00Z'
AND "advertisings"."created_at" < '2020-12-02T23:59:59Z'
ORDER BY "like_count" DESC LIMIT 20查询计划是:
Limit (cost=0.85..20.13 rows=20 width=552) (actual time=0.026..0.173 rows=20 loops=1)
-> Nested Loop (cost=0.85..951662.55 rows=987279 width=552) (actual time=0.025..0.169 rows=20 loops=1)
-> Index Scan using posts_like_count_idx on posts (cost=0.43..378991.65 rows=1053015 width=504) (actual time=0.013..0.039 rows=20 loops=1)
-> Index Scan using advertisings_post_id_index on advertisings (cost=0.43..0.53 rows=1 width=52) (actual time=0.005..0.006 rows=1 loops=20)
Index Cond: (post_id = posts.id)
Filter: ((created_at >= '2020-01-01 00:00:00'::timestamp without time zone) AND (created_at < '2020-12-02 23:59:59'::timestamp without time zone))
Planning Time: 0.365 ms
Execution Time: 0.199 ms但是,当我缩小筛选器(更改"created_at“>= '2020-11-25T00:00:00Z')返回9个记录(低于限制20)时,查询非常缓慢。
EXPLAIN ANALYZE
SELECT "advertisings"."id",
"advertisings"."page_id",
"advertisings"."page_name",
"advertisings"."created_at",
"posts"."image_url",
"posts"."thumbnail_url",
"posts"."post_content",
"posts"."like_count"
FROM "advertisings"
INNER JOIN "posts" ON "advertisings"."post_id" = "posts"."id"
WHERE "advertisings"."created_at" >= '2020-11-25T00:00:00Z'
AND "advertisings"."created_at" < '2020-12-02T23:59:59Z'
ORDER BY "like_count" DESC LIMIT 20查询计划:
Limit (cost=1000.88..8051.73 rows=20 width=552) (actual time=218.485..4155.336 rows=9 loops=1)
-> Gather Merge (cost=1000.88..612662.09 rows=1735 width=552) (actual time=218.483..4155.328 rows=9 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Nested Loop (cost=0.85..611461.80 rows=723 width=552) (actual time=118.170..3786.176 rows=3 loops=3)
-> Parallel Index Scan using posts_like_count_idx on posts (cost=0.43..372849.07 rows=438756 width=504) (actual time=0.024..1542.094 rows=351005 loops=3)
-> Index Scan using advertisings_post_id_index on advertisings (cost=0.43..0.53 rows=1 width=52) (actual time=0.006..0.006 rows=0 loops=1053015)
Index Cond: (post_id = posts.id)
Filter: ((created_at >= '2020-11-25 00:00:00'::timestamp without time zone) AND (created_at < '2020-12-02 23:59:59'::timestamp without time zone))
Rows Removed by Filter: 1
Planning Time: 0.394 ms
Execution Time: 4155.379 ms我花了几个小时在谷歌上搜索,但找不到正确的解决方案。我们会非常感谢你的帮助。
更新的
当我继续缩小过滤器
WHERE "advertisings"."created_at" >= '2020-11-27T00:00:00Z'
AND "advertisings"."created_at" < '2020-12-02T23:59:59Z'它还将9条记录作为上述查询的慢速返回。然而,这一次,查询的速度确实很快。
Limit (cost=8082.99..8083.04 rows=20 width=552) (actual time=0.062..0.065 rows=9 loops=1)
-> Sort (cost=8082.99..8085.40 rows=962 width=552) (actual time=0.061..0.062 rows=9 loops=1)
Sort Key: posts.like_count DESC
Sort Method: quicksort Memory: 32kB
-> Nested Loop (cost=0.85..8057.39 rows=962 width=552) (actual time=0.019..0.047 rows=9 loops=1)
-> Index Scan using advertisings_created_at_index on advertisings (cost=0.43..501.30 rows=962 width=52) (actual time=0.008..0.012 rows=9 loops=1)
Index Cond: ((created_at >= '2020-11-27 00:00:00'::timestamp without time zone) AND (created_at < '2020-12-02 23:59:59'::timestamp without time zone))
-> Index Scan using posts_pkey on posts (cost=0.43..7.85 rows=1 width=504) (actual time=0.003..0.003 rows=1 loops=9)
Index Cond: (id = advertisings.post_id)
Planning Time: 0.540 ms
Execution Time: 0.096 ms我不知道会发生什么
发布于 2020-12-03 03:41:16
在前两个查询和最后一个查询中,PostgreSQL遵循两种不同的策略:
advertisings行,则使用嵌套的循环联接来按照ORDER BY子句的顺序提取行,并丢弃在找到20.之前与条件不匹配的行。
advertisings行很少,它会获取这几行,然后是posts中的匹配行,然后排序并接受前20行。第二次执行很慢,因为PostgreSQL高估了与条件匹配的advertisings中的行。看看它是如何估计962而不是第三个查询中的9的吗?
解决方案是改进PostgreSQL的估计:
分析广告;
这足以使缓慢的查询速度更快,让PostgreSQL更经常地收集统计数据:
更改表广告集(autovacuum_analyze_scale_factor =0.0 5);
如果这还不够,请尝试收集更详细的统计数据:
设定default_statistics_target =1000,分析广告;
你可以用高达10000的值进行实验。找到起作用的值后,将其持久化:
更改表广告更改created_at集统计信息1000;
https://stackoverflow.com/questions/65118809
复制相似问题