I have to run this query on Postgres 9.4.8, and it is very slow, probably because the page_views table contains roughly 200 million records.
Here is the query:
EXPLAIN ANALYZE
SELECT COUNT(*) AS count_all,
date(created_at) AS date_created_at
FROM "page_views"
WHERE "page_views"."feed_id" = 20142
AND (created_at between '2016-07-03 18:18:19.665009' and '2016-08-02 18:18:23.144754')
GROUP BY date(created_at);
The result of EXPLAIN ANALYZE is:
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
HashAggregate (cost=30491.20..30492.43 rows=353 width=8) (actual time=294668.397..294668.403 rows=30 loops=1)
Group Key: date(created_at)
-> Bitmap Heap Scan on page_views (cost=29499.12..30490.70 rows=498 width=8) (actual time=292960.060..294667.729 rows=1011 loops=1)
Recheck Cond: ((feed_id = 20142) AND (created_at >= '2016-07-03 18:18:19.665009'::timestamp without time zone) AND (created_at <= '2016-08-02 18:18:23.144754'::timestamp without time zone))
Heap Blocks: exact=998
-> BitmapAnd (cost=29499.12..29499.12 rows=498 width=0) (actual time=292959.413..292959.413 rows=0 loops=1)
-> Bitmap Index Scan on index_page_views_on_feed_id (cost=0.00..245.12 rows=30003 width=0) (actual time=0.312..0.312 rows=1839 loops=1)
Index Cond: (feed_id = 20142)
-> Bitmap Index Scan on index_page_views_on_created_at (cost=0.00..29253.90 rows=3516895 width=0) (actual time=292956.885..292956.885 rows=26220273 loops=1)
Index Cond: ((created_at >= '2016-07-03 18:18:19.665009'::timestamp without time zone) AND (created_at <= '2016-08-02 18:18:23.144754'::timestamp without time zone))
Planning time: 0.144 ms
Execution time: 294668.450 ms
Here is the table definition:
\d+ page_views;
Table "public.page_views"
Column | Type | Modifiers | Storage | Stats target | Description
------------+-----------------------------+---------------------------------------------------------+----------+--------------+-------------
id | integer | not null default nextval('page_views_id_seq'::regclass) | plain | |
feed_id | integer | | plain | |
url | text | | extended | |
created_at | timestamp without time zone | | plain | |
updated_at | timestamp without time zone | | plain | |
Indexes:
"page_views_pkey" PRIMARY KEY, btree (id)
"index_page_views_on_created_at" btree (created_at DESC)
"index_page_views_on_feed_id" btree (feed_id)
Has OIDs: no
Any suggestions on how to speed this up?
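The first plan spends almost all of its time in the Bitmap Index Scan on created_at, building a bitmap of ~26 million rows only to AND it against the ~1.8k rows matching the feed_id. One common remedy (a sketch, not tested against this table; the index name below is made up) is a composite index that lets Postgres resolve both predicates in a single index scan:

```sql
-- Hypothetical composite index: feed_id first (the selective column),
-- created_at second, so the range condition can be applied within the
-- index entries for that one feed. CONCURRENTLY avoids locking out
-- writes while the index builds (supported on 9.4).
CREATE INDEX CONCURRENTLY index_page_views_on_feed_id_and_created_at
    ON page_views (feed_id, created_at);
```

With this index in place, the planner can typically replace the BitmapAnd of two single-column indexes with one narrow index scan.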
I ran VACUUM ANALYZE page_views;, which took a long time, and then ran EXPLAIN ANALYZE again:
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------
HashAggregate (cost=44520.59..44532.91 rows=3520 width=8) (actual time=14315.883..14315.893 rows=30 loops=1)
Group Key: date(created_at)
-> Index Scan using index_page_views_on_feed_id on page_views (cost=0.11..44517.07 rows=3520 width=8) (actual time=160.623..14314.562 rows=1011 loops=1)
Index Cond: (feed_id = 20142)
Filter: ((created_at >= '2016-07-03 18:18:19.665009'::timestamp without time zone) AND (created_at <= '2016-08-02 18:18:23.144754'::timestamp without time zone))
Rows Removed by Filter: 832
Planning time: 0.609 ms
Execution time: 14315.981 ms
(8 rows)
Posted on 2016-08-03 20:38:49
It looks like the feed_id condition is far more selective, returning only ~2k records, and most of the time is spent building/filtering on the time range.
If that is typically the case, consider extracting the feed_id lookup into a CTE and filtering on time only over the returned tuples -- it then only has to scan ~2k records and can avoid the created_at index entirely.
Regards, Jony
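The rewrite suggested above could look like the following sketch. On 9.4 a CTE is an optimization fence, so Postgres materializes the feed's rows first (via index_page_views_on_feed_id) and applies the timestamp filter only to that small set, never touching the created_at index:

```sql
-- Sketch of the CTE approach: isolate the selective feed_id lookup,
-- then filter and group the (small) materialized result by date.
WITH feed_rows AS (
    SELECT created_at
    FROM page_views
    WHERE feed_id = 20142
)
SELECT count(*)          AS count_all,
       date(created_at)  AS date_created_at
FROM feed_rows
WHERE created_at BETWEEN '2016-07-03 18:18:19.665009'
                     AND '2016-08-02 18:18:23.144754'
GROUP BY date(created_at);
```

Note that this relies on 9.4's fence behavior; on newer versions (12+) inlining would have to be blocked with MATERIALIZED to get the same effect.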
https://dba.stackexchange.com/questions/145790