前两天遇到一个问题,是关于一个SQL的执行计划的,大概是这么个SQL:
select C from table where A=1 and B>date_a and B<date_b group by C
对应的表结构是:
create table tbl_name (
A XXX,
B XXX,
C XXX,
index(A,B,C)
);
从表结构中可以看出来,这个表是使用了ABC为顺序的联合索引,然后查看这个表的执行计划,如下(为了方便说明,我们设计了一个简易的表来替代):
mysql 22:28:02>>create table tbl_name(a int,b int,c int,index `idx_abc`(a,b,c));
Query OK, 0 rows affected (0.07 sec)
mysql 22:28:52>>insert into tbl_name values (1,1,1),(2,2,2),(3,3,3);
Query OK, 3 rows affected (0.00 sec)
Records: 3 Duplicates: 0 Warnings: 0
mysql 22:29:18>>insert into tbl_name values (1,1,6),(2,2,5),(3,3,4);
Query OK, 3 rows affected (0.00 sec)
Records: 3 Duplicates: 0 Warnings: 0
mysql 22:29:40>>insert into tbl_name select * from tbl_name;
Query OK, 6 rows affected (0.00 sec)
Records: 6 Duplicates: 0 Warnings: 0
mysql 22:29:55>>insert into tbl_name select * from tbl_name;
Query OK, 12 rows affected (0.01 sec)
Records: 12 Duplicates: 0 Warnings: 0
mysql 22:29:57>>explain select c from tbl_name where a=1 and b>=2 and b<=3 group by c;
+----+-------------+----------+------------+-------+---------------+---------+---------+------+------+----------+-----------------------------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+----------+------------+-------+---------------+---------+---------+------+------+----------+-----------------------------------------------------------+
| 1 | SIMPLE | tbl_name | NULL | range | idx_abc | idx_abc | 10 | NULL | 1 | 100.00 | Using where; Using index; Using temporary; Using filesort |
+----+-------------+----------+------------+-------+---------------+---------+---------+------+------+----------+-----------------------------------------------------------+
1 row in set, 1 warning (0.00 sec)
从表中不难看出来执行计划中反应出Using temporary和Using filesort这两个额外的信息,这里还是有一些疑问的,第一,命名没有进行order by 的操作,为什么额外的信息中会有filesort?其次,这个临时表又是干什么用的?
要回答这个问题,我们先从官方文档中查看一下信息:
In some cases, the server creates internal temporary tables while processing statements. Users have no direct control over when this occurs.
ORDER BY
clause and a different GROUP BY
clause, or for which the ORDER BY
or GROUP BY
contains columns from tables other than the first table in the join queue.DISTINCT
combined with ORDER BY
may require a temporary table.这里的意思是说,在某些情况下,服务器会创建临时表来处理语句,这个处理过程用户没有办法干预。然后列举了几种情况,其中的两条是:如果语句包含order by 或者包含group by语句,则可能用到临时表,或者order by和group by语句在关联查询的后面的表里出现,则可能会用到临时表;如果distinct和order by同时出现的时候,可能会用到临时表。
通俗的来讲,在许多查询的执行过程中,MySQL可能会借助临时表来完成一些功能,比如去重、排序之类的,比如我们在执行许多包含DISTINCT、GROUP BY、UNION等子句的查询过程中,如果不能有效利用索引来完成查询,MySQL很有可能寻求通过建立内部的临时表来执行查询。如果查询中使用到了内部的临时表,在执行计划的Extra列将会显示Using temporary提示。
到这里就能解释为什么会用到临时表了。这里我们注意到,查询计划中还有一个using filesort的关键字,我们的SQL看起来并没有执行order by的语句,为什么会出现filesort的语句呢?其实这个问题的本质还是由于mysql帮我们做了优化,默认按照分组的字段进行排序,如果我们不想要这个排序的功能,可以使用null来显示的禁止掉这个功能,如下:
mysql 22:29:57>>explain select c from tbl_name where a=1 and b>=2 and b<=3 group by c order by null;
+----+-------------+----------+------------+-------+---------------+---------+---------+------+------+----------+-------------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+----------+------------+-------+---------------+---------+---------+------+------+----------+-------------------------------------------+
| 1 | SIMPLE | tbl_name | NULL | range | idx_abc | idx_abc | 10 | NULL | 1 | 100.00 | Using where; Using index; Using temporary |
+----+-------------+----------+------------+-------+---------------+---------+---------+------+------+----------+-------------------------------------------+
1 row in set, 1 warning (0.00 sec)
如果我们使用了order by null的关键字,那么explain的结果中就没有using filesort了,就是说不主动进行文件排序了,如果你访问的结果集比较大,这其实是一种优化的方式。