MySQL去重：DISTINCT vs GROUP BY，谁才是效率之王？

俊才

发布于 2026-02-04 14:35:46

340

在日常数据库查询中，去除重复数据是最常见的需求之一。面对这种需求，很多开发者习惯随手用 DISTINCT，也有人觉得 GROUP BY 更 “万能”，但很少有人深究：这两者底层原理有何不同？在不同场景下谁的效率更高？今天我们就从原理、效率、使用场景三个维度，彻底讲清楚这对 “去重兄弟” 的区别。

一、一个实际问题的起源

今天，团队的新成员小陈遇到了一个难题：他需要从用户表中统计不重复的城市列表，但查询速度极其缓慢。他尝试了两种写法：

SELECT DISTINCT city FROM users;

以及

SELECT city FROM users GROUP BY city;

但不确定哪种写法更好，以及如何优化。他说看到网上有人说禁止使用distinct,但是他实际测的时候发现其实distinct有的情况下效率比group by又要好，所以更加不确定该怎么用了。对于今天这个经典问题，促使我们重新进行一番审视。

二、 DISTINCT与GROUP BY的本质差异

DISTINCT是SQL专门为去重设计的关键字，它的核心使命很简单——去除重复行，返回唯一值。

-- 单列去重
SELECT DISTINCT department FROM employees;
-- 多列去重（所有列组合相同时才去重）
SELECT DISTINCT department, job_title FROM employees;

GROUP BY的设计初衷是对数据进行分组，以便进行聚合计算（如COUNT、SUM、AVG等）。

-- 计算每个部门的平均工资
SELECT department, AVG(salary) 
FROM employees GROUP BY department;

虽然GROUP BY也可以实现去重效果，但这其实是它分组行为的"副产品"。

三、性能对决

说到大家最关心的性能问题，其实和是否有索引索引有很大的关系。

当去重列有索引时，DISTINCT和GROUP BY都可以利用索引进行高效扫描，性能差异微乎其微。两者甚至可能使用相同的执行计划。

mysql> explain select  distinct registration_time from customers2;
+----+-------------+------------+------------+-------+---------------------------------------+-----------------------+---------+------+-------+----------+-------------+
| id | select_type | table      | partitions | type  | possible_keys                         | key                   | key_len | ref  | rows  | filtered | Extra       |
+----+-------------+------------+------------+-------+---------------------------------------+-----------------------+---------+------+-------+----------+-------------+
|  1 | SIMPLE      | customers2 | NULL       | index | idx_registration_time,idx_vip_regtime | idx_registration_time | 4       | NULL | 53014 |   100.00 | Using index |
+----+-------------+------------+------------+-------+---------------------------------------+-----------------------+---------+------+-------+----------+-------------+
1 row in set, 1 warning (0.00 sec)

mysql> explain select   registration_time from customers2 group by registration_time;
+----+-------------+------------+------------+-------+---------------------------------------+-----------------------+---------+------+-------+----------+-------------+
| id | select_type | table      | partitions | type  | possible_keys                         | key                   | key_len | ref  | rows  | filtered | Extra       |
+----+-------------+------------+------------+-------+---------------------------------------+-----------------------+---------+------+-------+----------+-------------+
|  1 | SIMPLE      | customers2 | NULL       | index | idx_registration_time,idx_vip_regtime | idx_registration_time | 4       | NULL | 53014 |   100.00 | Using index |
+----+-------------+------------+------------+-------+---------------------------------------+-----------------------+---------+------+-------+----------+-------------+
1 row in set, 1 warning (0.00 sec)

真正的性能分水岭出现在无索引的情况下：

在MySQL 8.0之前，GROUP BY会进行隐式排序，可能触发filesort，导致性能下降；而DISTINCT不需要这种排序，因此在无索引时性能更优。

mysql> explain select distinct c1 from  tb2  ;
+----+-------------+-------+------------+------+---------------+------+---------+------+----------+----------+-----------------+
| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref  | rows     | filtered | Extra           |
+----+-------------+-------+------------+------+---------------+------+---------+------+----------+----------+-----------------+
|  1 | SIMPLE      | tb2   | NULL       | ALL  | NULL          | NULL | NULL    | NULL | 15538816 |   100.00 | Using temporary |
+----+-------------+-------+------------+------+---------------+------+---------+------+----------+----------+-----------------+
1 row in set, 1 warning (0.00 sec)

mysql> explain select  c1 from  tb2 group by c1 ;
+----+-------------+-------+------------+------+---------------+------+---------+------+----------+----------+---------------------------------+
| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref  | rows     | filtered | Extra                           |
+----+-------------+-------+------------+------+---------------+------+---------+------+----------+----------+---------------------------------+
|  1 | SIMPLE      | tb2   | NULL       | ALL  | NULL          | NULL | NULL    | NULL | 15538816 |   100.00 | Using temporary; Using filesort |
+----+-------------+-------+------------+------+---------------+------+---------+------+----------+----------+---------------------------------+
1 row in set, 1 warning (0.00 sec)

不过，如果查询结果无需排序，则可以显示写上ORDER BY NULL来取消group by结果的排序，例如：

mysql> explain select  c1 from  tb2 group by c1 ;
+----+-------------+-------+------------+------+---------------+------+---------+------+----------+----------+---------------------------------+
| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref  | rows     | filtered | Extra                           |
+----+-------------+-------+------------+------+---------------+------+---------+------+----------+----------+---------------------------------+
|  1 | SIMPLE      | tb2   | NULL       | ALL  | NULL          | NULL | NULL    | NULL | 15538816 |   100.00 | Using temporary; Using filesort |
+----+-------------+-------+------------+------+---------------+------+---------+------+----------+----------+---------------------------------+
1 row in set, 1 warning (0.00 sec)

mysql> explain select  c1 from  tb2 group by c1  ORDER BY NULL;
+----+-------------+-------+------------+------+---------------+------+---------+------+----------+----------+-----------------+
| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref  | rows     | filtered | Extra           |
+----+-------------+-------+------------+------+---------------+------+---------+------+----------+----------+-----------------+
|  1 | SIMPLE      | tb2   | NULL       | ALL  | NULL          | NULL | NULL    | NULL | 15538816 |   100.00 | Using temporary |
+----+-------------+-------+------------+------+---------------+------+---------+------+----------+----------+-----------------+
1 row in set, 1 warning (0.00 sec)

但这里有个重要更新：从MySQL 8.0开始，GROUP BY的隐式排序已被移除，两者在无索引情况下的性能差异大大缩小。


mysql> explain select   age from customers2 group by age;
+----+-------------+------------+------------+------+---------------+------+---------+------+-------+----------+-----------------+
| id | select_type | table      | partitions | type | possible_keys | key  | key_len | ref  | rows  | filtered | Extra           |
+----+-------------+------------+------------+------+---------------+------+---------+------+-------+----------+-----------------+
|  1 | SIMPLE      | customers2 | NULL       | ALL  | NULL          | NULL | NULL    | NULL | 53014 |   100.00 | Using temporary |
+----+-------------+------------+------------+------+---------------+------+---------+------+-------+----------+-----------------+
1 row in set, 1 warning (0.00 sec)

mysql> explain select  distinct age from customers2;
+----+-------------+------------+------------+------+---------------+------+---------+------+-------+----------+-----------------+
| id | select_type | table      | partitions | type | possible_keys | key  | key_len | ref  | rows  | filtered | Extra           |
+----+-------------+------------+------------+------+---------------+------+---------+------+-------+----------+-----------------+
|  1 | SIMPLE      | customers2 | NULL       | ALL  | NULL          | NULL | NULL    | NULL | 53014 |   100.00 | Using temporary |
+----+-------------+------------+------------+------+---------------+------+---------+------+-------+----------+-----------------+
1 row in set, 1 warning (0.00 sec)

四、总结

下面我们通过一张表来总结一下搞定逻辑

场景	推荐用法	核心原因
纯单字段、多字段去重（无聚合）	DISTINCT	逻辑更简洁，无索引时效率略高
去重 + 聚合（COUNT/SUM/AVG 等）	GROUP BY	唯一能实现聚合的方式
去重 + 过滤分组结果	GROUP BY + HAVING	DISTINCT 无分组过滤能力
有索引的任意去重场景	两者皆可	索引消除了效率差异，按可读性选择

最后记住核心原则：DISTINCT 是去重的 “专用工具”，GROUP BY 是分组聚合的 “全能工具” —— 按需选择，而非盲目依赖某一个。

在实际开发中，不用纠结 “谁绝对更好”，而是先明确业务需求（是否聚合、是否过滤），再结合索引情况选择；写完 SQL 后，养成看执行计划（EXPLAIN）的习惯，通过 Extra 字段（Using temporary/Using filesort）判断是否需要优化，这才是提升 MySQL 性能的核心思路。

关注微信公众号「数据库干货铺」，获取更多数据库运维干货。

本文参与腾讯云自媒体同步曝光计划，分享自微信公众号。

原始发表：2026-01-29，如有侵权请联系 cloudcommunity@tencent.com 删除

mysql