# 为什么 EXISTS(NOT EXIST) 与 JOIN(LEFT JOIN) 的性能会比 IN(NOT IN) 好

## IN 改为 JOIN/EXIST

```SELECT
*
FROM
tbl1
WHERE
col3 IN ( SELECT col3 FROM tbl2 )```

```SELECT
*
FROM
tbl1
WHERE
EXISTS ( SELECT 1 FROM tbl2 WHERE tbl1.col3 = tbl2.col3 )```

```SELECT
*
FROM
tbl1
INNER JOIN tbl2 ON tbl1.col3 = tbl2.col3```

## INNER JOIN

```SELECT
tbl1.col1,
tbl2.col2
FROM
tbl1 INNER JOIN tbl2 USING ( col3 )
WHERE
tbl1.col1 IN ( 5, 6 );```

MySQL 执行的伪代码：

```// WHERE tbl1.col1 IN ( 5, 6 ) 筛选出 tb11 符合条件的记录
outer_iter = iterator over tbl1 where col1 IN(5,6)
outer_row = outer_iter.next
while outer_row
// 用 tb11 的 col3 去 tbl2 表中查询，有索引将会非常快
inner_iter = iterator over tbl2 where col3 = outer_row.col3
inner_row = inner_iter.next
// 可能会命中多条数据
while inner_row
output [ outer_row.col1, inner_row.col2 ]
inner_row = inner_iter.next
end
outer_row = outer_iter.next
end```

## LEFT JOIN

```SELECT
tbl1.col1,
tbl2.col2
FROM
tbl1
LEFT OUTER JOIN tbl2 USING ( col3 )
WHERE
tbl1.col1 IN ( 5, 6 );```

MySQL 执行的伪代码：

```// WHERE tbl1.col1 IN ( 5, 6 ) 筛选出 tb11 符合条件的记录
outer_iter = iterator over tbl1 where col1 IN(5,6)
outer_row = outer_iter.next
while outer_row
// 用 tb11 的 col3 去 tbl2 表中查询，有索引将会非常快
inner_iter = iterator over tbl2 where col3 = outer_row.col3
inner_row = inner_iter.next
if inner_row
// 可能会命中多条数据
while inner_row
output [ outer_row.col1, inner_row.col2 ]
inner_row = inner_iter.next
end
else
// 没有命中的则返回 NULL
output [ outer_row.col1, NULL ]
end
outer_row = outer_iter.next
end```

## MySQL Exist 语法执行逻辑

```SELECT DISTINCT
film.film_id
FROM
sakila.film INNER JOIN sakila.film_actor USING ( film_id );```

```SELECT
film_id
FROM
sakila.film
WHERE
EXISTS ( SELECT * FROM sakila.film_actor WHERE film.film_id = film_actor.film_id );```

## MySQL IN 语法的执行逻辑

The reason for supporting row comparisons for `IN` but not for the others is that `IN` is implemented by rewriting it as a sequence of `=`comparisons and `AND` operations. This approach cannot be used for `ALL`, `ANY`, or `SOME`.

In many database servers, IN() is just a synonym for multiple OR clauses, because the two are logically equivalent. Not so in MySQL, which sorts the values in the IN() list and uses a fast binary search to see whether a value is in the list. This is O(log n) in the size of the list, whereas an equivalent series of OR clauses is O(n) in the size of the list (i.e., much slower for large lists).

### 举个栗子

```SELECT
*
FROM
tbl1
WHERE
col3 IN (SELECT col3 FROM tbl2)```

```SELECT
*
FROM
tbl1
WHERE
col3 = '1'
OR col3 = '2'```

`OR` 语句实际上又会被处理成 `UNION ALL` ，所以最后执行的语句类似于：

```SELECT
*
FROM
tbl1
WHERE
col3 = '1'
UNION ALL
SELECT
*
FROM
tbl1
WHERE
col3 = '2'```

### MySQL 可能对IN查询做的优化

```SELECT
*
FROM
sakila.film
WHERE
film_id IN ( SELECT film_id FROM sakila.film_actor WHERE actor_id = 1 );```

```SELECT GROUP_CONCAT(film_id) FROM sakila.film_actor WHERE actor_id = 1;
-- Result: 1,23,25,106,140,166,277,361,438,499,506,509,605,635,749,832,939,970,980

SELECT * FROM sakila.film
WHERE film_id
IN(1,23,25,106,140,166,277,361,438,499,506,509,605,635,749,832,939,970,980);```

```SELECT
*
FROM
sakila.film
WHERE
EXISTS ( SELECT * FROM sakila.film_actor WHERE actor_id = 1 AND film_actor.film_id = film.film_id );```

## NOT IN 改成 NOT EXIST/LEFT JOIN

```SELECT
*
FROM
tbl1
WHERE
col3 NOT IN ( SELECT col3 FROM tbl2 )```

```SELECT
*
FROM
tbl1
WHERE
NOT EXISTS ( SELECT 1 FROM tbl2 WHERE tbl1.col3 = tbl2.col3 )```

```SELECT
*
FROM
tbl1
LEFT JOIN tbl2 ON tbl1.col3 = tbl2.col3
WHERE
tbl2.col3 IS NULL```

### NOT EXISTS 的执行计划

NOT EXISTS 查询：

```EXPLAIN SELECT
film_id,
language_id
FROM
sakila.film
WHERE
NOT EXISTS ( SELECT * FROM sakila.film_actor WHERE film_actor.film_id = film.film_id ) \G```

```*************************** 1. row ***************************
id: 1
230 | Chapter 6: Query Performance Optimization
select_type: PRIMARY
table: film
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 951
Extra: Using where
*************************** 2. row ***************************
id: 2
select_type: DEPENDENT SUBQUERY
table: film_actor
type: ref
possible_keys: idx_fk_film_id
key: idx_fk_film_id
key_len: 2
ref: film.film_id
rows: 2
Extra: Using where; Using index```

### LEFT JOIN 执行计划

LEFT JOIN 查询：

```EXPLAIN SELECT
film.film_id,
film.language_id
FROM
sakila.film
LEFT OUTER JOIN sakila.film_actor USING ( film_id )
WHERE
film_actor.film_id IS NULL \G```

```*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: film
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 951
Extra:
*************************** 2. row ***************************
id: 1
select_type: SIMPLE
table: film_actor
type: ref
possible_keys: idx_fk_film_id
key: idx_fk_film_id
key_len: 2
ref: sakila.film.film_id
rows: 2
Extra: Using where; Using index; Not exists```

2018.9.16 19:50

0 条评论

• ### 1万步21天钉钉运动大神赛

目前所在的企业，从入职开始到现在，内部沟通协作的主要软件也就是钉钉。三月初的时候，偶然在钉钉运动一栏中发现了这个活动，当时也没怎么犹豫，果断支付了2元契约金...

• ### 此处芯安是吾乡 ——Seal ring

版图生涯是一条工程师修炼技术与心境凝练的修行路。俗话说“工欲善其事必先利其器”，练就一套得心应手的手上功夫是必须要刷满的技能点，如同老木匠视如珍宝的工具箱。出征...

• ### 或许，9月12日的苹果新品发布会依然会让人失望！

正如每一次的苹果邀请函都带给我们不一样的感受一样，今年9月12日的苹果发布会的邀请函依然让我们浮想联翩。这或许是苹果以往营销套路的延续——通过给人一个极具想象力...

• ### 恒大入主法拉第，FF91真的要来了吗？

自恒大入主法拉第之后，FF91落地的步伐似乎开始逐步加快。在许家印到法拉第公司考察之后，FF91在落地到中国的步伐更加快速。就在上周，恒大法拉第未来智能汽车集团...

• ### 锤子手机推“无限屏”新机：让我们为老罗的创新点赞

人们对于锤子手机的发布会的期待其实更多的是在看罗永浩的“单口相声”。无论锤子手机如何，观众依然能够在锤子手机发布会上过一回嘴瘾。罗永浩对于自己，对于友商，对于行...

• ### 饿了么并入口碑，一场“非典型”新零售大战的开始

如果你了解互联网巨头的套路，你就会发现他们会不断用资本的力量去拓展自己的势力范围，并且建构一个能够将人们所有的生活场景全部都囊括其中的生态体系。饿了么并入口碑仅...

• ### 共享单车落幕：上海凤凰诉ofo或在情理之中

共享单车浩浩荡荡的发展历程伴随着资本的退却而最终落幕。尽管共享单车市场上摩拜投身美团总算是有了一个好归宿，哈罗单车对于三四线的深度布局实现了逆袭，ofo依然在仗...

• ### 申请高级项目经理

要评计算机系统集成项目经理必须先考软考的系统集成项管理工程师，如果想评计算机系统集成高级项目经理必须考软考的信息系统项目管理师。考试一年2次，每年的5月下旬和1...