相关 《Postgresql源码(61)查询执行——最外层Portal模块》 《Postgresql源码(62)查询执行——子模块ProcessUtility》 《Postgresql源码(63)查询执行——子模块Executor(1)》
因为时间关系Executor分析拆成几篇,这是第一篇。
PG中的SQL在经过语法解析、查询编译后,进入执行模块,整形模块的分三个子模块:
入口:portal子模块(下图蓝色)
处理DML的Executor子模块(下图绿色)
处理DDL的ProcessUtility子模块(下图橙色)
SQL会在查询编译阶段得到plantree_list,在portal模块启动时(函数PortalStart),根据plantree_list中具体情况(函数ChoosePortalStrategy),来决定PortalStrategy的值,后面执行根据PortalStrategy来决定进入Executor还是ProcessUtility。
本篇重点分析Executor子模块。
测试SQL
drop table course;drop table teacher;drop table teach_course;
create table course(no serial, name varchar, credit int,primary key(no));
insert into course(name, credit) values('Natural', 50);
insert into course(name, credit) values('Math', 30);
insert into course(name, credit) values('Database System', 20);
create table teacher(no serial, name varchar, sex char(1), age int);
insert into teacher(name, sex, age) values('Jack', 'm', 33);
insert into teacher(name, sex, age) values('Jennifer', 'f', 30);
create table teach_course(tno int, cno int, stu_num int);
insert into teach_course(tno, cno, stu_num) values(1, 2, 60);
insert into teach_course(tno, cno, stu_num) values(2, 3, 50);
select t.name, c.name, stu_num
from course as c, teach_course as tc, teacher as t
where c.no = tc.cno and tc.tno = t.no and c.name = 'Database System' and t.name = 'Jennifer';
执行计划
explain select t.name, c.name, stu_num
from course as c, teach_course as tc, teacher as t
where c.no = tc.cno and tc.tno = t.no and c.name = 'Database System' and t.name = 'Jennifer';
QUERY PLAN
-----------------------------------------------------------------------------------
Nested Loop (cost=24.10..74.58 rows=1 width=68)
-> Hash Join (cost=23.95..62.61 rows=61 width=40)
Hash Cond: (tc.tno = t.no)
-> Seq Scan on teach_course tc (cost=0.00..30.40 rows=2040 width=12)
-> Hash (cost=23.88..23.88 rows=6 width=36)
-> Seq Scan on teacher t (cost=0.00..23.88 rows=6 width=36)
Filter: ((name)::text = 'Jennifer'::text)
-> Index Scan using course_pkey on course c (cost=0.15..0.20 rows=1 width=36)
Index Cond: (no = tc.cno)
Filter: ((name)::text = 'Database System'::text)
Executor执行由完成Portal模块拉动,核心函数:
需要的全部数据全部封装在QueryDesc中,包括计划树。
我们先看下执行计划和下图中的node树:
QUERY PLAN
-----------------------------------------------------------------------------------
Nested Loop (cost=24.10..74.58 rows=1 width=68)
-> Hash Join (cost=23.95..62.61 rows=61 width=40)
Hash Cond: (tc.tno = t.no)
-> Seq Scan on teach_course tc (cost=0.00..30.40 rows=2040 width=12)
-> Hash (cost=23.88..23.88 rows=6 width=36)
-> Seq Scan on teacher t (cost=0.00..23.88 rows=6 width=36)
Filter: ((name)::text = 'Jennifer'::text)
-> Index Scan using course_pkey on course c (cost=0.15..0.20 rows=1 width=36)
Index Cond: (no = tc.cno)
Filter: ((name)::text = 'Database System'::text)
我们发现PG执行计划每个节点都是由两个子节点返回数据的(实际上计划树的每个node都是0-2进1出的结构)。在拿到一条执行计划后,直观上可以理解为上层节点(Nested Loop)首先执行,执行是通过自己的两个子节点(Hash Join、Index Scan)拿到数据,子节点又通过自己的子节点拿到数据。这样层层驱动整个计划树的运行。
QUERY PLAN
-----------------------------------------------------------------------------------
Nested Loop (cost=24.10..74.58 rows=1 width=68)
-> Hash Join (cost=23.95..62.61 rows=61 width=40)
Hash Cond: (tc.tno = t.no)
-> Seq Scan on teach_course tc (cost=0.00..30.40 rows=2040 width=12)
-> Hash (cost=23.88..23.88 rows=6 width=36)
-> Seq Scan on teacher t (cost=0.00..23.88 rows=6 width=36)
Filter: ((name)::text = 'Jennifer'::text)
-> Index Scan using course_pkey on course c (cost=0.15..0.20 rows=1 width=36)
Index Cond: (no = tc.cno)
Filter: ((name)::text = 'Database System'::text)