上一篇说明了执行的框架,本篇深入分析执行细节。测试用例不变,还是分析之前的case。
执行计划:
explain select t.name, c.name, stu_num
from course as c, teach_course as tc, teacher as t
where c.no = tc.cno and tc.tno = t.no and c.name = 'Database System' and t.name = 'Jennifer';
QUERY PLAN
-----------------------------------------------------------------------------------
Nested Loop (cost=24.10..74.58 rows=1 width=68)
-> Hash Join (cost=23.95..62.61 rows=61 width=40)
Hash Cond: (tc.tno = t.no)
-> Seq Scan on teach_course tc (cost=0.00..30.40 rows=2040 width=12)
-> Hash (cost=23.88..23.88 rows=6 width=36)
-> Seq Scan on teacher t (cost=0.00..23.88 rows=6 width=36)
Filter: ((name)::text = 'Jennifer'::text)
-> Index Scan using course_pkey on course c (cost=0.15..0.20 rows=1 width=36)
Index Cond: (no = tc.cno)
Filter: ((name)::text = 'Database System'::text)
执行前的数据结构:
测试SQL
select t.name, c.name, stu_num
from course as c, teach_course as tc, teacher as t
where c.no = tc.cno and tc.tno = t.no and c.name = 'Database System' and t.name = 'Jennifer';
调试起点:b ExecutorRun
我们再看下这张图:
nestloop需要从外表(outer表)中(驱动表)顺序扫描拿一条,在从内表(inner表)中找这条能连上的。具体在这个执行计划中:
执行过程
ExecNestLoop
...
outerPlan = outerPlanState(node) // left node
innerPlan = innerPlanState(node) // right node
// loop until we return a qualifying join tuple
for (;;)
// 从outer表的计划上找到一条
outerTupleSlot = ExecProcNode(outerPlan)
// 完了去inner表里面拿一条(要能连得上的一条)
innerTupleSlot = ExecProcNode(innerPlan)
ExecIndexScan
for (;;)
// 【重要】注意这个函数,ExecScanFetch没有具体业务逻辑,这是个框架函数
// ExecScanFetch用accessMtd拿到元组,在用recheckMtd检查元组是否符合要求
// 这里用的是两个通用函数 IndexNext 和 IndexRecheck
slot = ExecScanFetch(node, accessMtd, recheckMtd)
// 进入bt堆栈
IndexNext
ExecHashJoin的逻辑是一个小型switch状态机,通过流转状态走完业务逻辑,读起来比较顺畅。
hashjoin会seqscan扫左表,同时把右表创建成一个哈希表(会带着过滤条件,并不是把所有元组都建到哈希表里面)
执行过程:
ExecHashJoinImpl
ExecHashTableCreate
// 拿一条左表中的数据
ExecHashJoinOuterGetTuple
// T_SeqScanState
slot = ExecProcNode(outerNode)
// 找到能连上的tuple所在的bucket
ExecHashGetBucketAndBatch
ExecHashGetSkewBucket
// 查找bucket
ExecScanHashBucket
// 找到连接元组
seqscan为什么那么费CPU又效率低。 答:无等待无IO连续下面几件事情:
上面5步不涉及IO、没有任何会sleep的逻辑,基本就是连续的函数调用、变量赋值,所以持续做会把单核打满。效率低是应为没条件一行一行全部都要扫一遍,慢是必然的。
SeqNext
scandesc = table_beginscan
table_scan_getnextslot
slot->tts_tableOid = RelationGetRelid(sscan->rs_rd)
heap_getnextslot
heapgettup_pagemode
heapgetpage
BufferGetPage
tuple->t_data = (HeapTupleHeader) PageGetItem((Page) dp, lpp)