云数据库 MySQL 支持查看并行查询的执行计划,以及查看线程中哪些线程在执行并行查询计划。您可清晰了解到并行查询是如何在数据库中稳定生效,也可在并行查询执行过程中遇到问题时,帮助快速定位问题。
本文为您介绍查看并行查询的两种常用方法。
方法一:使用 EXPLAIN 语句
示例 SQL 语句:
SELECT l_returnflag, l_linestatus, sum(l_quantity) as sum_qtyFROM lineitemWHERE l_shipdate <= '1998-09-02'GROUP BY l_returnflag, l_linestatusORDER BY l_returnflag, l_linestatus;
本示例为 TPC-H Q1 的简化形式,是典型的报表运算。
执行计划打印语句(EXPLAIN):
EXPLAIN SELECT l_returnflag, l_linestatus, sum(l_quantity) as sum_qtyFROM lineitemWHERE l_shipdate <= '1998-09-02'GROUP BY l_returnflag, l_linestatusORDER BY l_returnflag, l_linestatus;
查询结果:
MySQL [tpch100g]> explain SELECT l_returnflag, l_linestatus, sum(l_quantity) as sum_qty FROM lineitem WHERE l_shipdate <= '1998-09-02' GROUP BY l_returnflag, l_linestatus ORDER BY l_returnflag, l_linestatus;+----+-------------+-------------+------------+------+---------------+------+---------+------+-----------+----------+-----------------------------------------------------------+| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |+----+-------------+-------------+------------+------+---------------+------+---------+------+-----------+----------+-----------------------------------------------------------+| 1 | SIMPLE | lineitem | NULL | ALL | i_l_shipdate | NULL | NULL | NULL | 593184480 | 50.00 | Parallel scan (4 workers); Using where; Using temporary || 1 | SIMPLE | <sender1> | NULL | ALL | NULL | NULL | NULL | NULL | 0 | 0.00 | Send to (<receiver1>) || 1 | SIMPLE | <receiver1> | NULL | ALL | NULL | NULL | NULL | NULL | 0 | 0.00 | Receive from (<sender1>); Using temporary; Using filesort |+----+-------------+-------------+------------+------+---------------+------+---------+------+-----------+----------+-----------------------------------------------------------+3 rows in set, 1 warning (0.00 sec)
树状执行计划打印语句(EXPLAIN format=tree):
EXPLAIN format=tree query SELECT l_returnflag, l_linestatus, sum(l_quantity) as sum_qtyFROM lineitemWHERE l_shipdate <= '1998-09-02'GROUP BY l_returnflag, l_linestatusORDER BY l_returnflag, l_linestatus;
查询结果:
MySQL [tpch100g]> explain format=tree SELECT l_returnflag, l_linestatus, sum(l_quantity) as sum_qty FROM lineitem WHERE l_shipdate <= '1998-09-02' GROUP BY l_returnflag, l_linestatus ORDER BY l_returnflag, l_linestatus\\G*************************** 1. row ***************************EXPLAIN: -> Sort: lineitem.L_RETURNFLAG, lineitem.L_LINESTATUS-> Table scan on <temporary>-> Final Aggregate using temporary table-> PX Receiver (slice: 0; workers: 1)-> PX Sender (slice: 1; workers: 4)-> Table scan on <temporary>-> Aggregate using temporary table-> Filter: (lineitem.L_SHIPDATE <= DATE'1998-09-02') (cost=65449341.10 rows=296592240)-> Parallel table scan on lineitem (cost=65449341.10 rows=593184480)1 row in set (0.00 sec)
由上述结果可以看出:
并行查询计划将语句分布给了4个工作线程进行运算。
将聚合运算拆分为了上下段,用户线程和并行线程分别执行。
对 lineitem 表采用了并行扫描算子。
实例中树状执行计划打印(EXPLAIN format=tree query)相较于传统执行计划打印(EXPLAIN)效果更好。
方法二:线程列表查看
show processlist 命令的输出结果显示了有哪些线程在运行,不仅可以查看当前所有的连接数,还可以查看当前的连接状态帮助识别出有问题的查询语句等。
基于 show processlist 命令,云数据库 MySQL 自研了 show parallel processlist 语句,帮助您过滤线程中非并行查询的线程,使用该命令行后,将只展示与并行查询有关的线程。
示例 SQL 语句:
SELECT l_returnflag, l_linestatus, sum(l_quantity) as sum_qtyFROM lineitemWHERE l_shipdate <= '1998-09-02'GROUP BY l_returnflag, l_linestatusORDER BY l_returnflag, l_linestatus;
本示例为 TPC-H Q1 的简化形式,是典型的报表运算。
show processlist 查询结果:
mysql> show processlist;+--------+-------------+-----------------+-----------+---------+-------+------------+------------------------------------------------------------------------------------------------------+| Id | User | Host | db | Command | Time | State | Info |+--------+-------------+-----------------+-----------+---------+-------+------------+------------------------------------------------------------------------------------------------------+| 7 | tencentroot | 127.0.0.1:49238 | NULL | Sleep | 0 | | NULL || 11 | tencentroot | 127.0.0.1:49262 | NULL | Sleep | 0 | | NULL || 13 | tencentroot | 127.0.0.1:49288 | NULL | Sleep | 1 | | NULL || 237062 | tencentroot | localhost | tpch100g | Query | 24 | Scheduling | SELECT l_returnflag, l_linestatus, sum(l_quantity) as sum_qty FROM lineitem WHERE l_shipdate <= '199 || 237107 | tencentroot | localhost | NULL | Query | 0 | init | show processlist |+--------+-------------+-----------------+-----------+---------+-------+------------+------------------------------------------------------------------------------------------------------+6 rows in set (0.00 sec)
show parallel processlist 查询结果:
mysql> show parallel processlist;+--------+-------------+-----------+----------+---------+------+-------------+------------------------------------------------------------------------------------------------------+| Id | User | Host | db | Command | Time | State | Info |+--------+-------------+-----------+----------+---------+------+-------------+------------------------------------------------------------------------------------------------------+| 237062 | tencentroot | localhost | tpch100g | Query | 18 | Scheduling | SELECT l_returnflag, l_linestatus, sum(l_quantity) as sum_qty FROM lineitem WHERE l_shipdate <= '199 || 237110 | | | | Task | 18 | Task runing | connection 237062, worker 0, task 1 || 237111 | | | | Task | 18 | Task runing | connection 237062, worker 1, task 1 || 237112 | | | | Task | 18 | Task runing | connection 237062, worker 2, task 1 || 237113 | | | | Task | 18 | Task runing | connection 237062, worker 3, task 1 |+--------+-------------+-----------+----------+---------+------+-------------+------------------------------------------------------------------------------------------------------+5 rows in set (0.00 sec)
由上述结果可以看出:
上述查询由并行计划分布给四个 work 线程进行执行:user 仅有一行有显示,表明 ID 237062为用户线程,将 SQL 语句执行计划下推至下面四个 work 线程中进行,通过 info 列可看到,这四个工作线程均在执行 task1。
每个线程均可查询出来,精准进行定位。
show parallel processlist 相较于 show processlist 可以精准查询到所有进行并行查询的线程,不被其余线程影响。