有奖捉虫:云通信与企业服务文档专题,速来> HOT
TDSQL-C MySQL 版支持查看并行查询的执行计划,以及查看线程中哪些线程在执行并行查询计划。您可清晰了解到并行查询是如何在数据库中稳定生效,也可在并行查询执行过程中遇到问题时,帮助快速定位问题。 本文为您介绍查看并行查询的两种常用方法。

方法一:使用 EXPLAIN 语句

创建表示例:
CREATE TABLE lineitem (
L_ORDERKEY INTEGER NOT NULL,
L_PARTKEY INTEGER NOI NOLL,
L_SUPPKEY INTEGER NOT NULL,
L_LINENUMBER INTEGER NOT NULL,
L QUANTITY DECIMAL(15, 2) NOT NULL,
L_EXTENDEDPRICE DECIMAL (15, 2) NOT NULL,
L_DISCOUNT DECIMAL(15, 2) NOT NULL,
L_TAX DECIMAL(15, 2) NOT NULL,
L_RETURNFLAG CHAR(1) NOT NULL,
L_LINESTATUS CHAR(1) NOT NULL,
L_SHIPDATE DATE NOT NOLLY,
L_COMMITDATE DATE NOT NULL,
L RECEIPTDATE DATE NOT NULL,
L SHIPINSTRUCT CHAR(25) NOT NULL,
L_SHIPMODE CHAR(10) NOT NULL,
L_COMMENT VARCHAR(44) NOT NULL
);
插入的数据集来自 TPC-H。
示例 SQL 语句:
SELECT l_returnflag, l_linestatus, sum(l_quantity) as sum_qty
FROM lineitem
WHERE l_shipdate <= '1998-09-02'
GROUP BY l_returnflag, l_linestatus
ORDER BY l_returnflag, l_linestatus;
本示例为 TPC-H Q1 的简化形式,是典型的报表运算。
执行计划打印语句(EXPLAIN):
EXPLAIN SELECT l_returnflag, l_linestatus, sum(l_quantity) as sum_qty
FROM lineitem
WHERE l_shipdate <= '1998-09-02'
GROUP BY l_returnflag, l_linestatus
ORDER BY l_returnflag, l_linestatus;
查询结果:
MySQL [tpch100g]> explain SELECT l_returnflag, l_linestatus, sum(l_quantity) as sum_qty FROM lineitem WHERE l_shipdate <= '1998-09-02' GROUP BY l_returnflag, l_linestatus ORDER BY l_returnflag, l_linestatus;
+----+-------------+-------------+------------+------+---------------+------+---------+------+-----------+----------+-----------------------------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------------+------------+------+---------------+------+---------+------+-----------+----------+-----------------------------------------------------------+
| 1 | SIMPLE | lineitem | NULL | ALL | i_l_shipdate | NULL | NULL | NULL | 593184480 | 50.00 | Parallel scan (4 workers); Using where; Using temporary |
| 1 | SIMPLE | <sender1> | NULL | ALL | NULL | NULL | NULL | NULL | 0 | 0.00 | Send to (<receiver1>) |
| 1 | SIMPLE | <receiver1> | NULL | ALL | NULL | NULL | NULL | NULL | 0 | 0.00 | Receive from (<sender1>); Using temporary; Using filesort |
+----+-------------+-------------+------------+------+---------------+------+---------+------+-----------+----------+-----------------------------------------------------------+
3 rows in set, 1 warning (0.00 sec)
树状执行计划打印语句(EXPLAIN format=tree):
EXPLAIN format=tree query SELECT l_returnflag, l_linestatus, sum(l_quantity) as sum_qty
FROM lineitem
WHERE l_shipdate <= '1998-09-02'
GROUP BY l_returnflag, l_linestatus
ORDER BY l_returnflag, l_linestatus;
查询结果:
MySQL [tpch100g]> explain format=tree SELECT l_returnflag, l_linestatus, sum(l_quantity) as sum_qty FROM lineitem WHERE l_shipdate <= '1998-09-02' GROUP BY l_returnflag, l_linestatus ORDER BY l_returnflag, l_linestatus\\G
*************************** 1. row ***************************
EXPLAIN: -> Sort: lineitem.L_RETURNFLAG, lineitem.L_LINESTATUS
-> Table scan on <temporary>
-> Final Aggregate using temporary table
-> PX Receiver (slice: 0; workers: 1)
-> PX Sender (slice: 1; workers: 4)
-> Table scan on <temporary>
-> Aggregate using temporary table
-> Filter: (lineitem.L_SHIPDATE <= DATE'1998-09-02') (cost=65449341.10 rows=296592240)
-> Parallel table scan on lineitem (cost=65449341.10 rows=593184480)

1 row in set (0.00 sec)
由上述结果可以看出:
并行查询计划将语句分布给了4个工作线程进行运算。
将聚合运算拆分为了上下段,用户线程和并行线程分别执行。
对 lineitem 表采用了并行扫描算子。
实例中树状执行计划打印(EXPLAIN format=tree query)相较于传统执行计划打印(EXPLAIN)效果更好。

方法二:线程列表查看

show processlist 命令的输出结果显示了有哪些线程在运行,不仅可以查看当前所有的连接数,还可以查看当前的连接状态帮助识别出有问题的查询语句等。 基于 show processlist 命令,TDSQL-C MySQL 版自研了 show parallel processlist 语句,帮助您过滤线程中非并行查询的线程,使用该命令行后,将只展示与并行查询有关的线程。 示例 SQL 语句:
SELECT l_returnflag, l_linestatus, sum(l_quantity) as sum_qty
FROM lineitem
WHERE l_shipdate <= '1998-09-02'
GROUP BY l_returnflag, l_linestatus
ORDER BY l_returnflag, l_linestatus;
本示例为 TPC-H Q1 的简化形式,是典型的报表运算。 show processlist 查询结果:
mysql> show processlist;
+--------+-------------+-----------------+-----------+---------+-------+------------+------------------------------------------------------------------------------------------------------+
| Id | User | Host | db | Command | Time | State | Info |
+--------+-------------+-----------------+-----------+---------+-------+------------+------------------------------------------------------------------------------------------------------+
| 7 | tencentroot | 127.0.0.1:49238 | NULL | Sleep | 0 | | NULL |
| 11 | tencentroot | 127.0.0.1:49262 | NULL | Sleep | 0 | | NULL |
| 13 | tencentroot | 127.0.0.1:49288 | NULL | Sleep | 1 | | NULL |
| 237062 | tencentroot | localhost | tpch100g | Query | 24 | Scheduling | SELECT l_returnflag, l_linestatus, sum(l_quantity) as sum_qty FROM lineitem WHERE l_shipdate <= '199 |
| 237107 | tencentroot | localhost | NULL | Query | 0 | init | show processlist |
+--------+-------------+-----------------+-----------+---------+-------+------------+------------------------------------------------------------------------------------------------------+
6 rows in set (0.00 sec)
show parallel processlist 查询结果:
mysql> show parallel processlist;
+--------+-------------+-----------+----------+---------+------+-------------+------------------------------------------------------------------------------------------------------+
| Id | User | Host | db | Command | Time | State | Info |
+--------+-------------+-----------+----------+---------+------+-------------+------------------------------------------------------------------------------------------------------+
| 237062 | tencentroot | localhost | tpch100g | Query | 18 | Scheduling | SELECT l_returnflag, l_linestatus, sum(l_quantity) as sum_qty FROM lineitem WHERE l_shipdate <= '199 |
| 237110 | | | | Task | 18 | Task runing | connection 237062, worker 0, task 1 |
| 237111 | | | | Task | 18 | Task runing | connection 237062, worker 1, task 1 |
| 237112 | | | | Task | 18 | Task runing | connection 237062, worker 2, task 1 |
| 237113 | | | | Task | 18 | Task runing | connection 237062, worker 3, task 1 |
+--------+-------------+-----------+----------+---------+------+-------------+------------------------------------------------------------------------------------------------------+
5 rows in set (0.00 sec)
由上述结果可以看出:
上述查询由并行计划分布给四个 work 线程进行执行:user 仅有一行有显示,表明 ID 237062 为用户线程,将 SQL 语句执行计划下推至下面四个 work 线程中进行,通过 info 列可看到,这四个工作线程均在执行 task1。
每个线程均可查询出来,精准进行定位。
show parallel processlist 相较于 show processlist 可以精准查询到所有进行并行查询的线程,不被其余线程影响。

相关文档