TPC-H 测试

最近更新时间:2025-09-04 20:42:52

我的收藏

测试概述

提供 TDStore 列存引擎的 TPC-H 测试报告,作为 TDStore 引擎不同版本的性能对比基准。
说明:
TDStore 列存引擎功能在当前版本为实验特性,仅面向受邀用户开放。若您有需要可通过客户经理与我们联系并申请。
TPC-H 是由国际组织 TPC(事务处理效能委员会)制定的决策支持系统基准测试,其前身为1994年发布的 TPC-D 标准。该测试通过模拟真实商业场景(如多表关联分析、大数据量聚合),构建基于第三范式(3NF)的8表数据仓库模型,旨在评估系统处理复杂查询的综合能力。凭借对混合负载​(查询与数据更新)和 ACID 事务的严苛要求,TPC-H 成为学术界与工业界公认的权威评测工具,广泛应用于金融(信贷风险评估、反欺诈分析)、通信(用户行为挖掘)、政务(税收合规审计)等核心领域,推动商业智能技术向高可靠、高性能方向迭代。
TPC-H 的核心评价体系以 QphH@size 为核心指标,量化系统在特定数据规模(size)下每小时完成的复杂查询吞吐量(QphH),其通过测量查询响应时间、并发执行效率等参数,全面反映数据库的优化器性能与资源调度能力。相较于传统测试,TPC-H 基于真实生产环境建模,可精准评估大规模数据下的线性扩展能力与长事务稳定性,迫使厂商突破并行计算、索引优化等技术瓶颈。

整体测试流程

1. 准备测试环境。
2. 连接对等节点进行 参数配置
4.1 连接对等节点,构建测试数据库、表。
4.2 生成并导入测试数据。
4.3 对等节点执行ANALYZE TABLE
4.4 列存节点执行 Compaction。
5. 连接列存节点,执行测试
注意:
目前 TDStore 的设计中,执行 OLAP 复杂查询需要通过单独的列存访问地址,与执行 OLTP 查询的对等节点访问地址不同,使用时请注意此处。
OLTP 类访问地址:对等节点 VIP 地址及端口号可登录控制台,通过实例详情页面,在基本信息区域,内网地址字段获取对等节点 VIP,内网端口字段获取对等节点 Port。
OLAP 类访问地址:列存节点 VIP 地址及端口号可登录控制台,通过实例详情页面,在实例架构图区域,Columnar Node 中查看对应的 VIP Port 信息。

测试环境

硬件环境

节点类型
节点规格
节点个数
对等节点
16Core CPU/32GB Memory/增强型 SSD 云硬盘 300GB
3
列存节点
32Core CPU/128GB Memory/增强型 SSD 云硬盘 300GB
1
管理节点
4Core CPU/8GB Memory
3

软件版本

组件
软件版本
TDStore
v21.0.0
TPC-H
v3.0.1

参数配置

1. 连接数据库。
mysql -h ${对等节点host_ip} -P ${对等节点port} -u ${username} -p ${password}
说明:
${对等节点host_ip} ${对等节点port}分别表示对等节点 VIP 地址及端口号,可登录控制台,通过实例详情页面,在基本信息区域,内网地址字段获取对等节点 VIP,内网端口字段获取对等节点 Port。

2. 执行以下命令,进行环境调优。
# 在对等节点上执行
SET GLOBAL innodb_buffer_pool_size=2684354560;
SET GLOBAL innodb_adaptive_hash_index=0;
SET GLOBAL optimizer_switch='engine_condition_pushdown=off,batched_key_access=off,mrr=off';
SET GLOBAL max_heap_table_size=10737418240;
SET GLOBAL tmp_table_size=10737418240;
SET GLOBAL parallel_plan_cost_threshold=0;
SET GLOBAL parallel_scan_records_threshold=0;
SET GLOBAL parallel_scan_ranges_threshold=0;
SET GLOBAL histogram_generation_max_mem_size=200000000000;

# 给用户赋权用于导入数据
GRANT FILE,SUPER ON *.* TO 'user_name'@'%';FLUSH PRIVILEGES;

测试计划

准备压测工具

1. 下载压测工具 TPC-H。
2. 下载完成后解压文件,进入 TPC-H 解压后的目录。
unzip tpch-3.0.1.zip
cd TPC-H\\ V3.0.1/
3. 复制 makefile.suite
cd dbgen
cp makefile.suite Makefile
4. 修改 Makefile 文件中的 CCDATABASEMACHINEWORKLOAD 等参数定义。
vim Makefile
修改 CCDATABASEMACHINEWORKLOAD 等参数定义。
################
## CHANGE NAME OF ANSI COMPILER HERE
################
CC= gcc
# Current values for DATABASE are: INFORMIX, DB2, ORACLE,
# SQLSERVER, SYBASE, TDAT (Teradata)
# Current values for MACHINE are: ATT, DOS, HP, IBM, ICL, MVS,
# SGI, SUN, U2200, VMS, LINUX, WIN32
# Current values for WORKLOAD are: TPCH
DATABASE= MYSQL
MACHINE = LINUX
WORKLOAD = TPCH
5. 修改 tpcd.h 文件。
vim tpcd.h
在文件末尾添加新的宏定义。
#ifdef MYSQL
#define GEN_QUERY_PLAN ""
#define START_TRAN "START TRANSACTION"
#define END_TRAN "COMMIT"
#define SET_OUTPUT ""
#define SET_ROWCOUNT "limit %d;\\n"
#define SET_DBASE "use %s;\\n"
#endif
6. 编译文件。
make
7. 执行./dbgen --version,确认压测工具已安装成功。
[root@**** /data/TPC-H V3.0.1/dbgen]# ./dbgen -h
TPC-H Population Generator (Version 3.0.0 build 0)
Copyright Transaction Processing Performance Council 1994 - 2010
USAGE:
dbgen [-{vf}][-T {pcsoPSOL}]
[-s <scale>][-C <procs>][-S <step>]
dbgen [-v] [-O m] [-s <scale>] [-U <updates>]

Basic Options
===========================
-C <n> -- separate data set into <n> chunks (requires -S, default: 1)
-f -- force. Overwrite existing files
-h -- display this message
-q -- enable QUIET mode
-s <n> -- set Scale Factor (SF) to <n> (default: 1)
-S <n> -- build the <n>th step of the data/update set (used with -C or -U)
-U <n> -- generate <n> update sets
-v -- enable VERBOSE mode

Advanced Options
===========================
-b <s> -- load distributions for <s> (default: dists.dss)
-d <n> -- split deletes between <n> files (requires -U)
-i <n> -- split inserts between <n> files (requires -U)
-T c -- generate cutomers ONLY
-T l -- generate nation/region ONLY
-T L -- generate lineitem ONLY
-T n -- generate nation ONLY
-T o -- generate orders/lineitem ONLY
-T O -- generate orders ONLY
-T p -- generate parts/partsupp ONLY
-T P -- generate parts ONLY
-T r -- generate region ONLY
-T s -- generate suppliers ONLY
-T S -- generate partsupp ONLY

To generate the SF=1 (1GB), validation database population, use:
dbgen -vf -s 1

To generate updates for a SF=1 (1GB), use:
dbgen -v -U 1 -s 1

准备测试数据

1. 构建测试表。
1.1 连接数据库。
mysql -h ${对等节点host_ip} -P ${对等节点port} -u ${username} -p ${password}
1.2 执行以下命令,构建测试数据库、表。
CREATE DATABASE tpch;
USE tpch;
DROP TABLE IF EXISTS part;
CREATE TABLE part (P_PARTKEY INTEGER NOT NULL PRIMARY KEY,
P_NAME VARCHAR(55) NOT NULL,
P_MFGR CHAR(25) NOT NULL,
P_BRAND CHAR(10) NOT NULL,
P_TYPE VARCHAR(25) NOT NULL,
P_SIZE INTEGER NOT NULL,
P_CONTAINER CHAR(10) NOT NULL,
P_RETAILPRICE DECIMAL(15,2) NOT NULL,
P_COMMENT VARCHAR(23) NOT NULL );



DROP TABLE IF EXISTS supplier;
CREATE TABLE supplier (S_SUPPKEY INTEGER NOT NULL PRIMARY KEY,
S_NAME CHAR(25) NOT NULL,
S_ADDRESS VARCHAR(40) NOT NULL,
S_NATIONKEY INTEGER NOT NULL,
S_PHONE CHAR(15) NOT NULL,
S_ACCTBAL DECIMAL(15,2) NOT NULL,
S_COMMENT VARCHAR(101) NOT NULL);

DROP TABLE IF EXISTS partsupp;
CREATE TABLE partsupp (PS_PARTKEY INTEGER NOT NULL,
PS_SUPPKEY INTEGER NOT NULL,
PS_AVAILQTY INTEGER NOT NULL,
PS_SUPPLYCOST DECIMAL(15,2) NOT NULL,
PS_COMMENT VARCHAR(199) NOT NULL,
PRIMARY KEY (PS_PARTKEY, PS_SUPPKEY)
);

DROP TABLE IF EXISTS customer;
CREATE TABLE customer (C_CUSTKEY INTEGER NOT NULL PRIMARY KEY,
C_NAME VARCHAR(25) NOT NULL,
C_ADDRESS VARCHAR(40) NOT NULL,
C_NATIONKEY INTEGER NOT NULL,
C_PHONE CHAR(15) NOT NULL,
C_ACCTBAL DECIMAL(15,2) NOT NULL,
C_MKTSEGMENT CHAR(10) NOT NULL,
C_COMMENT VARCHAR(117) NOT NULL);



DROP TABLE IF EXISTS orders;
CREATE TABLE orders (O_ORDERKEY INTEGER NOT NULL PRIMARY KEY,
O_CUSTKEY INTEGER NOT NULL,
O_ORDERSTATUS CHAR(1) NOT NULL,
O_TOTALPRICE DECIMAL(15,2) NOT NULL,
O_ORDERDATE DATE NOT NULL,
O_ORDERPRIORITY CHAR(15) NOT NULL,
O_CLERK CHAR(15) NOT NULL,
O_SHIPPRIORITY INTEGER NOT NULL,
O_COMMENT VARCHAR(79) NOT NULL);

DROP TABLE IF EXISTS lineitem;
CREATE TABLE lineitem (L_ORDERKEY INTEGER NOT NULL,
L_PARTKEY INTEGER NOT NULL,
L_SUPPKEY INTEGER NOT NULL,
L_LINENUMBER INTEGER NOT NULL,
L_QUANTITY DECIMAL(15,2) NOT NULL,
L_EXTENDEDPRICE DECIMAL(15,2) NOT NULL,
L_DISCOUNT DECIMAL(15,2) NOT NULL,
L_TAX DECIMAL(15,2) NOT NULL,
L_RETURNFLAG CHAR(1) NOT NULL,
L_LINESTATUS CHAR(1) NOT NULL,
L_SHIPDATE DATE NOT NULL,
L_COMMITDATE DATE NOT NULL,
L_RECEIPTDATE DATE NOT NULL,
L_SHIPINSTRUCT CHAR(25) NOT NULL,
L_SHIPMODE CHAR(10) NOT NULL,
L_COMMENT VARCHAR(44) NOT NULL,
PRIMARY KEY (L_ORDERKEY, L_LINENUMBER)
);


DROP TABLE IF EXISTS nation;
CREATE TABLE nation (N_NATIONKEY INTEGER NOT NULL PRIMARY KEY,
N_NAME CHAR(25) NOT NULL,
N_REGIONKEY INTEGER NOT NULL,
N_COMMENT VARCHAR(152));

DROP TABLE IF EXISTS region;
CREATE TABLE region (R_REGIONKEY INTEGER NOT NULL PRIMARY KEY,
R_NAME CHAR(25) NOT NULL,
R_COMMENT VARCHAR(152));
2. 生成测试数据。
2.1 拷贝dbgendists.dss到指定目录。
mkdir tpch-100g
cd tpch-100g
cp ../dists.dss ./
cp ../dbgen ./
2.2 创建generate_data.sh
vim generate_data.sh
脚本内容如下所示,表示通过30个并发进程生成100GB测试数据集,数据按分片(chunk)策略拆分为30个独立文件。
#!/usr/bin/bash
date
for i in {1..30}
do
./dbgen -vf -s 100 -S $i -C 30 &
done
wait
./dbgen -vf -s 100 -T r
./dbgen -vf -s 100 -T n
date
2.3 执行 generate_data.sh
chmod 755 generate_data.sh
sh generate_data.sh
3. 导入数据(30并发)。
3.1 创建load_data.sh
vim load_data.sh
脚本内容如下:
#!/bin/bash
if [ "$#" -ne 6 ]; then
echo "Usage: $0 <host> <port> <user> <passwd> <dbname> <dir>"
exit 1
fi

host=$1
port=$2
user=$3
passwd=$4
dbname=$5
dir=$6
opts="-h $host -P $port -u$user -p$passwd -D$dbname"

date

for tbl in nation region
do
echo "Importing table: $tbl"
mysql $opts -e "SET GLOBAL local_infile = 1;set tdsql_bulk_load_allow_unsorted=1;set tdsql_bulk_load = 1;LOAD DATA LOCAL INFILE '${dir}/${tbl}.tbl' INTO TABLE $tbl FIELDS TERMINATED BY '|';"
done

for tbl in part customer supplier partsupp orders lineitem
do
for i in {1..30}
do
echo "Importing table data file: ${tbl}.tbl.$i"
mysql $opts -e "SET GLOBAL local_infile = 1;set tdsql_bulk_load_allow_unsorted=1;set tdsql_bulk_load = 1;LOAD DATA LOCAL INFILE '${dir}/${tbl}.tbl.$i' INTO TABLE $tbl FIELDS TERMINATED BY '|';"
done
done

wait
date
3.2 执行load_data.sh
chmod 755 load_data.sh
sh load_data.sh ${host_ip} ${port} ${username} ${password} tpch ${上一步生成测试数据所在目录}
4. 在对等节点执行ANALYZE TABLE
4.1 连接数据库。
mysql -h ${对等节点host_ip} -P ${对等节点port} -u ${username} -p ${password}
4.2 更新统计信息 。
USE tpch;
ANALYZE TABLE customer;
ANALYZE TABLE nation;
ANALYZE TABLE orders;
ANALYZE TABLE part;
ANALYZE TABLE partsupp;
ANALYZE TABLE region;
ANALYZE TABLE supplier;
ANALYZE TABLE lineitem;

ANALYZE TABLE customer UPDATE HISTOGRAM on C_CUSTKEY, C_NATIONKEY, C_ACCTBAL;
ANALYZE TABLE lineitem UPDATE HISTOGRAM on L_ORDERKEY, L_PARTKEY, L_SUPPKEY, L_LINENUMBER, L_QUANTITY, L_EXTENDEDPRICE, L_DISCOUNT, L_TAX, L_SHIPDATE, L_COMMITDATE, L_RECEIPTDATE;
ANALYZE TABLE nation UPDATE HISTOGRAM on N_NATIONKEY, N_REGIONKEY, N_NAME;
ANALYZE TABLE orders UPDATE HISTOGRAM on O_ORDERKEY, O_CUSTKEY, O_TOTALPRICE, O_ORDERDATE, O_ORDERPRIORITY, O_SHIPPRIORITY, o_orderstatus;
ANALYZE TABLE part UPDATE HISTOGRAM on P_PARTKEY, P_TYPE, P_SIZE, P_RETAILPRICE;
ANALYZE TABLE partsupp update histogram on PS_PARTKEY, PS_SUPPKEY, PS_AVAILQTY;
ANALYZE TABLE region update histogram on R_REGIONKEY, R_NAME;
ANALYZE TABLE supplier update histogram on S_SUPPKEY, S_NATIONKEY, S_ACCTBAL;
5. 列存节点执行 Compaction。
5.1 连接数据库。
mysql -h ${列存节点host_ip} -P ${列存节点port} -u ${username} -p ${password}
说明:
${列存节点host_ip} ${列存节点port}分别表示列存节点 VIP 地址及端口号,可登录控制台,通过实例详情页面,在实例架构图区域,Columnar Node 中查看对应的 VIP Port 信息。

5.2 执行 Compaction。
这一步大约需要90分钟左右,请耐心等待。
SET max_parallel_degree=32;
CALL dbms_admin.tdstore_force_full_compaction();

执行测试

1. 连接数据库。
mysql -h ${列存节点host_ip} -P ${列存节点port} -u ${username} -p ${password} -D tpch
说明:
${列存节点host_ip} ${列存节点port}分别表示列存节点 VIP 地址及端口号,可登录控制台,通过实例详情页面,在实例架构图区域,Columnar Node 中查看对应的 VIP Port 信息。
2. 依次执行以下22个 SQL。
# Q1
select
l_returnflag,
l_linestatus,
sum(l_quantity) as sum_qty,
sum(l_extendedprice) as sum_base_price,
sum(l_extendedprice * (1 - l_discount)) as sum_disc_price,
sum(l_extendedprice * (1 - l_discount) * (1 + l_tax)) as sum_charge,
avg(l_quantity) as avg_qty,
avg(l_extendedprice) as avg_price,
avg(l_discount) as avg_disc,
count(*) as count_order
from
lineitem
where
l_shipdate <= date_sub('1998-12-01', interval '119' day)
group by
l_returnflag,
l_linestatus
order by
l_returnflag,
l_linestatus;
# Q2
select s_acctbal,
s_name,
n_name,
p_partkey,
p_mfgr,
s_address,
s_phone,
s_comment
from
part,
supplier,
partsupp,
nation,
region
where
p_partkey = ps_partkey
and s_suppkey = ps_suppkey
and p_size = 19
and p_type like '%NICKEL'
and s_nationkey = n_nationkey
and n_regionkey = r_regionkey
and r_name = 'EUROPE'
and ps_supplycost = (
select
min(ps_supplycost)
from
partsupp,
supplier,
nation,
region
where
p_partkey = ps_partkey
and s_suppkey = ps_suppkey
and s_nationkey = n_nationkey
and n_regionkey = r_regionkey
and r_name = 'EUROPE'
)
order by
s_acctbal desc,
n_name,
s_name,
p_partkey
LIMIT 100;

# Q3
select
l_orderkey,
sum(l_extendedprice * (1 - l_discount)) as revenue,
o_orderdate,
o_shippriority
from
customer,
orders,
lineitem
where
c_mktsegment = 'HOUSEHOLD'
and c_custkey = o_custkey
and l_orderkey = o_orderkey
and o_orderdate < '1995-03-23'
and l_shipdate > '1995-03-23'
group by
l_orderkey,
o_orderdate,
o_shippriority
order by
revenue desc,
o_orderdate
LIMIT 10;

# Q4
select
o_orderpriority,
count(*) as order_count
from
orders
where
o_orderdate >= '1996-08-01'
and o_orderdate < date_add( '1996-08-01', interval '3' month)
and exists (
select
*
from
lineitem
where
l_orderkey = o_orderkey
and l_commitdate < l_receiptdate
)
group by
o_orderpriority
order by
o_orderpriority;

# Q5
select
n_name,
sum(l_extendedprice * (1 - l_discount)) as revenue
from
customer,
orders,
lineitem,
supplier,
nation,
region
where
c_custkey = o_custkey
and l_orderkey = o_orderkey
and l_suppkey = s_suppkey
and c_nationkey = s_nationkey
and s_nationkey = n_nationkey
and n_regionkey = r_regionkey
and r_name = 'MIDDLE EAST'
and o_orderdate >= '1994-01-01'
and o_orderdate < date_add( '1994-01-01', interval '1' year)
group by
n_name
order by
revenue desc;

# Q6
select
sum(l_extendedprice * l_discount) as revenue
from
lineitem
where
l_shipdate >= '1994-01-01'
and l_shipdate < date_add( '1994-01-01' , interval '1' year)
and l_discount between 0.03 - 0.01 and 0.03 + 0.01
and l_quantity < 24;

# Q7
select
supp_nation,
cust_nation,
l_year,
sum(volume) as revenue
from
(
select
n1.n_name as supp_nation,
n2.n_name as cust_nation,
extract(year from l_shipdate) as l_year,
l_extendedprice * (1 - l_discount) as volume
from
supplier,
lineitem,
orders,
customer,
nation n1 FORCE INDEX (PRIMARY),
nation n2 FORCE INDEX (PRIMARY)
where
s_suppkey = l_suppkey
and o_orderkey = l_orderkey
and c_custkey = o_custkey
and s_nationkey = n1.n_nationkey
and c_nationkey = n2.n_nationkey
and (
(n1.n_name = 'CANADA' and n2.n_name = 'INDONESIA')
or (n1.n_name = 'INDONESIA' and n2.n_name = 'CANADA')
)
and l_shipdate between '1995-01-01' and '1996-12-31'
) as shipping
group by
supp_nation,
cust_nation,
l_year
order by
supp_nation,
cust_nation,
l_year;

# Q8
select
o_year,
sum(case
when nation = 'INDONESIA' then volume
else 0
end) / sum(volume) as mkt_share
from
(
select
extract(year from o_orderdate) as o_year,
l_extendedprice * (1 - l_discount) as volume,
n2.n_name as nation
from
part,
supplier,
lineitem,
orders,
customer,
nation n1,
nation n2,
region
where
p_partkey = l_partkey
and s_suppkey = l_suppkey
and l_orderkey = o_orderkey
and o_custkey = c_custkey
and c_nationkey = n1.n_nationkey
and n1.n_regionkey = r_regionkey
and r_name = 'ASIA'
and s_nationkey = n2.n_nationkey
and o_orderdate between '1995-01-01' and '1996-12-31'
and p_type = 'STANDARD BRUSHED COPPER'
) as all_nations
group by
o_year
order by
o_year;

# Q9
select
nation,
o_year,
sum(amount) as sum_profit
from
(
select
n_name as nation,
extract(year from o_orderdate) as o_year,
l_extendedprice * (1 - l_discount) - ps_supplycost * l_quantity as amount
from
part,
supplier,
lineitem,
partsupp,
orders,
nation
where
s_suppkey = l_suppkey
and ps_suppkey = l_suppkey
and ps_partkey = l_partkey
and p_partkey = l_partkey
and o_orderkey = l_orderkey
and s_nationkey = n_nationkey
and p_name like '%chiffon%'
) as profit
group by
nation,
o_year
order by
nation,
o_year desc;

# Q10
select
c_custkey,
c_name,
sum(l_extendedprice * (1 - l_discount)) as revenue,
c_acctbal,
n_name,
c_address,
c_phone,
c_comment
from
customer,
orders,
lineitem,
nation
where
c_custkey = o_custkey
and l_orderkey = o_orderkey
and o_orderdate >= '1993-06-01'
and o_orderdate < date_add( '1993-06-01' ,interval '3' month)
and l_returnflag = 'R'
and c_nationkey = n_nationkey
group by
c_custkey,
c_name,
c_acctbal,
c_phone,
n_name,
c_address,
c_comment
order by
revenue desc
LIMIT 20;

# Q11
select
ps_partkey,
sum(ps_supplycost * ps_availqty) as value
from
partsupp,
supplier,
nation
where
ps_suppkey = s_suppkey
and s_nationkey = n_nationkey
and n_name = 'RUSSIA'
group by
ps_partkey having
sum(ps_supplycost * ps_availqty) > (
select
sum(ps_supplycost * ps_availqty) * 0.0000010000
from
partsupp,
supplier,
nation
where
ps_suppkey = s_suppkey
and s_nationkey = n_nationkey
and n_name = 'RUSSIA'
)
order by
value desc;

# Q12
select
l_shipmode,
sum(case
when o_orderpriority = '1-URGENT'
or o_orderpriority = '2-HIGH'
then 1
else 0
end) as high_line_count,
sum(case
when o_orderpriority <> '1-URGENT'
and o_orderpriority <> '2-HIGH'
then 1
else 0
end) as low_line_count
from
orders,
lineitem
where
o_orderkey = l_orderkey
and l_shipmode in ('RAIL', 'MAIL')
and l_commitdate < l_receiptdate
and l_shipdate < l_commitdate
and l_receiptdate >= '1996-01-01'
and l_receiptdate < date_add( '1996-01-01', interval '1' year)
group by
l_shipmode
order by
l_shipmode;

# Q13
select
c_count,
count(*) as custdist
from
(
select
c_custkey,
count(o_orderkey) as c_count
from
customer left outer join orders on
c_custkey = o_custkey
and o_comment not like '%unusual%accounts%'
group by
c_custkey
) as c_orders
group by
c_count
order by
custdist desc,
c_count desc;

# Q14
select
100.00 * sum(case
when p_type like 'PROMO%'
then l_extendedprice * (1 - l_discount)
else 0
end) / sum(l_extendedprice * (1 - l_discount)) as promo_revenue
from
lineitem,
part
where
l_partkey = p_partkey
and l_shipdate >= '1996-01-01'
and l_shipdate < date_add( '1996-01-01', interval '1' month);

# Q15
WITH revenue0 AS (
SELECT
l_suppkey AS supplier_no,
SUM(l_extendedprice * (1 - l_discount)) AS total_revenue
FROM
lineitem
WHERE
l_shipdate >= DATE '1996-01-01'
AND l_shipdate < DATE '1996-01-01' + INTERVAL '3' MONTH
GROUP BY
l_suppkey
) SELECT
s_suppkey,
s_name,
s_address,
s_phone,
total_revenue
FROM
supplier,
revenue0
WHERE
s_suppkey = supplier_no
AND total_revenue = (
SELECT
MAX(total_revenue)
FROM
revenue0)
ORDER BY
s_suppkey;

# Q16
select
p_brand,
p_type,
p_size,
count(distinct ps_suppkey) as supplier_cnt
from
partsupp,
part
where
p_partkey = ps_partkey
and p_brand <> 'Brand#31'
and p_type not like 'PROMO BRUSHED%'
and p_size in (46, 26, 17, 35, 9, 25, 37, 7)
and ps_suppkey not in (
select
s_suppkey
from
supplier
where
s_comment like '%Customer%Complaints%'
)
group by
p_brand,
p_type,
p_size
order by
supplier_cnt desc,
p_brand,
p_type,
p_size;

# Q17
select
sum(l_extendedprice) / 7.0 as avg_yearly
from
lineitem,
part
where
p_partkey = l_partkey
and p_brand = 'Brand#15'
and p_container = 'WRAP PACK'
and l_quantity < (
select
0.2 * avg(l_quantity)
from
lineitem
where
l_partkey = p_partkey
);

# Q18
select
c_name,
c_custkey,
o_orderkey,
o_orderdate,
o_totalprice,
sum(l_quantity)
from
customer,
orders,
lineitem
where
o_orderkey in (
select
l_orderkey
from
lineitem
group by
l_orderkey having
sum(l_quantity) > 315
)
and c_custkey = o_custkey
and o_orderkey = l_orderkey
group by
c_name,
c_custkey,
o_orderkey,
o_orderdate,
o_totalprice
order by
o_totalprice desc,
o_orderdate
LIMIT 100;

# Q19
select
sum(l_extendedprice* (1 - l_discount)) as revenue
from
lineitem,
part
where
(
p_partkey = l_partkey
and p_brand = 'Brand#53'
and p_container in ('SM CASE', 'SM BOX', 'SM PACK', 'SM PKG')
and l_quantity >= 8 and l_quantity <= 8+10
and p_size between 1 and 5
and l_shipmode in ('AIR', 'AIR REG')
and l_shipinstruct = 'DELIVER IN PERSON'
)
or
(
p_partkey = l_partkey
and p_brand = 'Brand#24'
and p_container in ('MED BAG', 'MED BOX', 'MED PKG', 'MED PACK')
and l_quantity >= 17 and l_quantity <= 17+10
and p_size between 1 and 10
and l_shipmode in ('AIR', 'AIR REG')
and l_shipinstruct = 'DELIVER IN PERSON'
)
or
(
p_partkey = l_partkey
and p_brand = 'Brand#13'
and p_container in ('LG CASE', 'LG BOX', 'LG PACK', 'LG PKG')
and l_quantity >= 23 and l_quantity <= 23+10
and p_size between 1 and 15
and l_shipmode in ('AIR', 'AIR REG')
and l_shipinstruct = 'DELIVER IN PERSON'
);

# Q20
select /*+set_Var(optimizer_switch='semijoin=off,subquery_materialization_cost_based=on')*/
s_name,
s_address
from
supplier,
nation
where
s_suppkey in (
select
ps_suppkey
from
partsupp
where
ps_partkey in (
select
p_partkey
from
part
where
p_name like 'lime%'
)
and ps_availqty > (
select
0.5 * sum(l_quantity)
from
lineitem
where
l_partkey = ps_partkey
and l_suppkey = ps_suppkey
and l_shipdate >= '1994-01-01'
and l_shipdate < date_add( '1994-01-01' ,interval '1' year)
)
)
and s_nationkey = n_nationkey
and n_name = 'MOZAMBIQUE'
order by
s_name;

# Q21
select
s_name,
count(*) as numwait
from
supplier,
lineitem l1,
orders,
nation
where
s_suppkey = l1.l_suppkey
and o_orderkey = l1.l_orderkey
and o_orderstatus = 'F'
and l1.l_receiptdate > l1.l_commitdate
and exists (
select
*
from
lineitem l2
where
l2.l_orderkey = l1.l_orderkey
and l2.l_suppkey <> l1.l_suppkey
)
and not exists (
select
*
from
lineitem l3
where
l3.l_orderkey = l1.l_orderkey
and l3.l_suppkey <> l1.l_suppkey
and l3.l_receiptdate > l3.l_commitdate
)
and s_nationkey = n_nationkey
and n_name = 'GERMANY'
group by
s_name
order by
numwait desc,
s_name
LIMIT 100;

# Q22
select
cntrycode,
count(*) as numcust,
sum(c_acctbal) as totacctbal
from
(
select
substr(c_phone from 1 for 2) as cntrycode,
c_acctbal
from
customer
where
substr(c_phone from 1 for 2) in
('26', '13', '21', '28', '11', '12', '19')
and c_acctbal > (
select
avg(c_acctbal)
from
customer
where
c_acctbal > 0.00
and substr(c_phone from 1 for 2) in
('26', '13', '21', '28', '11', '12', '19')
)
and not exists (
select
*
from
orders
where
o_custkey = c_custkey
)
) as custsale
group by
cntrycode
order by
cntrycode;
Q1执行示例:
tdsql [tpch]> select
-> l_returnflag,
-> l_linestatus,
-> sum(l_quantity) as sum_qty,
-> sum(l_extendedprice) as sum_base_price,
-> sum(l_extendedprice * (1 - l_discount)) as sum_disc_price,
-> sum(l_extendedprice * (1 - l_discount) * (1 + l_tax)) as sum_charge,
-> avg(l_quantity) as avg_qty,
-> avg(l_extendedprice) as avg_price,
-> avg(l_discount) as avg_disc,
-> count(*) as count_order
-> from
-> lineitem
-> where
-> l_shipdate <= date_sub('1998-12-01', interval '119' day)
-> group by
-> l_returnflag,
-> l_linestatus
-> order by
-> l_returnflag,
-> l_linestatus;
+--------------+--------------+-------------+-----------------+-------------------+---------------------+-----------+--------------+----------+-------------+
| l_returnflag | l_linestatus | sum_qty | sum_base_price | sum_disc_price | sum_charge | avg_qty | avg_price | avg_disc | count_order |
+--------------+--------------+-------------+-----------------+-------------------+---------------------+-----------+--------------+----------+-------------+
| A | F | 37734107.00 | 56586554400.73 | 53758257134.8700 | 55909065222.827692 | 25.522006 | 38273.129735 | 0.049985 | 1478493 |
| N | F | 991417.00 | 1487504710.38 | 1413082168.0541 | 1469649223.194375 | 25.516472 | 38284.467761 | 0.050093 | 38854 |
| N | O | 72862400.00 | 109281180147.22 | 103818304596.9637 | 107974813051.894902 | 25.502026 | 38248.692267 | 0.050000 | 2857122 |
| R | F | 37719753.00 | 56568041380.90 | 53741292684.6040 | 55889619119.831932 | 25.505794 | 38250.854626 | 0.050009 | 1478870 |
+--------------+--------------+-------------+-----------------+-------------------+---------------------+-----------+--------------+----------+-------------+
4 rows in set (3.49 sec)

测试结果

查询 SQL
执行时间(s)
Q1
4.96
Q2
1.02
Q3
3.67
Q4
3.36
Q5
3.99
Q6
2.19
Q7
5.59
Q8
3.70
Q9
6.57
Q10
4.79
Q11
0.98
Q12
3.22
Q13
3.94
Q14
2.66
Q15
3.17
Q16
1.14
Q17
4.22
Q18
7.36
Q19
4.20
Q20
2.92
Q21
10.21
Q22
1.66
总计
85.52