Set Operations

Last updated: 2026-05-20 14:11:22

Description

Spark SQL supports three set operations: UNION [ALL | DISTINCT], INTERSECT [ALL | DISTINCT], and EXCEPT [ALL | DISTINCT].

Syntax

query { UNION | INTERSECT | EXCEPT } [ ALL | DISTINCT ] query

Parameters

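| Clause/Keyword | Description |
| --- | --- |
| UNION | Returns the union of the two query results. The default behavior is equivalent to UNION DISTINCT, which removes duplicate rows. |
| INTERSECT | Returns the intersection of the two query results. The default behavior is equivalent to INTERSECT DISTINCT, which removes duplicate rows. |
| EXCEPT | Returns the rows that exist in the first query result but not in the second. The default behavior is equivalent to EXCEPT DISTINCT, which removes duplicate rows. |
| ALL | Keeps duplicate rows; no deduplication is performed. |
| DISTINCT | Deduplicates the result so that only unique rows are returned. This is the default and can be omitted. |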
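
The difference between ALL and DISTINCT is easiest to see when both inputs contain duplicate rows. The following is a minimal sketch in plain Python that models each query result as a multiset of rows. The helper names `union`, `intersect`, and `except_` are hypothetical and are not Spark APIs. The sketch assumes standard SQL multiset semantics for the ALL variants, and it sorts results only to make the output easy to compare, since SQL result order is not guaranteed.

```python
from collections import Counter

# Hypothetical helpers: each query result is modeled as a list of rows.
# They illustrate the semantics only and do not call Spark.

def union(left, right, all_rows=False):
    """UNION ALL keeps every row; UNION [DISTINCT] keeps each row once."""
    combined = Counter(left) + Counter(right)
    return sorted(combined.elements()) if all_rows else sorted(combined)

def intersect(left, right, all_rows=False):
    """INTERSECT ALL keeps min(left count, right count) copies of each row."""
    common = Counter(left) & Counter(right)
    return sorted(common.elements()) if all_rows else sorted(common)

def except_(left, right, all_rows=False):
    """EXCEPT ALL keeps (left count - right count) copies when positive."""
    if all_rows:
        return sorted((Counter(left) - Counter(right)).elements())
    # EXCEPT [DISTINCT] drops any row that appears in the right side at all.
    return sorted(set(left) - set(right))

left = [1, 1, 1, 2, 3]
right = [1, 1, 3, 3]

print(union(left, right, all_rows=True))      # [1, 1, 1, 1, 1, 2, 3, 3, 3]
print(union(left, right))                     # [1, 2, 3]
print(intersect(left, right, all_rows=True))  # [1, 1, 3]
print(intersect(left, right))                 # [1, 3]
print(except_(left, right, all_rows=True))    # [1, 2]
print(except_(left, right))                   # [2]
```

Note that EXCEPT ALL still returns one copy of row 1 because the first input has three copies and the second has only two, while EXCEPT DISTINCT removes row 1 entirely.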

Examples

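-- UNION ALL: keeps duplicates, so this returns three rows (1, 2, 1)
SELECT 1 AS id UNION ALL SELECT 2 AS id UNION ALL SELECT 1 AS id

-- UNION DISTINCT (the default when neither ALL nor DISTINCT is written): returns two rows (1, 2)
SELECT 1 AS id UNION SELECT 2 AS id UNION SELECT 1 AS id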

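-- INTERSECT: the parentheses are needed because INTERSECT is evaluated before UNION; returns one row (2)
(SELECT 1 AS id UNION ALL SELECT 2 AS id) INTERSECT SELECT 2 AS id

-- EXCEPT: the parentheses make the left-to-right evaluation explicit; returns one row (2)
(SELECT 1 AS id UNION ALL SELECT 2 AS id) EXCEPT SELECT 1 AS id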