今天继续和大家分享 HackerRank 上的 SQL 编程挑战的解题思路,这一次的题目叫做“Occupations”,属于中等难度级别,答案提交的成功率在 90% 左右。
有一个职业表(OCCUPATIONS ),它有两个字段:Name、Occupation。其中,Occupation 的值只有四种:Doctor、Professor、Singer 和 Actor 。
OCCUPATIONS 表的表结构及数据如下:
Column | Type |
---|---|
Name | String |
Occupation | String |
OCCUPATIONS 的表结构
Name | Occupation |
---|---|
Samantha | Doctor |
Julia | Actor |
Maria | Actor |
Meera | Singer |
Ashley | Professor |
Ketty | Professor |
Christeen | Professor |
Jane | Actor |
Jenny | Doctor |
Priya | Singer |
OCCUPATIONS 的表数据
要求:
输出结果:
Jenny Ashley Meera Jane
Samantha Christeen Priya Julia
NULL Ketty NULL Maria
一般遇到“将一列中的所有唯一值作为新的列名输出”这种需求,我们首先想到的就应该是行转列,我之前写过一篇介绍SQL 行转列的通用实现的文章,感兴趣的朋友点进来看看。
做行转列时分组的依据是什么呢?即依据哪个字段分组。答案是依据每个职位中姓名的排序序号作为分组条件,而每个职位里面姓名的出现的序号可通过窗口函数求得。
先来看看对职业为 “Actor” 和 “Doctor” 的数据做组内排序。
SELECT
occupation as p,
row_number () over (
PARTITION BY occupation
ORDER BY name
) AS rn,
name
FROM
OCCUPATIONS
WHERE
occupation IN ('Actor','Doctor')
排序的结果:
p rn name
------ -- --------
Actor 1 Eve
Actor 2 Jennifer
Actor 3 Ketty
Actor 4 Samantha
Doctor 1 Aamina
Doctor 2 Julia
Doctor 3 Priya
最后,我们把窗口函数和行转列结合起来,就可以实现题目所要求的输出。
WITH t AS
(SELECT
occupation,
row_number () over (
PARTITION BY occupation
ORDER BY NAME
) AS rn,
NAME
FROM
OCCUPATIONS)
SELECT
MAX(
IF(occupation = 'Doctor', NAME, NULL)
) AS Doctor,
MAX(
IF(
occupation = 'Professor',
NAME,
NULL
)
) AS Professor,
MAX(
IF(occupation = 'Singer', NAME, NULL)
) AS Singer,
MAX(IF(occupation = 'Actor', NAME, NULL)) AS Actor
FROM
t
GROUP BY rn
如果你的 MySQL 数据库的版本在 8.0 之前,那么可以用用户变量替代窗口函数实现组内排序的功能。
SELECT
MAX(
IF(occupation = 'Doctor', NAME, NULL)
) AS Doctor,
MAX(
IF(
occupation = 'Professor',
NAME,
NULL
)
) AS Professor,
MAX(
IF(occupation = 'Singer', NAME, NULL)
) AS Singer,
MAX(IF(occupation = 'Actor', NAME, NULL)) AS Actor
FROM
(SELECT
NAME,
occupation,
@rn := IF(@oc = occupation, @rn + 1, 1) AS seq,
@oc := occupation
FROM
occupations a,
(SELECT
@rn := 0,
@oc := NULL) b
ORDER BY occupation,name
) t
GROUP BY seq