员工架构
(ID, EMPLOYEENAME, SALARY, ORGANIZATIONID)
要解决的查询:在没有连接的情况下,查找每个组织中具有最大工资的员工姓名。
SELECT E.*
FROM EMPLOYEE E,
(SELECT EMP.ORGANIZATIONID, MAX(EMP.SALARY)
FROM EMPLOYEE EMP
GROUP BY EMP.ORGANIZATIONID) MAXSALARY
WHERE MAXSALARY.SALARY =E.SALARY
AND E.ORGANIZATIONID=EMP.ORGANIZATIONID ;
有什么方法可以避免连接吗?我使用的是Spark SQL API,而joins会导致额外的随机操作,这是非常昂贵的。有没有办法在获得最高工资的同时获得员工姓名?
发布于 2018-06-10 04:19:11
您可以将PARTITION BY
与Spark SQL一起使用,如下所示(尽管它需要一个子查询)
SELECT E.*
FROM
(SELECT EMP.EMPLOYEENAME, EMP.ORGANIZATIONID, EMP.SALARY,
row_number() OVER (PARTITION BY ORGANIZATIONID ORDER BY SALARY DESC) as rank
FROM EMPLOYEE EMP
) AS E
WHERE E.rank=1
发布于 2018-06-10 04:28:23
试试这个:
SELECT P.ORGANIZATIONID, P.EMPLOYEENAME
FROM EMPLOYEE P
WHERE P.SALARY = (SELECT MAX(E.SALARY) FROM EMPLOYEE E WHERE P.ORGANIZATIONID = E.ORGANIZATIONID)
GROUP BY P.ORGANIZATIONID, P.EMPLOYEENAME
发布于 2018-06-10 04:58:37
试试这个:
SELECT EMPLOYEENAME FROM EMPLOYEE
WHERE SALARY IN (SELECT MAX(SALARY) FROM EMPLOYEE GROUP BY ORGANIZATIONID)
https://stackoverflow.com/questions/50777974
复制相似问题