SQL: How to query for "people he or she may know"?

Content sourced from Stack Overflow, translated and used under the CC BY-SA 3.0 license.


I have a table that looks like this:

FRIEND (TABLE)
-----------------------------------
PLAYER_ID(PK,FK)   FRIEND_ID(PK,FK)
-----------------------------------
1                  2                 // 1 knows 2
2                  1                 // 2 knows 1
1                  3                 // 1 knows 3
2                  3                 // 2 knows 3
2                  4                 // 2 knows 4
2                  5                 // 2 knows 5 // updated
3                  5                 // 3 knows 5 // updated
1                  100
1                  200
1                  300
100                400
200                400
300                400
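To try the queries in this post, the sample data can be loaded with something like the sketch below. The INT column types are an assumption (the post only shows the values), and the foreign keys are left out because the referenced player table is not shown. The last six rows are the ones the post adds later, so the earlier intermediate results were produced without them.

```sql
-- Assumed schema: INT columns and a composite primary key,
-- as suggested by the (PK,FK) markers in the table header.
-- Foreign keys are omitted because the player table is not shown.
CREATE TABLE FRIEND (
    PLAYER_ID INT NOT NULL,
    FRIEND_ID INT NOT NULL,
    PRIMARY KEY (PLAYER_ID, FRIEND_ID)
);

-- Original rows from the question
INSERT INTO FRIEND (PLAYER_ID, FRIEND_ID) VALUES
    (1, 2), (2, 1), (1, 3), (2, 3), (2, 4), (2, 5), (3, 5);

-- Rows added later in the post (the paths 1 -> 100/200/300 -> 400)
INSERT INTO FRIEND (PLAYER_ID, FRIEND_ID) VALUES
    (1, 100), (1, 200), (1, 300),
    (100, 400), (200, 400), (300, 400);
```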

And I have a view that looks like this:

ACQUAINTANCE (VIEW)
-----------------------------------
PLAYER_ID(PK,FK)   FRIEND_ID(PK,FK)
-----------------------------------
1                  2                 // 1 knows 2
2                  1                 // 2 knows 1

1 knows 2
2 knows 3

but it is excluded because

1 already knows 3

Basically, for a given player, I want to select:

people he or she may know
who are not himself or herself // this is nothing but obvious
who are not already known to him or her

Result:

by 1 -> 2 -> 4 and 1 -> 3 -> 5

4 and 5 can be suggested for 1 as 'people you may know'

Ordering by the number of mutual friends would be perfect,
but I don't think I could understand it even if someone showed me how. Sorry.

First, join the table with itself:

SELECT *
FROM FRIEND F1 INNER JOIN FRIEND F2 ON F1.FRIEND_ID = F2.PLAYER_ID

Result:

+-----------+-----------+-----------+-----------+
| PLAYER_ID | FRIEND_ID | PLAYER_ID | FRIEND_ID |
+-----------+-----------+-----------+-----------+
|         1 |         2 |         2 |         1 |
|         1 |         2 |         2 |         3 |
|         1 |         2 |         2 |         4 |
|         1 |         2 |         2 |         5 |
|         1 |         3 |         3 |         5 |
|         2 |         1 |         1 |         2 |
|         2 |         1 |         1 |         3 |
|         2 |         3 |         3 |         5 |
+-----------+-----------+-----------+-----------+

F2.FRIEND_ID only

SELECT F2.FRIEND_ID
FROM FRIEND F1 INNER JOIN FRIEND F2 ON F1.FRIEND_ID = F2.PLAYER_ID

Result:

+-----------+
| FRIEND_ID |
+-----------+
|         1 |
|         3 |
|         4 |
|         5 |
|         5 |
|         2 |
|         3 |
|         5 |
+-----------+

Only for player 1

SELECT F2.FRIEND_ID
FROM FRIEND F1 INNER JOIN FRIEND F2 ON F1.FRIEND_ID = F2.PLAYER_ID
WHERE F1.PLAYER_ID = 1;

Result:

+-----------+
| FRIEND_ID |
+-----------+
|         1 |
|         3 |
|         4 |
|         5 |
|         5 |
+-----------+

And not 1 himself

SELECT F2.FRIEND_ID
FROM FRIEND F1 INNER JOIN FRIEND F2 ON F1.FRIEND_ID = F2.PLAYER_ID
WHERE F1.PLAYER_ID = 1 
AND F2.FRIEND_ID != 1;

Result:

+-----------+
| FRIEND_ID |
+-----------+
|         3 |
|         4 |
|         5 |
|         5 |
+-----------+

And not those 1 already knows directly

SELECT F2.FRIEND_ID
FROM FRIEND F1 INNER JOIN FRIEND F2 ON F1.FRIEND_ID = F2.PLAYER_ID
WHERE F1.PLAYER_ID = 1
AND F2.FRIEND_ID != 1
AND F2.FRIEND_ID NOT IN (SELECT FRIEND_ID FROM FRIEND WHERE PLAYER_ID = 1);

Result:

+-----------+
| FRIEND_ID |
+-----------+
|         4 |
|         5 |
|         5 |
+-----------+

With the following paths added

1 -> 100 -> 400
1 -> 200 -> 400
1 -> 300 -> 400

the last query prints (again)

+-----------+
| FRIEND_ID |
+-----------+
|         4 |
|         5 |
|         5 |
|       400 |
|       400 |
|       400 |
+-----------+

Finally, I have the candidates: 4, 5, 400

Adding DISTINCT is certainly enough for the primary goal

SELECT DISTINCT F2.FRIEND_ID
FROM FRIEND F1 INNER JOIN FRIEND F2 ON F1.FRIEND_ID = F2.PLAYER_ID
WHERE F1.PLAYER_ID = 1
AND F2.FRIEND_ID != 1
AND F2.FRIEND_ID NOT IN (SELECT FRIEND_ID FROM FRIEND WHERE PLAYER_ID = 1);

Result:

+-----------+
| FRIEND_ID |
+-----------+
|         4 |
|         5 |
|       400 |
+-----------+

Now, we need to order by the number of mutual friends.

Here is the number of mutual friends for each candidate.

+-----------+
| FRIEND_ID |
+-----------+
|         4 | 1 (1 -> 2 -> 4)
|         5 | 2 (1 -> 2 -> 5, 1 -> 3 -> 5)
|       400 | 3 (1 -> 100 -> 400, 1 -> 200 -> 400, 1 -> 300 -> 400)
+-----------+

Counting them with GROUP BY:

SELECT F2.FRIEND_ID, COUNT(*)
FROM FRIEND F1 INNER JOIN FRIEND F2 ON F1.FRIEND_ID = F2.PLAYER_ID
WHERE F1.PLAYER_ID = 1
AND F2.FRIEND_ID != 1
AND F2.FRIEND_ID NOT IN (SELECT FRIEND_ID FROM FRIEND WHERE PLAYER_ID = 1)
GROUP BY F2.FRIEND_ID;

Result:

+-----------+----------+
| FRIEND_ID | COUNT(*) |
+-----------+----------+
|         4 |        1 |
|         5 |        2 |
|       400 |        3 |
+-----------+----------+

I see, so this is how it goes:

SELECT F2.FRIEND_ID, COUNT(*) AS MFC
FROM FRIEND F1 INNER JOIN FRIEND F2 ON F1.FRIEND_ID = F2.PLAYER_ID
WHERE F1.PLAYER_ID = 1
AND F2.FRIEND_ID != 1
AND F2.FRIEND_ID NOT IN (SELECT FRIEND_ID FROM FRIEND WHERE PLAYER_ID = 1)
GROUP BY F2.FRIEND_ID
ORDER BY MFC DESC;

Result:

+-----------+-----+
| FRIEND_ID | MFC |
+-----------+-----+
|       400 |   3 |
|         5 |   2 |
|         4 |   1 |
+-----------+-----+

Can anybody please confirm this? Is this query optimal? Are there any possible performance problems when it is made into a view?
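On the view question, one possible shape is sketched below. It generalizes the final query from one player to all players by grouping on both columns. This is an illustration, not a tuned implementation: the view reuses the name ACQUAINTANCE from the question, the MFC column in the view and the alias K are additions for this sketch, and NOT EXISTS stands in for NOT IN so that the "already known" check can be correlated with each player.

```sql
-- Sketch: one row per (player, suggested person) with the mutual-friend count.
CREATE VIEW ACQUAINTANCE AS
SELECT F1.PLAYER_ID,
       F2.FRIEND_ID,
       COUNT(*) AS MFC                       -- number of mutual friends
FROM FRIEND F1
INNER JOIN FRIEND F2 ON F1.FRIEND_ID = F2.PLAYER_ID
WHERE F2.FRIEND_ID <> F1.PLAYER_ID           -- not the player himself or herself
  AND NOT EXISTS (                           -- not already known to the player
        SELECT 1
        FROM FRIEND K
        WHERE K.PLAYER_ID = F1.PLAYER_ID
          AND K.FRIEND_ID = F2.FRIEND_ID)
GROUP BY F1.PLAYER_ID, F2.FRIEND_ID;

-- Usage: suggestions for player 1, most mutual friends first
SELECT FRIEND_ID, MFC
FROM ACQUAINTANCE
WHERE PLAYER_ID = 1
ORDER BY MFC DESC;
```

Whether the PLAYER_ID filter is applied before or after the grouping depends on the database engine, so the performance of such a view is worth checking with the engine's query plan output on realistic data.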

Answer:

-- `friends` here stands for the question's FRIEND table
SELECT `friend_id` AS `possible_friend_id`
FROM `friends`
WHERE `player_id` IN (          -- select those who are known
    SELECT `friend_id`          -- by friends of #1
    FROM `friends`
    WHERE `player_id` = 1)
AND `friend_id` NOT IN (        -- but not those who are known by #1
    SELECT `friend_id`
    FROM `friends`
    WHERE `player_id` = 1)
AND NOT `friend_id` = 1         -- and are not #1 himself
                                -- if one is known by multiple people
                                -- he'll appear multiple times in the list
GROUP BY `possible_friend_id`   -- so we group
ORDER BY COUNT(*) DESC;         -- and order by the number of repetitions
