I have a pandas dataset of tennis matches, where each match has a winner and a loser:
match match_date winner loser score
match1 06June player1 player2 6-2
match2 08June player3 player1 6-4
match3 07June player1 player4 5-6
match4 12June player4 player3 6-7
The dataset is organized by match, and I need to create a new dataset organized by player:
player_name previous_match result score
player1 06June won 6-2
player1 08June lost 6-4
player1 07June lost 5-6
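The problem is that each player can appear in either the winner column or the loser column, and the number of players is large.
Posted on 2020-06-10 18:55:46
You can use pd.melt to turn the winner and loser columns into one row per player per match: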
melted_df = pd.melt(df, id_vars=['match',"match_date","score"], value_vars=['winner',"loser"], var_name="result", value_name="player_name")
melted_df
Output:
match match_date score result player_name
0 match1 06June 6-2 winner player1
1 match2 08June 6-4 winner player3
2 match3 07June 5-6 winner player1
3 match4 12June 6-7 winner player4
4 match1 06June 6-2 loser player2
5 match2 08June 6-4 loser player1
6 match3 07June 5-6 loser player4
7 match4 12June 6-7 loser player3
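The result column in this output still holds the original column names (winner/loser), not the won/lost labels requested in the question. Below is a minimal sketch of one way to relabel the values and rename the columns to match the requested layout. It assumes the sample data is rebuilt by hand from the question; the helper name to_player_rows is hypothetical and not part of pandas.

```python
import pandas as pd

# Sample data rebuilt by hand from the question.
df = pd.DataFrame({
    "match": ["match1", "match2", "match3", "match4"],
    "match_date": ["06June", "08June", "07June", "12June"],
    "winner": ["player1", "player3", "player1", "player4"],
    "loser": ["player2", "player1", "player4", "player3"],
    "score": ["6-2", "6-4", "5-6", "6-7"],
})


def to_player_rows(matches: pd.DataFrame) -> pd.DataFrame:
    """Return one row per (match, player) with a won/lost label.

    Hypothetical helper for illustration only.
    """
    melted = matches.melt(
        id_vars=["match", "match_date", "score"],
        value_vars=["winner", "loser"],
        var_name="result",
        value_name="player_name",
    )
    # After melt, "result" contains the source column name; map it to the
    # labels used in the question.
    melted["result"] = melted["result"].map({"winner": "won", "loser": "lost"})
    # Rename the date column to the name used in the requested layout.
    melted = melted.rename(columns={"match_date": "previous_match"})
    return melted[["player_name", "previous_match", "result", "score"]]


player_rows = to_player_rows(df)
print(player_rows)
```

Because melt works on whole columns, this approach does not need a loop over players, which should help when the number of players is large.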
Then you can sort it by player:
melted_df.sort_values(by="player_name")
Output:
match match_date score result player_name
0 match1 06June 6-2 winner player1
2 match3 07June 5-6 winner player1
5 match2 08June 6-4 loser player1
4 match1 06June 6-2 loser player2
1 match2 08June 6-4 winner player3
7 match4 12June 6-7 loser player3
3 match4 12June 6-7 winner player4
6 match3 07June 5-6 loser player4
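Sorting on player_name alone does not by itself guarantee that each player's matches come out in date order. The sketch below adds the date as a second sort key and then pulls out one player's history. It rests on two assumptions that the question does not confirm: the dates are strings of the form day plus full English month name (such as 06June), and all matches fall within the same year. The column name parsed_date is introduced here only for illustration.

```python
import pandas as pd

# Sample data rebuilt by hand from the question.
df = pd.DataFrame({
    "match": ["match1", "match2", "match3", "match4"],
    "match_date": ["06June", "08June", "07June", "12June"],
    "winner": ["player1", "player3", "player1", "player4"],
    "loser": ["player2", "player1", "player4", "player3"],
    "score": ["6-2", "6-4", "5-6", "6-7"],
})

melted_df = pd.melt(
    df,
    id_vars=["match", "match_date", "score"],
    value_vars=["winner", "loser"],
    var_name="result",
    value_name="player_name",
)

# Assumption: dates look like "06June" and share one year, so a temporary
# parsed column is enough to order them chronologically.
melted_df["parsed_date"] = pd.to_datetime(melted_df["match_date"], format="%d%B")

# Sort by player first, then by date within each player.
history = (
    melted_df.sort_values(by=["player_name", "parsed_date"])
    .drop(columns="parsed_date")
    .reset_index(drop=True)
)

# History of a single player, in date order.
player1_history = history[history["player_name"] == "player1"]
print(player1_history)
```

If the real data spans several years, the year would need to be part of the date string (or stored separately) before parsing; otherwise matches from different years would be interleaved.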
https://stackoverflow.com/questions/62301588