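Given the following data:
import pandas as pd
from io import StringIO
s = '{"PassengerId":{"0":1,"1":2,"2":3},"Survived":{"0":0,"1":1,"2":1},"Pclass":{"0":3,"1":1,"2":3}}'
df = pd.read_json(StringIO(s))  # StringIO: newer pandas versions deprecate passing a literal JSON string
It looks like this: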
   PassengerId  Survived  Pclass
0            1         0       3
1            2         1       1
2            3         1       3
Suppose it has been melted into
m = df.melt()
print(m)
      variable  value
0  PassengerId      1
1  PassengerId      2
2  PassengerId      3
3     Survived      0
4     Survived      1
5     Survived      1
6       Pclass      3
7       Pclass      1
8       Pclass      3
I would like to know how to restore the melted m to the original df.
I tried something like the following:
m=df.melt().pivot(columns='variable', values='value').reset_index(drop=True)
m.columns.name = None
This gives
   PassengerId  Pclass  Survived
0          1.0     NaN       NaN
1          2.0     NaN       NaN
2          3.0     NaN       NaN
3          NaN     NaN       0.0
4          NaN     NaN       1.0
5          NaN     NaN       1.0
6          NaN     3.0       NaN
7          NaN     1.0       NaN
8          NaN     3.0       NaN
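As you can see, each row holds the value for only a single column, and the rest is filled with NaN values that I would like to get rid of.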
Posted on 2020-04-26 13:15:00
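Without an index argument, pivot keeps the melted row labels 0 to 8, so every value ends up in its own row. Use GroupBy.cumcount to create a new column that numbers the rows within each variable, and pass that column as the index parameter of DataFrame.pivot: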
m['new'] = m.groupby('variable').cumcount()
df = m.pivot(columns='variable', values='value', index='new')
print (df)
variable  PassengerId  Pclass  Survived
new
0                   1       3         0
1                   2       1         1
2                   3       3         1
Or:
df = (m.assign(new = m.groupby('variable').cumcount())
       .pivot(columns='variable', values='value', index='new'))
print (df)
variable  PassengerId  Pclass  Survived
new
0                   1       3         0
1                   2       1         1
2                   3       3         1
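Note that the pivoted result has its columns in alphabetical order and still carries the axis names new and variable, so it is not yet identical to the original df. The following is only a sketch of one possible cleanup, not part of the answer above. The names row_id and restored are made up for illustration, and the frame is rebuilt from a plain dict (same values as the question's JSON) to keep the example self-contained.

```python
import pandas as pd

# Same values as the question's JSON, built from a dict for brevity.
df = pd.DataFrame({
    "PassengerId": [1, 2, 3],
    "Survived": [0, 1, 1],
    "Pclass": [3, 1, 3],
})
m = df.melt()

# cumcount numbers the rows inside each 'variable' group: 0, 1, 2, 0, 1, 2, ...
# That number is the original row position, which melt() had discarded.
row_id = m.groupby("variable").cumcount()

restored = (
    m.assign(new=row_id)
     .pivot(index="new", columns="variable", values="value")
     # pivot sorts the column labels; restore the order in which they first appear in m
     .reindex(columns=m["variable"].unique())
     # drop the leftover axis names 'new' and 'variable'
     .rename_axis(index=None, columns=None)
)

print(restored)
```

This relies on melt() having kept the rows of each variable in their original order, which holds for the m shown in the question. If the original columns had mixed dtypes, the value column would likely become object after melting, and a call such as restored.infer_objects() may be needed to get the per-column dtypes back.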
https://stackoverflow.com/questions/61441377