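I want to use pandas to reshape a pandas.DataFrame into a long (unpivoted) format, so that all phone numbers are listed under an MSISD column, and play_id holds the name of the column the number came from (phone1, phone2, and so on).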
The df is:
import pandas as pd

df = pd.DataFrame({
    'id': ['1', '2', '3'],
    'play_id': ['20002075', '601731', '601731'],
    'phone1': ['0900031349', '', ''],
    'phone2': ['090891349', '', ''],
    'phone3': ['', '', ''],
    'phone4': ['', '', ''],
    'phone5': ['', '088235311', ''],
    'phone6': ['', '', ''],
    'phone7': ['', '', '088235311']
})
The expected output should be:
id play_id msisd
1: 1 phone1 0900031349
2: 2 phone2 090891349

Posted on 2019-05-10 13:34:13
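Use DataFrame.melt, then remove the rows that hold empty strings with boolean indexing:
# Unpivot the phone columns; 'id' and 'play_id' stay as identifier columns
df1 = df.melt(['id','play_id'], value_name='val', var_name='phone')
# Keep only the rows that actually contain a phone number
df1 = df1[df1['val'] != '']
# If the empty values are NaN rather than empty strings, use this instead:
#df1 = df1[df1['val'].notna()]
print(df1)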
   id   play_id   phone         val
0   1  20002075  phone1  0900031349
3   1  20002075  phone2   090891349
13  2    601731  phone5   088235311
20  3    601731  phone7   088235311

Or use DataFrame.stack after replacing the empty strings with missing values:
import numpy as np

# Empty strings become NaN so that stack() can leave them out of the result
df1 = (df.set_index(['id','play_id'])
         .replace('', np.nan)
         .stack()
         .reset_index(name='val')
         .rename(columns={'level_2':'phone'})
      )
print(df1)
  id   play_id   phone         val
0  1  20002075  phone1  0900031349
1  1  20002075  phone2   090891349
2  2    601731  phone5   088235311
3  3    601731  phone7   088235311

https://stackoverflow.com/questions/56071214
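For comparison, here is a self-contained sketch that runs both approaches on the question's data and names the value column msisd, as the question requests. The variable names long_melt and long_stack and the explicit dropna() call are my own additions, not part of the original answer. As far as I know, whether stack() drops missing values by default depends on the pandas version, so dropna() is there as a defensive step.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'id': ['1', '2', '3'],
    'play_id': ['20002075', '601731', '601731'],
    'phone1': ['0900031349', '', ''],
    'phone2': ['090891349', '', ''],
    'phone3': ['', '', ''],
    'phone4': ['', '', ''],
    'phone5': ['', '088235311', ''],
    'phone6': ['', '', ''],
    'phone7': ['', '', '088235311'],
})

# Variant 1: melt, filter out empty strings, then sort and renumber the rows.
long_melt = (
    df.melt(id_vars=['id', 'play_id'], var_name='phone', value_name='msisd')
      .loc[lambda d: d['msisd'] != '']
      .sort_values(['id', 'phone'])
      .reset_index(drop=True)
)

# Variant 2: stack; dropna() is an assumption-free way to discard the
# missing values regardless of what stack() does with them by default.
long_stack = (
    df.set_index(['id', 'play_id'])
      .replace('', np.nan)
      .stack()
      .dropna()
      .rename_axis(['id', 'play_id', 'phone'])
      .reset_index(name='msisd')
)

print(long_melt)
print(long_stack)
```

Both frames should contain the same four rows. Naming the third index level with rename_axis avoids relying on the auto-generated level_2 column name.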