I have two data files: df_session and df_session_focus. df_session has the columns ['user_id', 'group', 'start_time', 'end_time']; df_session_focus has the columns ['user_id', 'group', 'focus_leave', 'focus_enter'].
df_session
user_id  group    start_time                 end_time
13ai0    control  2020-01-21 20:39:10+00:00  2020-01-21 21:00:02+00:00
13ai0    control  2020-01-22 13:42:31+00:00  2020-01-22 14:12:31+00:00
13ai0    control  2020-01-22 14:13:27+00:00  2020-01-22 14:43:27+00:00
13ai0    control  2020-01-23 00:13:30+00:00  2020-01-23 00:43:30+00:00
I want to replace the NaNs in end_time with the values from focus_enter.
I tried existing solutions, but none of them worked for me:
1.
d2 = df_session.set_index(cols).end_time.dropna()
df_session_focus.fillna(df_session_focus.focus_enter.join(d2, on=cols))
Error: AttributeError: 'Series' object has no attribute 'join'
2.
mapping = df_session.set_index("session_id")
df_session_focus["focus_enter"] = df_session_focus.focus_enter.map(d2['end_time'])
Error: KeyError: 'end_time'
3.
df_session_focus.merge(df_session[["session_id", 'end_time']],left_on="end_time", right_on="end_time", how="left")
Error: KeyError: 'end_time'
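For reference, the three errors seem to have simple causes: join is a DataFrame method, not a Series method; in attempt 2, d2 is a Series of end_time values, so d2['end_time'] looks for an index label called 'end_time'; and attempt 3 merges on end_time, which df_session_focus does not have. Below is a minimal sketch of what attempt 2 appears to be aiming for, filling focus_enter from end_time with map. It assumes a session_id column that exists in both frames and is unique in df_session; the data is made up for illustration.

```python
import pandas as pd

# Hypothetical toy data; session_id is assumed to be unique in df_session.
df_session = pd.DataFrame({
    "session_id": [1, 2],
    "end_time": pd.to_datetime(
        ["2020-01-21 21:00:02+00:00", "2020-01-22 14:12:31+00:00"], utc=True),
})
df_session_focus = pd.DataFrame({
    "session_id": [1, 2],
    "focus_enter": pd.to_datetime(["2020-01-21 20:45:00+00:00", None], utc=True),
})

# Build a lookup Series: index = session_id, values = end_time.
lookup = df_session.set_index("session_id")["end_time"]

# Map each row's session_id to its end_time, then fill only the missing values.
df_session_focus["focus_enter"] = df_session_focus["focus_enter"].fillna(
    df_session_focus["session_id"].map(lookup)
)
```

Mapping the key column (session_id) rather than focus_enter itself is what makes the lookup work, because the lookup Series is indexed by session_id.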
Posted on 2022-04-14 06:15:24
This code works for me:
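import pandas as pd

# Left merge keeps every row of df_session_focus.
df = pd.merge(left=df_session_focus, right=df_session,
              on=['user_id', 'session_id'], how='left')
# Fill missing focus_enter values with the matching end_time.
df['focus_enter'] = df['focus_enter'].fillna(df['end_time'])
# Drop the duplicate columns created by the merge, then restore the 'group' name.
finalresult = df.drop(['group_y', 'Unnamed: 0_x', 'Unnamed: 0_y'], axis=1)
finalresult.rename(columns={'group_x':'group'}, inplace=True)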
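Here is a self-contained sketch of the same merge-then-fillna idea on toy data, in case it helps to try it out. The session_id column and all values are made up for illustration, and the suffixes argument is just one way to avoid the group_x/group_y rename step.

```python
import pandas as pd

# Hypothetical toy frames; real data would be loaded from the two files.
df_session = pd.DataFrame({
    "user_id": ["13ai0", "13ai0"],
    "session_id": [1, 2],
    "group": ["control", "control"],
    "end_time": pd.to_datetime(
        ["2020-01-21 21:00:02+00:00", "2020-01-22 14:12:31+00:00"], utc=True),
})
df_session_focus = pd.DataFrame({
    "user_id": ["13ai0", "13ai0"],
    "session_id": [1, 2],
    "group": ["control", "control"],
    "focus_enter": pd.to_datetime(["2020-01-21 20:45:00+00:00", None], utc=True),
})

# Left merge on the shared keys; only the right-hand 'group' gets a suffix.
merged = df_session_focus.merge(
    df_session, on=["user_id", "session_id"], how="left",
    suffixes=("", "_session"),
)

# Fill missing focus_enter values from the matching end_time.
merged["focus_enter"] = merged["focus_enter"].fillna(merged["end_time"])

# Drop the duplicated group column brought in by the merge.
result = merged.drop(columns=["group_session"])
```

If the goal is the other direction (filling end_time from focus_enter, as the question states), the same pattern should apply with the two column names swapped in the fillna line.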
https://stackoverflow.com/questions/71494221