Background
Task
How can I use Pandas to combine these rows that share a timestamp into a single row?
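When the two DataFrames are combined with the merge function, the overlapping columns are kept separately and distinguished by the suffixes _x and _y.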
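The suffix behaviour described above can be reproduced with a minimal sketch. The frame names `left` and `right` and the placeholder values are assumptions made for illustration only; they are not taken from the original data.

```python
import pandas as pd

# Hypothetical frames that share a key column ("timestamp")
# and one overlapping non-key column ("col 1").
left = pd.DataFrame({"timestamp": ["t0", "t1"], "col 1": [3610.0, 1234.0]})
right = pd.DataFrame({"timestamp": ["t1"], "col 1": [9999.0]})

# merge keeps both copies of the overlapping column and tells them
# apart with the default suffixes "_x" (left) and "_y" (right).
merged = left.merge(right, on="timestamp", how="outer")
print(list(merged.columns))
```

Because both inputs carry a column named `col 1`, the merged frame ends up with `col 1_x` and `col 1_y` rather than a single combined column, which is the problem described here.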
Example
This is the code I have:
# Example of the code I have; it does not combine rows that share a timestamp
from datetime import datetime
import pandas as pd
csv_data = 'example.csv'
timenow = datetime.now()
# reads dataframe from example.csv
historical_df = pd.read_csv(csv_data)
historical_df.set_index('timestamp', inplace=True)
# gets new data from API ### 1st API call
new_data = pd.DataFrame([timenow, 1234]).T
new_data.columns = ['timestamp', 'col 1']
new_data.set_index('timestamp', inplace=True)
# Concat current data to historical DF (written back to CSV below)
updated_df = pd.concat([historical_df, new_data])
# Save to CSV
updated_df.to_csv(csv_data)
### 2nd API call with the same time
# reads dataframe from example.csv
historical_df = pd.read_csv(csv_data)
historical_df.set_index('timestamp', inplace=True)
# gets new data from API
new_data = pd.DataFrame([timenow, 5678]).T
new_data.columns = ['timestamp', 'col 2']
new_data.set_index('timestamp', inplace=True)
# Concat current data to historical DF (written back to CSV below)
updated_df = pd.concat([historical_df, new_data])
# Save to CSV
updated_df.to_csv(csv_data)
Contents of "example.csv":
timestamp col 1 col 2
9/01/2018 12:15 3610 2420.29
Example of the desired output:
timestamp col 1 col 2
9/01/2018 12:15 3610 2420.29
<the new timestamp> 1234 5678
The result of using concat:
timestamp            col 1  col 2
9/01/2018 12:15      3610   2420.29
<the new timestamp>  1234
<the new timestamp>         5678
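The concat result above can be reproduced in isolation with the following sketch. The timestamp string and the frame names `first` and `second` are hypothetical stand-ins for the two API calls. The final `groupby(...).first()` step is not part of the original post; it is shown only as one possible way to collapse such rows afterwards, since `first()` takes the first non-null value per group.

```python
import pandas as pd

ts = "2018-01-09 12:30"  # hypothetical shared timestamp

# Each "API call" supplies a different column for the same timestamp.
first = pd.DataFrame({"col 1": [1234]}, index=pd.Index([ts], name="timestamp"))
second = pd.DataFrame({"col 2": [5678]}, index=pd.Index([ts], name="timestamp"))

# concat stacks the frames row-wise, so the shared timestamp appears twice,
# each row holding NaN in the column it did not supply.
stacked = pd.concat([first, second])
print(stacked)

# Assumption (not from the original post): collapse duplicate index values
# by taking the first non-null value in each column.
collapsed = stacked.groupby(level=0).first()
print(collapsed)
```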
The result of using merge:
timestamp col 1_x col 1_y col 2_x col 2_y
9/01/2018 12:15 3610 2420.29
<the new timestamp> 1234 5678
Note that this example shows only two non-timestamp columns; the real data actually has 15 columns.
Posted on 2018-01-09 08:46:24
As COLDSPEED mentioned in the comments, the answer is to use combine_first instead of merge:
df_with_merged_rows = df_with_new_data.combine_first(df_created_from_csv)
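A fuller sketch of the combine_first approach, following the two-call flow from the question, might look like this. The in-memory `historical` frame stands in for the data read from example.csv, and the timestamp strings and frame names are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical stand-in for the DataFrame read from example.csv,
# indexed by timestamp as in the question.
historical = pd.DataFrame(
    {"col 1": [3610.0], "col 2": [2420.29]},
    index=pd.Index(["2018-01-09 12:15"], name="timestamp"),
)

new_ts = "2018-01-09 12:30"  # hypothetical timestamp shared by both calls

# 1st API call: only "col 1" is known for the new timestamp.
first_call = pd.DataFrame({"col 1": [1234.0]}, index=pd.Index([new_ts], name="timestamp"))
# 2nd API call: only "col 2" is known for the same timestamp.
second_call = pd.DataFrame({"col 2": [5678.0]}, index=pd.Index([new_ts], name="timestamp"))

# combine_first aligns on index and columns: values from the calling frame
# take priority, and any gaps are filled from the frame passed as argument.
updated = first_call.combine_first(historical)
updated = second_call.combine_first(updated)
print(updated)
```

Because the frames are aligned on the timestamp index, the two calls land in the same row instead of producing duplicate rows or suffixed columns. In the question's workflow, the timestamp column would need to be set as the index on both frames before calling combine_first.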
https://stackoverflow.com/questions/48172146