I want to transform the raw data in df1 into the format of df2:
import pandas as pd
df1=pd.DataFrame([["20180105","abcdefg"],["","sdasdas"],["20180211","asdasfsd"],["","asdfg"],["","sdada"]],columns=["A","B"])
df2=pd.DataFrame([["20180105","abcdefgsdasdas"],["20180211","asdasfsdasdfgsdada"]],columns=["A","B"])
Posted on 2018-08-01 07:21:44
You can use groupby, with sum to concatenate the strings:
import numpy as np
df1.replace({'A':{'':np.nan}}).ffill().groupby('A', as_index=False).sum()
A B
0 20180105 abcdefgsdasdas
1 20180211 asdasfsdasdfgsdada
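Note that I first got rid of the empty strings in column A by replacing them with NaN, and then forward-filled them with ffill(), so that every row carries the A value of the group it belongs to.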
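To make the chain above easier to follow, here is a minimal sketch that runs the same idea one step at a time. The names a_filled and result are only illustrative, and it joins the strings with ''.join instead of sum, which is an assumption made to keep the behaviour the same across pandas versions rather than the method used in the answer.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame(
    [["20180105", "abcdefg"], ["", "sdasdas"], ["20180211", "asdasfsd"],
     ["", "asdfg"], ["", "sdada"]],
    columns=["A", "B"],
)

# Step 1: turn the empty strings in column A into missing values.
a_filled = df1["A"].replace("", np.nan)

# Step 2: forward-fill, so each row inherits the last non-empty A above it.
a_filled = a_filled.ffill()

# Step 3: group column B by the filled key and concatenate each group's strings.
result = df1["B"].groupby(a_filled).agg("".join).reset_index()

print(result)
```

Because the grouping key is the filled column A itself, the result is sorted by A, which matches the original row order here only because the dates are already ascending.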
Posted on 2018-08-01 07:32:40
You can also use agg + ''.join:
g = (df1.A != '').cumsum()
df1.groupby(g, as_index=False).agg(''.join)
A B
0 20180105 abcdefgsdasdas
1 20180211 asdasfsdasdfgsdada
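The grouping key in this approach may not be obvious at first sight, so here is a small sketch that prints it. The names is_start and group_id are hypothetical, and passing the key as a plain NumPy array together with an explicit per-column dict is an assumption made so the result does not depend on how a given pandas version treats as_index=False with an external key.

```python
import pandas as pd

df1 = pd.DataFrame(
    [["20180105", "abcdefg"], ["", "sdasdas"], ["20180211", "asdasfsd"],
     ["", "asdfg"], ["", "sdada"]],
    columns=["A", "B"],
)

# True on every row that starts a new block (non-empty A).
is_start = df1["A"] != ""

# The running count of block starts gives each block its own integer label.
group_id = is_start.cumsum()
print(group_id.tolist())  # [1, 1, 2, 2, 2]

# Join the strings of both columns within each block; the empty strings
# in A contribute nothing, so A keeps the single date of its block.
result = (
    df1.groupby(group_id.to_numpy())
       .agg({"A": "".join, "B": "".join})
       .reset_index(drop=True)
)

print(result)
```

Unlike the ffill variant, this key keeps blocks apart even when the same A value shows up again later in the data.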
https://stackoverflow.com/questions/51623353