I have the following DataFrame:
  Zone Store Department  TTLSales
0  APV                        220
1  APV  ST12                  100
2  APV  ST12       Elec        40
3  APV  ST12    Grocery        20
4  APV  ST12        CPG        40
I would like to add a column that concatenates these values, like this:
  Zone Store Department  TTLSales                id
0  APV                        220               APV
1  APV  ST12                  100          APV.ST12
2  APV  ST12       Elec        40     APV.ST12.Elec
3  APV  ST12    Grocery        20  APV.ST12.Grocery
4  APV  ST12        CPG        40      APV.ST12.CPG
I'm new to pandas and have already spent a lot of time on this, but I can't figure it out.
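Answered 2021-07-21 19:05:42
Try:
# First, fill the NaNs in the columns with empty strings:
df[['Zone', 'Store', 'Department']] = df[['Zone', 'Store', 'Department']].fillna('')
# Then concatenate and strip the trailing separators left by the empty strings:
df['id'] = (df['Zone'] + '.' + df['Store'] + '.' + df['Department']).str.rstrip('.')
Or, if you have more than 4 columns, use apply() (performance-wise, the first method is faster than apply):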
# First, fill the NaNs in the columns with empty strings:
df[['Zone', 'Store', 'Department']] = df[['Zone', 'Store', 'Department']].fillna('')
# Then join each row and strip the trailing separators:
df['id'] = df[['Zone', 'Store', 'Department']].apply('.'.join, axis=1).str.rstrip('.')
Answered 2021-07-21 19:12:21
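You can use df.agg with str.join here.
df = df.fillna('')
# rstrip removes the trailing '.' left by the empty strings (e.g. 'APV..' becomes 'APV')
df['id'] = df[['Zone', 'Store', 'Department']].agg('.'.join, axis=1).str.rstrip('.')
Answered 2021-07-21 19:47:27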
This may be overkill, but here is another way to solve it using reduce:
from functools import reduce
cols = ['Zone', 'Store', 'Department']
# Combine two Series with '.' and strip any trailing separator
f = lambda x, y: (x + '.' + y).str.rstrip('.')
# or: f = lambda x, y: x.str.cat(y, sep='.').str.rstrip('.')
df['id'] = reduce(f, map(df.fillna('').get, cols))
print(df)
  Zone Store Department  TTLSales                id
0  APV   NaN        NaN       220               APV
1  APV  ST12        NaN       100          APV.ST12
2  APV  ST12       Elec        40     APV.ST12.Elec
3  APV  ST12    Grocery        20  APV.ST12.Grocery
4  APV  ST12        CPG        40      APV.ST12.CPG
https://stackoverflow.com/questions/68468319
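To compare the approaches from the answers above side by side, here is a minimal, self-contained sketch. It rebuilds the sample frame under the assumption that the blank cells in the question are NaN, and the variable names (filled, id_concat, id_agg, id_reduce, id_skipna) are made up for illustration. The last variant, which drops missing values instead of filling them, does not come from any of the answers; it is included only as one possible alternative.

```python
from functools import reduce

import numpy as np
import pandas as pd

# Sample data rebuilt from the question (assumption: blank cells are NaN).
df = pd.DataFrame({
    "Zone": ["APV"] * 5,
    "Store": [np.nan, "ST12", "ST12", "ST12", "ST12"],
    "Department": [np.nan, np.nan, "Elec", "Grocery", "CPG"],
    "TTLSales": [220, 100, 40, 20, 40],
})

cols = ["Zone", "Store", "Department"]
# Fill a copy of the key columns so the original frame keeps its NaN values.
filled = df[cols].fillna("")

# 1. Vectorized string concatenation, then strip trailing separators.
id_concat = (
    filled["Zone"] + "." + filled["Store"] + "." + filled["Department"]
).str.rstrip(".")

# 2. Row-wise join with agg, then strip trailing separators.
id_agg = filled.agg(".".join, axis=1).str.rstrip(".")

# 3. Fold over the columns with reduce, stripping after each step.
id_reduce = reduce(
    lambda x, y: (x + "." + y).str.rstrip("."),
    (filled[c] for c in cols),
)

# 4. Alternative not taken from the answers: drop missing values per row
#    and join whatever is left, so no stripping is needed.
id_skipna = df[cols].apply(lambda row: ".".join(row.dropna()), axis=1)

df["id"] = id_concat
print(df)
```

On this sample all four variants should give the same id column. They can differ on other data: rstrip only removes separators at the end, so a row with a missing Store but a present Department would keep a doubled separator in variants 1 to 3, while variant 4 would simply skip the missing level.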