My data has the following columns:
Date       | Area_code | ProductID | Stock
------------------------------------------
2016-04-01 | 920       | 100000135 | 2.000
2016-05-01 | 920       | 100000135 | 4.125
2016-06-01 | 920       | 100000135 | 7.375
2016-07-01 | 920       | 100000135 | 7.000
2016-08-01 | 920       | 100000135 | 4.500
2016-09-01 | 920       | 100000135 | 2.000
2016-10-01 | 920       | 100000135 | 6.175
2016-11-01 | 920       | 100000135 | 4.750
2016-12-01 | 920       | 100000135 | 2.625
2017-01-01 | 920       | 100000135 | 1.625
2017-02-01 | 920       | 100000135 | 4.500
2017-03-01 | 920       | 100000135 | 4.625
2017-04-01 | 920       | 100000135 | 1.000
2016-04-01 | 920       | 100000136 | 0.100
2016-06-01 | 920       | 100000136 | 0.075
2016-07-01 | 920       | 100000136 | 0.200
2016-09-01 | 920       | 100000136 | 0.100
2017-03-01 | 920       | 100000136 | 0.050
2017-05-01 | 920       | 100000136 | 0.100
2017-06-01 | 920       | 100000136 | 0.025
2018-05-01 | 920       | 100000136 | 0.125
2018-08-01 | 920       | 100000136 | 0.200
2018-12-01 | 920       | 100000136 | 0.050
2019-02-01 | 920       | 100000136 | 0.100
2019-03-01 | 920       | 100000136 | 0.050
The data lives in a Pandas dataframe with the "Date" column as its index. What I need is to iterate over this dataframe and, inside the loop, pull into another dataframe only the rows that share the same "Area_code" and "ProductID", so that I get results like the following.
For example, in iteration 1 of the loop, for the (920, 100000135) pair, the dataframe inside the loop should return:
Date       | Stock
------------------
2016-04-01 | 2.000
2016-05-01 | 4.125
...
2017-04-01 | 1.000
Then, in iteration 2 of the loop, for the (920, 100000136) pair, the dataframe inside the loop should return:
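Date       | Stock
------------------
2016-04-01 | 0.100
2016-06-01 | 0.075
...
2019-03-01 | 0.050
In addition, if the dataframe generated above (i.e., the result for one (Area_code, ProductID) pair) has fewer than 12 records, I want to skip that iteration and return the values for the next iteration instead.
Please help with this requirement, and let me know if anything is unclear. Many thanks.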
Posted on 2020-08-02 16:51:16
My suggestion is as follows:
import pandas as pd

# Toy data; here 'Date' is a regular column rather than the index
df = pd.DataFrame({'Date': ['10/02/2020', '27/01/2020', '27/04/2020', '26/03/2020', '21/02/2020', '07/06/2020',
                            '12/04/2020'],
                   'Area_code': [920, 920, 920, 920, 921, 921, 921],
                   'product_id': [13, 13, 13, 13, 16, 16, 16],
                   'stok': [1, 2, 3, 4, 6, 7, 8]})

def extract(ac, pi):
    # Keep only the rows for the requested (Area_code, product_id) pair, e.g. (920, 13)
    rslt_df = df[(df['Area_code'] == ac) & (df['product_id'] == pi)]
    # Return None when the pair has too few records. The cutoff is "more than 3"
    # for this toy data; use `>= 12` to match the question's real data.
    return rslt_df[['Date', 'stok']] if rslt_df.shape[0] > 3 else None

# Pairs to extract: (920, 13) and (921, 16)
Area_code = [920, 921]
product_id = [13, 16]
append_data = [extract(a, b) for (a, b) in zip(Area_code, product_id)]
# Drop the pairs that were skipped (returned as None)
all_report = [x for x in append_data if x is not None]

https://stackoverflow.com/questions/63213393
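The answer above filters one pair at a time and needs the pairs listed by hand. As an alternative, not taken from the answer, here is a minimal sketch using `groupby`, which discovers the (Area_code, ProductID) pairs itself and keeps "Date" as the index, as in the question. The helper name `iter_pairs`, its `min_rows` parameter, and the shortened sample data are assumptions made for illustration only.

```python
import pandas as pd

# Hypothetical, shortened sample in the question's layout: "Date" is the index.
df = pd.DataFrame(
    {
        "Area_code": [920] * 5,
        "ProductID": [100000135] * 3 + [100000136] * 2,
        "Stock": [2.000, 4.125, 7.375, 0.100, 0.075],
    },
    index=pd.to_datetime(
        ["2016-04-01", "2016-05-01", "2016-06-01", "2016-04-01", "2016-06-01"]
    ).rename("Date"),
)


def iter_pairs(frame, min_rows=12):
    """Yield ((Area_code, ProductID), Stock-only sub-frame) per pair.

    Pairs with fewer than `min_rows` records are skipped.
    """
    for pair, group in frame.groupby(["Area_code", "ProductID"]):
        if len(group) < min_rows:
            continue  # too few records: move on to the next pair
        yield pair, group[["Stock"]]


# min_rows=3 only because this sample is tiny; the question asks for 12.
results = dict(iter_pairs(df, min_rows=3))
for pair, stock in results.items():
    print(pair)
    print(stock)
```

With the default `min_rows=12`, both pairs in this small sample would be skipped. On the full data from the question, the pair with 13 rows and the pair with 12 rows would both be kept, because only pairs with fewer than 12 records are dropped.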