I want to create a date from a year and a month name. I wrote some code that prints the dates to the console.
Code:
import pandas as pd
import numpy as np
data = {'month': ['APRIL', 'MAY', 'JUNE', 'JULY', 'AUGUST', 'SEPTEMBER', 'OCTOBER', 'NOVEMBER', 'DECEMBER', 'JANUARY', 'FEBRUARY', 'MARCH'],
'kpi': ['SALES', 'SALES QUANTITY', 'SALES', 'SALES', 'SALES', 'SALES', 'SALES', 'SALES QUANTITY', 'SALES', 'SALES', 'SALES', 'SALES'],
'financial_year': [2022, 2023, 2023, 2023, 2023, 2023, 2023, 2023, 2023, 2023, 2023, 2023]
}
# Create DataFrame
df = pd.DataFrame(data)
print (pd.to_datetime(df['financial_year'].astype(str) + df['month'], format='%Y%B'))
It produces the following output:
0 2022-04-01
1 2023-05-01
2 2023-06-01
3 2023-07-01
4 2023-08-01
5 2023-09-01
6 2023-10-01
7 2023-11-01
8 2023-12-01
9 2023-01-01
10 2023-02-01
11 2023-03-01
However, I would like the output to look like this:
0 2022-04-30
1 2023-05-01
2 2023-06-30
3 2023-07-31
4 2023-08-31
5 2023-09-30
6 2023-10-31
7 2023-11-01
8 2023-12-31
9 2023-01-31
10 2023-02-28
11 2023-03-31
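The desired output is based on the following conditions:
If kpi is SALES, then the day in the date should be the last day of that month; otherwise it should be 01. The last day of the month should also be handled correctly for leap years. I need this to work when all the columns are of object type.
Can anyone suggest a solution to this problem?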
Posted on 2022-11-04 07:09:01
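Filter the rows where kpi is SALES and add offsets.MonthEnd to move those dates to the last day of the month. MonthEnd(0) rolls a date forward to the end of its month and leaves a date that is already a month end unchanged, so leap years are handled automatically: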
df['Date'] = pd.to_datetime(df['financial_year'].astype(str) + df['month'], format='%Y%B')
df.loc[df['kpi'].eq('SALES'), 'Date'] += pd.offsets.MonthEnd(0)
print (df)
month kpi financial_year Date
0 APRIL SALES 2022 2022-04-30
1 MAY SALES QUANTITY 2023 2023-05-01
2 JUNE SALES 2023 2023-06-30
3 JULY SALES 2023 2023-07-31
4 AUGUST SALES 2023 2023-08-31
5 SEPTEMBER SALES 2023 2023-09-30
6 OCTOBER SALES 2023 2023-10-31
7 NOVEMBER SALES QUANTITY 2023 2023-11-01
8 DECEMBER SALES 2023 2023-12-31
9 JANUARY SALES 2023 2023-01-31
10 FEBRUARY SALES 2023 2023-02-28
11 MARCH SALES 2023 2023-03-31
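The question also asks about object-typed columns and leap years. The following is a minimal sketch of the same MonthEnd approach under those conditions, not a definitive implementation. The helper name build_dates and the sample rows are hypothetical and used only for illustration; it assumes the year is stored as a four-digit string and the month as a full English month name.

```python
import pandas as pd


def build_dates(df: pd.DataFrame) -> pd.Series:
    """Hypothetical helper: first-of-month dates, moved to month end for SALES rows."""
    # astype(str) lets the concatenation work for both int and object columns.
    text = df['financial_year'].astype(str) + df['month'].astype(str)
    first_of_month = pd.to_datetime(text, format='%Y%B')
    # MonthEnd(0) rolls each date forward to the last day of its month.
    month_end = first_of_month + pd.offsets.MonthEnd(0)
    # Keep the first of the month unless kpi is exactly 'SALES'.
    is_sales = df['kpi'].eq('SALES')
    return first_of_month.where(~is_sales, month_end)


# Hypothetical sample: every column is object dtype; 2024 is a leap year.
sample = pd.DataFrame({
    'month': ['FEBRUARY', 'FEBRUARY', 'APRIL', 'MAY'],
    'kpi': ['SALES', 'SALES', 'SALES', 'SALES QUANTITY'],
    'financial_year': ['2024', '2023', '2024', '2024'],
}, dtype=object)
sample['Date'] = build_dates(sample)
print(sample)
```

With this sample, February 2024 should resolve to the 29th and February 2023 to the 28th, while the SALES QUANTITY row stays on the first of the month. Using where instead of an in-place += keeps the helper free of side effects on the input frame.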
https://stackoverflow.com/questions/74273977