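Please, I need help. I have been working on a personal project and I am stuck.
In short, I have a list called crime_list, and I want to use it to create a new column for every state and station.
I have also created datetime ranges for the specified time period, and I would like a new dataframe to be computed from them automatically.
The images below explain this further. I need guidance and help with this.
The first two images show what I have done, and the last one shows the result I want (I did it manually in Excel, but my dataset is large, so doing it by hand is not feasible).
My data input: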
import pandas as pd
data = pd.read_excel(General, 'Dap')
data.head()
Output:
State Station
0 Abia ekee
1 Imo dal
2 Abuja lak
3 Kaduna las
4 Kano nap
List creation and datetimes:
crime_list = ['Domestic Violence','Murder','Attempted murder','Total Sexual offences','Assault GBH',
              'Common Assault', 'Robbery with aggravating circumstances', 'Common Robbery']
start_date = pd.date_range('1998-04-01', '2019-03-01', freq='MS')
end_date = pd.date_range('1998-04-30', '2019-03-31', freq='M')
Output:
DatetimeIndex(['1998-04-30', '1998-05-31', '1998-06-30', '1998-07-31',
'1998-08-31', '1998-09-30', '1998-10-31', '1998-11-30',
'1998-12-31', '1999-01-31',
...
'2018-06-30', '2018-07-31', '2018-08-31', '2018-09-30',
'2018-10-31', '2018-11-30', '2018-12-31', '2019-01-31',
'2019-02-28', '2019-03-31'],
dtype='datetime64[ns]', length=252, freq='M')
My code:
for crim in crime_list:
    for stat in data['Station']:
        data['Crime list'] = pd.Series(crim)
        data['Start Date'] = pd.Series(start_date)
        data['End Date'] = pd.Series(end_date)
Output:
State Station Crime list Start Date End Date
0 Abia ekee Shoplifting 1998-04-01 1998-04-30
1 Imo dal NaN 1998-05-01 1998-05-31
2 Abuja lak NaN 1998-06-01 1998-06-30
3 Kaduna las NaN 1998-07-01 1998-07-31
4 Kano nap NaN 1998-08-01 1998-08-31
5 Enugu nak NaN 1998-09-01 1998-09-30
6 Lagos laj NaN 1998-10-01 1998-10-31
Desired output:
State Station crime start date end date
0 Abia ekee Domestic Violence 1998-04-01 1998-04-30
1 Abia ekee Domestic Violence 1998-05-01 1998-05-31
2 Abia ekee Domestic Violence 1998-06-01 1998-06-30
3 Abia ekee Domestic Violence 1998-07-01 1998-07-31
4 Abia ekee Murder 1998-04-01 1998-04-30
Posted on 2020-10-06 01:20:17
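I think your question needs some clarification, but as I understand it, you want to repeat every crime for each state and station across the date range you have.
So: State | Station | Start Date | End Date | Crime Category, with each station repeating the crime list over the date range. I am not sure whether I have ended up making this more complicated than it is, but if I have understood correctly, take a look at the following snippet: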
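import pandas as pd
# Assumes states and stations are parallel sequences (e.g. data['State'], data['Station'])
# and that start_dates and end_dates have the same length.
result = []
for state, station in zip(states, stations):
    # The slice keeps at most len(start_dates) crimes, as in the original snippet.
    for crime in crimes[:len(start_dates)]:
        for start_date, end_date in zip(start_dates, end_dates):
            result.append([state, station, start_date, end_date, crime])
ds = pd.DataFrame(result, columns=['State', 'Stations', 'Start Date', 'End Date', 'Crime Category'])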
ds.head()
Province Stations Start Date End Date Crime Category
Eastern Cape Aberdeen 2019-01-01 2019-01-31 Domestic Violence
Eastern Cape Aberdeen 2019-02-01 2019-02-28 Domestic Violence
Eastern Cape Aberdeen 2019-03-01 2019-03-31 Domestic Violence
Eastern Cape Aberdeen 2019-04-01 2019-04-30 Domestic Violence
Eastern Cape Aberdeen 2019-05-01 2019-05-31 Domestic Violence
ds.tail()
Province Stations Start Date End Date Crime Category
Western Cape Wynberg 2019-02-01 2019-02-28 Common Assault
Western Cape Wynberg 2019-03-01 2019-03-31 Common Assault
Western Cape Wynberg 2019-04-01 2019-04-30 Common Assault
Western Cape Wynberg 2019-05-01 2019-05-31 Common Assault
Western Cape Wynberg 2019-06-01 2019-06-30 Common Assault
https://stackoverflow.com/questions/64185834
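The same expansion can probably also be written without explicit loops. The following is only a minimal sketch, not the code from the answer above: it assumes a small made-up sample in place of the Excel sheet, and the names crime_df, date_df, expanded and the temporary _key column are purely illustrative. It cross-joins the station rows with the crime list and the month ranges, so that every station is repeated for every crime and every month.

```python
import pandas as pd

# Hypothetical sample standing in for the sheet read with pd.read_excel.
data = pd.DataFrame({
    "State": ["Abia", "Imo"],
    "Station": ["ekee", "dal"],
})
crime_list = ["Domestic Violence", "Murder"]

# Month starts, with each month end derived from its start so the two always pair up.
start_date = pd.date_range("1998-04-01", "1998-07-01", freq="MS")
end_date = start_date + pd.offsets.MonthEnd(0)

crime_df = pd.DataFrame({"Crime": crime_list})
date_df = pd.DataFrame({"Start Date": start_date, "End Date": end_date})

# Cross join through a temporary constant key:
# every station row x every crime x every month.
expanded = (
    data.assign(_key=1)
    .merge(crime_df.assign(_key=1), on="_key")
    .merge(date_df.assign(_key=1), on="_key")
    .drop(columns="_key")
    .reset_index(drop=True)
)

print(expanded.head())
```

On pandas 1.2 or later, merge(..., how='cross') should give the same result without the temporary key.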