Pandas: Using Datetime with conditional statements

Stack Overflow user
Asked on 2020-10-03 23:24:33
Answers: 2 | Views: 196 | Followers: 0 | Votes: 0

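Please, I need help. I have been working on a personal project and I am stuck.

In short, I have a list called crime_list, and I want to use it to create a new column that repeats for every state and station.

I have also created datetime ranges for the period I care about, and I want a new dataframe to be computed from them automatically.

The outputs below explain this further. I would appreciate guidance and help with this.

The first two outputs show what I have done, while the last one shows the result I want (I produced it by hand in Excel, but my dataset is large and doing it manually is not feasible).

My data input: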

data = pd.read_excel(General, 'Dap')

data.head()

Output:

    State   Station
0   Abia    ekee
1   Imo     dal
2   Abuja   lak
3   Kaduna  las
4   Kano    nap

List creation and datetime ranges:

crime = ['Domestic Violence','Murder','Attempted murder','Total Sexual offences','Assault GBH',
     'Common Assault', 'Robbery with aggravating circumstances', 'Common Robbery']

start_date = pd.date_range('1998-04-01', '2019-03-01', freq='MS')
end_date = pd.date_range('1998-04-30', '2019-03-31', freq='M')

Output:

DatetimeIndex(['1998-04-30', '1998-05-31', '1998-06-30', '1998-07-31',
           '1998-08-31', '1998-09-30', '1998-10-31', '1998-11-30',
           '1998-12-31', '1999-01-31',
           ...
           '2018-06-30', '2018-07-31', '2018-08-31', '2018-09-30',
           '2018-10-31', '2018-11-30', '2018-12-31', '2019-01-31',
           '2019-02-28', '2019-03-31'],
          dtype='datetime64[ns]', length=252, freq='M')

My code:

for crim in crime_list:
    for stat in data['Station']:
        data['Crime list'] = pd.Series(crim)
        data['Start Date'] = pd.Series(start_date)
        data['End Date'] = pd.Series(end_date)

Output:

    State   Station  Crime list   Start Date  End Date
0   Abia    ekee     Shoplifting  1998-04-01  1998-04-30
1   Imo     dal      NaN          1998-05-01  1998-05-31
2   Abuja   lak      NaN          1998-06-01  1998-06-30
3   Kaduna  las      NaN          1998-07-01  1998-07-31
4   Kano    nap      NaN          1998-08-01  1998-08-31
5   Enugu   nak      NaN          1998-09-01  1998-09-30
6   Lagos   laj      NaN          1998-10-01  1998-10-31

Desired output:

    State  Station  crime              start date  end date
0   Abia   ekee     Domestic Violence  1998-04-01  1998-04-30
1   Abia   ekee     Domestic Violence  1998-05-01  1998-05-31
2   Abia   ekee     Domestic Violence  1998-06-01  1998-06-30
3   Abia   ekee     Domestic Violence  1998-07-01  1998-07-31
4   Abia   ekee     Murder             1998-04-01  1998-04-30

2 Answers

Stack Overflow user

Answered on 2020-10-06 00:08:50

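If you want to generate a dataframe that contains every combination of the three parts of your data (state/station, start/end date, and crime type), so that you can fill in values for each row later, you can do the following.

I am not sure whether this is what you are looking for.

You can do it like this: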

import pandas as pd

# if I understood correctly, you read your data from an Excel file with two
# columns, State and Station; you can just replace the following line with
# your read_excel call
data= pd.DataFrame(dict(State=['Abia', 'Imo', 'Abuja', 'Kaduna', 'Kano'], Station=['ekee', 'dal', 'lak', 'las', 'nap']))
crime = ['Domestic Violence','Murder','Attempted murder','Total Sexual offences','Assault GBH', 'Common Assault', 'Robbery with aggravating circumstances', 'Common Robbery']

# create a helper dataframe for the crime types
df_crime= pd.DataFrame(dict(crime=crime))

# create a helper dataframe for the start and end dates
start_date = pd.date_range('1998-04-01', '2019-03-01', freq='MS')
end_date = pd.date_range('1998-04-30', '2019-03-31', freq='M')
df_date= pd.DataFrame(dict(start=start_date, end=end_date))

# add fake join columns to produce a cross product of all three
data['fake_join']= 1
df_crime['fake_join']= 1
df_date['fake_join']= 1

# now merge the three data frames and remove the fake join column
# (drop returns a new frame, so the result has to be assigned back)
df_result= data.merge(df_date, on='fake_join')
df_result= df_result.merge(df_crime, on='fake_join')
df_result= df_result.drop(columns=['fake_join'])

# let's check the output
df_result.sort_values(['start', 'Station']).head(20)

      State Station  ...        end                                   crime
2016    Imo     dal  ... 1998-04-30                       Domestic Violence
2017    Imo     dal  ... 1998-04-30                                  Murder
2018    Imo     dal  ... 1998-04-30                        Attempted murder
2019    Imo     dal  ... 1998-04-30                   Total Sexual offences
2020    Imo     dal  ... 1998-04-30                             Assault GBH
2021    Imo     dal  ... 1998-04-30                          Common Assault
2022    Imo     dal  ... 1998-04-30  Robbery with aggravating circumstances
2023    Imo     dal  ... 1998-04-30                          Common Robbery
0      Abia    ekee  ... 1998-04-30                       Domestic Violence
1      Abia    ekee  ... 1998-04-30                                  Murder
2      Abia    ekee  ... 1998-04-30                        Attempted murder
3      Abia    ekee  ... 1998-04-30                   Total Sexual offences
4      Abia    ekee  ... 1998-04-30                             Assault GBH
5      Abia    ekee  ... 1998-04-30                          Common Assault
6      Abia    ekee  ... 1998-04-30  Robbery with aggravating circumstances
7      Abia    ekee  ... 1998-04-30                          Common Robbery
4032  Abuja     lak  ... 1998-04-30                       Domestic Violence
4033  Abuja     lak  ... 1998-04-30                                  Murder
4034  Abuja     lak  ... 1998-04-30                        Attempted murder
4035  Abuja     lak  ... 1998-04-30                   Total Sexual offences

[20 rows x 5 columns]
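
On pandas 1.2 or later, the same cross product can probably be written without the helper column by passing `how='cross'` to `merge`. The following is only a minimal sketch under that assumption, not the code from the answer above. It uses a shortened three-month range and a two-item crime list purely for illustration, and the name `df_cross` is made up here. It merges the crime types before the dates, so the rows come out in the station, crime, date order shown in the question's desired output.

```python
import pandas as pd

# Small illustrative inputs (an assumed subset of the question's data)
data = pd.DataFrame({"State": ["Abia", "Imo"], "Station": ["ekee", "dal"]})
df_crime = pd.DataFrame({"crime": ["Domestic Violence", "Murder"]})

# Month starts, plus the matching month ends, for a short assumed range
start = pd.date_range("1998-04-01", periods=3, freq="MS")
df_date = pd.DataFrame({"start": start, "end": start + pd.offsets.MonthEnd(0)})

# how="cross" (pandas >= 1.2) pairs every left row with every right row
df_cross = data.merge(df_crime, how="cross").merge(df_date, how="cross")

print(len(df_cross))      # 2 stations * 2 crimes * 3 months
print(df_cross.head(4))
```

Swapping the order of the two merges would group the rows by date before crime, as in the answer's own output.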
Votes: 0

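Stack Overflow user

Answered on 2020-10-06 01:20:17

I think your question needs some clarification, but as I understand it, you want to map the crimes repeatedly onto the provinces and stations across the date range you have.

So: State | Station | Start Date | End Date | Crime Category, with the crime list repeated for each station over the date range. I am not sure whether I ended up making this more complicated than it is, but if I am right, take a look at the following snippet: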

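import pandas as pd

# Assumes states, stations, crimes, start_dates and end_dates are already
# defined, with states/stations and start_dates/end_dates aligned pairwise.
result = []

for state, station in zip(states, stations):
    # the slice limits the crimes to at most len(start_dates) entries
    for crime in crimes[:len(start_dates)]:
        for start_date, end_date in zip(start_dates, end_dates):
            result.append([state, station, start_date, end_date, crime])

ds = pd.DataFrame(result, columns=['State', 'Stations', 'Start Date', 'End Date', 'Crime Category'])
ds.head()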
Province      Stations  Start Date  End Date    Crime Category
Eastern Cape  Aberdeen  2019-01-01  2019-01-31  Domestic Violence
Eastern Cape  Aberdeen  2019-02-01  2019-02-28  Domestic Violence
Eastern Cape  Aberdeen  2019-03-01  2019-03-31  Domestic Violence
Eastern Cape  Aberdeen  2019-04-01  2019-04-30  Domestic Violence
Eastern Cape  Aberdeen  2019-05-01  2019-05-31  Domestic Violence
ds.tail()
Province      Stations  Start Date  End Date    Crime Category
Western Cape  Wynberg   2019-02-01  2019-02-28  Common Assault
Western Cape  Wynberg   2019-03-01  2019-03-31  Common Assault
Western Cape  Wynberg   2019-04-01  2019-04-30  Common Assault
Western Cape  Wynberg   2019-05-01  2019-05-31  Common Assault
Western Cape  Wynberg   2019-06-01  2019-06-30  Common Assault
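
The triple loop in this answer can likely be condensed with `itertools.product` from the standard library. This is a hedged sketch rather than the answerer's code: the answer never shows how `states`, `stations`, `crimes`, `start_dates` and `end_dates` are built, so the values below are hypothetical stand-ins, and the slice on `crimes` is left out.

```python
from itertools import product

import pandas as pd

# Hypothetical inputs; the answer does not show how these are defined
states = ["Eastern Cape", "Western Cape"]
stations = ["Aberdeen", "Wynberg"]
crimes = ["Domestic Violence", "Murder"]
start_dates = pd.date_range("2019-01-01", periods=3, freq="MS")
end_dates = start_dates + pd.offsets.MonthEnd(0)

# product() varies its rightmost argument fastest, which mirrors the
# nesting order station -> crime -> date of the explicit loops
rows = [
    (state, station, start, end, crime)
    for (state, station), crime, (start, end) in product(
        zip(states, stations), crimes, zip(start_dates, end_dates)
    )
]

ds = pd.DataFrame(
    rows,
    columns=["State", "Stations", "Start Date", "End Date", "Crime Category"],
)
print(ds.head())
```

Pairing states with stations and start dates with end dates through `zip` keeps each pair together, so only the three independent dimensions are multiplied out.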
Votes: 0
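The original content of this page is provided by Stack Overflow. Translation support is provided by Tencent Cloud's Xiaowei IT-domain engine.
Original link:

https://stackoverflow.com/questions/64185834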
