import pandas as pd
import numpy as np
import ast
pd.options.display.max_columns = 20
The season column of my DataFrame looks like this (first 20 entries):
season
0 2006-07
1 2007-08
2 2008-09
3 2009-10
4 2010-11
5 2011-12
6 2012-13
7 2013-14
8 2014-15
9 2015-16
10 2016-17
11 2017-18
12 2018-19
13 Career
14 season
15 2018-19
16 Career
17 season
18 2017-18
19 2018-19
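Each block starts with season and ends with Career. I want to replace the years with numbers that start at 1 and run up to the Career row. I want it to look like this: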
season
0 1
1 2
2 3
3 4
4 5
5 6
6 7
7 8
8 9
9 10
10 11
11 12
12 13
13 Career
14 season
15 1
16 Career
17 season
18 1
19 2
So the count should reset every time a season row appears and every time a Career row appears.
Posted on 2019-05-28 15:26:28
Use GroupBy.cumcount for the counter. The consecutive groups are created by comparing the mask from Series.isin with its shifted values:
s = df['season'].isin(['Career', 'season'])
df['new'] = np.where(s, df['season'], df.groupby(s.ne(s.shift()).cumsum()).cumcount() + 1)
print (df)
season new
0 2006-07 1
1 2007-08 2
2 2008-09 3
3 2009-10 4
4 2010-11 5
5 2011-12 6
6 2012-13 7
7 2013-14 8
8 2014-15 9
9 2015-16 10
10 2016-17 11
11 2017-18 12
12 2018-19 13
13 Career Career
14 season season
15 2018-19 1
16 Career Career
17 season season
18 2017-18 1
19 2018-19 2
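To see why the one-liner works, it can help to look at each intermediate Series separately. The sketch below uses a shortened sample with the same structure as the question's column; the names group_id, counter and steps are only illustrative and do not appear in the answer.

```python
import pandas as pd

# Shortened sample with the same structure as the question's column.
df = pd.DataFrame(
    {"season": ["2017-18", "2018-19", "Career", "season", "2018-19", "Career"]}
)

# True on the marker rows ('Career' / 'season'), False on the year rows.
s = df["season"].isin(["Career", "season"])

# The mask differs from its shifted copy wherever a run of equal values
# starts, so the cumulative sum gives one id per consecutive run.
group_id = s.ne(s.shift()).cumsum()

# Position inside each run, starting at 1.
counter = df.groupby(group_id).cumcount() + 1

# Side-by-side view of every step (illustrative helper frame).
steps = pd.DataFrame(
    {"season": df["season"], "is_marker": s, "group_id": group_id, "counter": counter}
)
print(steps)
```

Marker rows also receive a counter value here, which is why the answer keeps the original text on those rows with np.where(s, ...) instead of using the counter everywhere.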
To replace the values in the season column itself:
s = df['season'].isin(['Career', 'season'])
df.loc[~s, 'season'] = df.groupby(s.ne(s.shift()).cumsum()).cumcount() + 1
print (df)
season
0 1
1 2
2 3
3 4
4 5
5 6
6 7
7 8
8 9
9 10
10 11
11 12
12 13
13 Career
14 season
15 1
16 Career
17 season
18 1
19 2
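As a rough way to check the approach against the data from the question, the same logic can be wrapped in a small function and compared with the desired output. This is only a sketch: the function name number_seasons and its markers parameter are hypothetical, and the sample column is rebuilt by hand from the 20 rows shown in the question.

```python
import pandas as pd


def number_seasons(col, markers=("Career", "season")):
    """Return a copy of col with year labels replaced by a per-block counter.

    Hypothetical helper; rows whose value is in `markers` are left unchanged.
    """
    is_marker = col.isin(list(markers))
    # One id per consecutive run of marker / non-marker rows.
    block_id = is_marker.ne(is_marker.shift()).cumsum()
    counter = col.groupby(block_id).cumcount() + 1
    out = col.astype(object).copy()
    # Overwrite only the year rows; assignment aligns on the index.
    out[~is_marker] = counter[~is_marker]
    return out


# The 20 rows from the question: 2006-07 through 2018-19, then two short blocks.
years = [f"{y}-{(y + 1) % 100:02d}" for y in range(2006, 2019)]
season = pd.Series(
    years + ["Career", "season", "2018-19", "Career", "season", "2017-18", "2018-19"],
    name="season",
)

result = number_seasons(season)
expected = list(range(1, 14)) + ["Career", "season", 1, "Career", "season", 1, 2]
print(result.tolist() == expected)
```

Returning a new Series instead of assigning through df.loc leaves the original column untouched, which makes it easier to compare the values before and after.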
https://stackoverflow.com/questions/56336934