I have the following dataset, with shape (118, 2).
I want to subset the data. My goal is to subset it in a way that avoids repeating the following:
removeTotal[['Firms', 'IndustrySsize']][:8]
removeTotal[['Firms', 'IndustrySsize']][8:16]
removeTotal[['Firms', 'IndustrySsize']][24:32]
removeTotal[['Firms', 'IndustrySsize']][32:40]
removeTotal[['Firms', 'IndustrySsize']][40:48]
removeTotal[['Firms', 'IndustrySsize']][48:56]
removeTotal[['Firms', 'IndustrySsize']][56:64]
In other words, I want to replace the numbers 8, 16, 24, etc. in the syntax above with n or something similar.
Firms IndustrySsize
1 3598185 0-4
2 998953 5-9
3 608502 10-19
4 5205640 0-19
5 513179 20-99
6 87563 100-499
7 5806382 0-499
8 19076 500
10 3575290 0-4
11 992281 5-9
12 600551 10-19
13 5168122 0-19
14 503033 20-99
15 85264 100-499
16 5756419 0-499
17 18636 500
19 3532058 0-4
20 978993 5-9
21 592963 10-19
22 5104014 0-19
23 481496 20-99
24 81243 100-499
25 5666753 0-499
26 17671 500
28 3575240 0-4
29 968075 5-9
30 617089 10-19
31 5160404 0-19
32 475125 20-99
33 81773 100-499
... ... ...
99 85304 100-499
100 5640407 0-499
101 17367 500
103 726862 0
104 2669870 1-4
105 1021210 5-9
106 617087 10-19
107 5035029 0-19
108 515977 20-99
109 84385 100-499
110 5635391 0-499
111 17153 500
113 709074 0
114 2680087 1-4
115 1012954 5-9
116 605693 10-19
117 5007808 0-19
118 501848 20-99
119 81347 100-499
120 5591003 0-499
121 16740 500
123 711899 0
124 2664452 1-4
125 1011849 5-9
126 600167 10-19
127 4988367 0-19
128 494357 20-99
129 80075 100-499
130 5562799 0-499
131 16378 500
Answered 2018-06-18 06:33:27
Just use numpy.split:
In [59]: np.split(df, range(8, df.shape[0], 8))
Out[59]:
[ Firms IndustrySsize
index
1 3598185 0-4
2 998953 5-9
3 608502 10-19
4 5205640 0-19
5 513179 20-99
6 87563 100-499
7 5806382 0-499
8 19076 500, Firms IndustrySsize
index
10 3575290 0-4
11 992281 5-9
12 600551 10-19
13 5168122 0-19
14 503033 20-99
15 85264 100-499
16 5756419 0-499
17 18636 500, Firms IndustrySsize
index
19 3532058 0-4
20 978993 5-9
21 592963 10-19
22 5104014 0-19
23 481496 20-99
24 81243 100-499
25 5666753 0-499
26 17671 500, Firms IndustrySsize
index
28 3575240 0-4
29 968075 5-9
30 617089 10-19
31 5160404 0-19
32 475125 20-99
33 81773 100-499]
range(8, df.shape[0], 8) generates the split positions you mentioned (8, 16, ...) up to the end of your DataFrame.
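If np.split on a DataFrame is unavailable or emits warnings in your pandas version, the same chunking can be sketched with plain positional slicing. This is a minimal sketch, not the answer's method: the helper name split_every is hypothetical, and the small frame below is only a stand-in built from the first ten rows of the question's data.

```python
import pandas as pd


def split_every(df, n):
    """Return consecutive positional slices of df, each holding at most n rows."""
    # range(0, len(df), n) yields the start offsets 0, n, 2n, ...
    return [df.iloc[start:start + n] for start in range(0, len(df), n)]


# Stand-in data: the first ten rows shown in the question.
df = pd.DataFrame({
    "Firms": [3598185, 998953, 608502, 5205640, 513179,
              87563, 5806382, 19076, 3575290, 992281],
    "IndustrySsize": ["0-4", "5-9", "10-19", "0-19", "20-99",
                      "100-499", "0-499", "500", "0-4", "5-9"],
})

chunks = split_every(df, 8)  # n replaces the hard-coded 8, 16, 24, ...
```

Here chunks[0] corresponds to removeTotal[['Firms', 'IndustrySsize']][:8], chunks[1] to the [8:16] slice, and so on. The last chunk is simply shorter when the row count is not a multiple of n.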
https://stackoverflow.com/questions/50897051