Q: For loop not working when appending rows

Stack Overflow user

Asked 2021-10-10 14:19:19

1 answer, 48 views, 0 followers, 0 votes

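I am trying to loop over my DataFrame and add 3 extra rows for each element in df.con, but the result only contains the rows for the second element, US; the rows for UK are missing.

Here is the code.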

import pandas as pd
d = { 'year': [2019,2019,2019,2020,2020,2020], 
      'age group': ['(0-14)','(14-50)','(50+)','(0-14)','(14-50)','(50+)'], 
      'con': ['UK','UK','UK','US','US','US'],
      'population': [10,20,300,400,1000,2000]}
df = pd.DataFrame(data=d)
df2 = df.copy()
df

   year age group con  population
0  2019    (0-14)  UK          10
1  2019   (14-50)  UK          20
2  2019     (50+)  UK         300
3  2020    (0-14)  US         400
4  2020   (14-50)  US        1000
5  2020     (50+)  US        2000

n_df_2 = df.copy()
con_list = [x for x in df.con]
year_list = [x for x in df.year]
age_list = [x for x in df['age group']]
new_list = ['young vs child','old vs young', 'unemployed vs working']

for country in df.con:

      bev_child =  n_df_2[(n_df_2['con'].str.contains(country)) & (n_df_2['age group'].str.contains(age_list[0]))]
      bev_work =  n_df_2[(n_df_2['con'].str.contains(country)) & (n_df_2['age group'].str.contains(age_list[1]))]
      bev_old =  n_df_2[(n_df_2['con'].str.contains(country)) & (n_df_2['age group'].str.contains(age_list[2]))]


      bev_child.loc[:,'population'] = bev_work.loc[:,'population'].max() / bev_child.loc[:,'population'].max() 
      bev_child.loc[:,'con'] = country +'-'+new_list[0]
      bev_child.loc[:,'age group'] = new_list[0]
      s = n_df_2.append(bev_child, ignore_index=True)


      bev_child.loc[:,'population'] = bev_child.loc[:,'population'].max() + bev_old.loc[:,'population'].max()/ bev_work.loc[:,'population'].max() 
      bev_child.loc[:,'con'] = country +'-'+ new_list[2]
      bev_child.loc[:,'age group'] = new_list[2]

      s = s.append(bev_child, ignore_index=True)

      bev_child.loc[:,'population'] = bev_old.loc[:,'population'].max() / bev_work.loc[:,'population'].max() 
      bev_child.loc[:,'con'] = country +'-'+ new_list[1]
      bev_child.loc[:,'age group'] = new_list[1]

      s = s.append(bev_child, ignore_index=True)
s

The output is missing the UK rows:

   year              age group                       con  population
0  2019                 (0-14)                        UK        10.0
1  2019                (14-50)                        UK        20.0
2  2019                  (50+)                        UK       300.0
3  2020                 (0-14)                        US       400.0
4  2020                (14-50)                        US      1000.0
5  2020                  (50+)                        US      2000.0
6  2020         young vs child         US-young vs child         2.5
7  2020  unemployed vs working  US-unemployed vs working         4.5
8  2020           old vs young           US-old vs young         2.0

pandas

dataframe

loops

for-loop

python

1 Answer

Stack Overflow user

Accepted answer

Posted 2021-10-10 15:15:06

On every pass through the loop, s is reinitialized to a new DataFrame by this line:

s = n_df_2.append(bev_child, ignore_index=True)

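As a result, s ends up as the original contents of n_df_2 plus only the three rows appended during the last execution of the loop body. Rows appended on earlier passes are discarded, because n_df_2 itself is never updated.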
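The same rebinding problem can be shown without pandas. The following is only a minimal analogy using plain Python lists; the names base, acc, buggy, and fixed are made up for illustration and do not appear in the question.

```python
base = ["row0", "row1"]

# Buggy pattern: s is rebuilt from the unchanged base on every pass,
# so anything added during earlier passes is thrown away.
for country in ["UK", "US"]:
    s = base + [f"{country}-a"]
    s = s + [f"{country}-b"]
buggy = s

# Fixed pattern: keep extending the same accumulator across passes.
acc = list(base)
for country in ["UK", "US"]:
    acc = acc + [f"{country}-a"]
    acc = acc + [f"{country}-b"]
fixed = acc

print(buggy)  # ['row0', 'row1', 'US-a', 'US-b']
print(fixed)  # ['row0', 'row1', 'UK-a', 'UK-b', 'US-a', 'US-b']
```

Only the last country survives in the buggy version, which matches the symptom described in the question.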

I think this is closer to what you want (nothing before the loop changes):

for country in df.con.unique():

    bev_child = n_df_2[(n_df_2['con'].str.contains(country)) & (n_df_2['age group'].str.contains(age_list[0]))]
    bev_work = n_df_2[(n_df_2['con'].str.contains(country)) & (n_df_2['age group'].str.contains(age_list[1]))]
    bev_old = n_df_2[(n_df_2['con'].str.contains(country)) & (n_df_2['age group'].str.contains(age_list[2]))]

    bev_child.loc[:, 'population'] = bev_work.loc[:, 'population'].max() / bev_child.loc[:, 'population'].max()
    bev_child.loc[:, 'con'] = country + '-' + new_list[0]
    bev_child.loc[:, 'age group'] = new_list[0]
    n_df_2 = n_df_2.append(bev_child, ignore_index=True)

    bev_child.loc[:, 'population'] = bev_child.loc[:, 'population'].max() + bev_old.loc[:, 'population'].max() / bev_work.loc[:, 'population'].max()
    bev_child.loc[:, 'con'] = country + '-' + new_list[2]
    bev_child.loc[:, 'age group'] = new_list[2]
    n_df_2 = n_df_2.append(bev_child, ignore_index=True)

    bev_child.loc[:, 'population'] = bev_old.loc[:, 'population'].max() / bev_work.loc[:, 'population'].max()
    bev_child.loc[:, 'con'] = country + '-' + new_list[1]
    bev_child.loc[:, 'age group'] = new_list[1]
    n_df_2 = n_df_2.append(bev_child, ignore_index=True)

print(n_df_2)

Output:

    year              age group                       con  population
0   2019                 (0-14)                        UK        10.0
1   2019                (14-50)                        UK        20.0
2   2019                  (50+)                        UK       300.0
3   2020                 (0-14)                        US       400.0
4   2020                (14-50)                        US      1000.0
5   2020                  (50+)                        US      2000.0
6   2019         young vs child         UK-young vs child         2.0
7   2019  unemployed vs working  UK-unemployed vs working        17.0
8   2019           old vs young           UK-old vs young        15.0
9   2020         young vs child         US-young vs child         2.5
10  2020  unemployed vs working  US-unemployed vs working         4.5
11  2020           old vs young           US-old vs young         2.0

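Note that this iterates only over the unique values in df.con, so the loop body runs just twice. Each pass adds three records to the output. Also note that the output is appended to n_df_2 itself, so the variable s is no longer needed.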
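One caveat for newer environments: DataFrame.append was deprecated in pandas 1.4 and removed in pandas 2.0, so the loop above fails there. Below is a hedged sketch of the same idea using pd.concat. The helper name add_ratio_rows and the variable result are hypothetical, and the "unemployed vs working" value deliberately mirrors the arithmetic of the loop above (the sum of the two ratios) rather than guessing at a different intended formula.

```python
import pandas as pd

df = pd.DataFrame({
    "year": [2019, 2019, 2019, 2020, 2020, 2020],
    "age group": ["(0-14)", "(14-50)", "(50+)", "(0-14)", "(14-50)", "(50+)"],
    "con": ["UK", "UK", "UK", "US", "US", "US"],
    "population": [10, 20, 300, 400, 1000, 2000],
})


def add_ratio_rows(frame):
    """Return frame plus three ratio rows per country (hypothetical helper)."""
    new_rows = []
    # sort=False keeps the countries in their order of first appearance.
    for country, group in frame.groupby("con", sort=False):
        # Look up each age band's population by its exact label.
        pop = group.set_index("age group")["population"]
        child, work, old = pop["(0-14)"], pop["(14-50)"], pop["(50+)"]
        year = int(group["year"].iloc[0])

        young_vs_child = work / child
        old_vs_young = old / work
        # Mirrors the loop above, which adds the two ratios together.
        unemployed_vs_working = young_vs_child + old_vs_young

        for label, value in [
            ("young vs child", young_vs_child),
            ("unemployed vs working", unemployed_vs_working),
            ("old vs young", old_vs_young),
        ]:
            new_rows.append({
                "year": year,
                "age group": label,
                "con": f"{country}-{label}",
                "population": float(value),
            })

    # Build all new rows first, then concatenate once at the end.
    return pd.concat([frame, pd.DataFrame(new_rows)], ignore_index=True)


result = add_ratio_rows(df)
print(result)
```

Collecting the new rows in a list and concatenating once avoids copying the whole DataFrame on every pass, and exact label lookups avoid treating strings such as '(50+)' as regular expressions, which str.contains does by default.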

Votes: 1

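Original page content provided by Stack Overflow; the Chinese translation of the page was provided by Tencent Cloud's Xiaowei IT-domain engine.

Original link: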

https://stackoverflow.com/questions/69515980
