Dynamically adding a column to a Python Pandas DataFrame

Stack Overflow user
Asked on 2018-06-01 04:13:33

My data frame looks like this:

    Date        Country     GDP
0   2011  United States   345.0
1   2012  United States     0.0
2   2013  United States   457.0
3   2014  United States   577.0
4   2015  United States     0.0
5   2016  United States   657.0
6   2011             UK    35.0
7   2012             UK    64.0
8   2013             UK    54.0
9   2014             UK    67.0
10  2015             UK   687.0
11  2016             UK     0.0
12  2011          China    34.0
13  2012          China    54.0
14  2013          China   678.0
15  2014          China   355.0
16  2015          China  5678.0
17  2016          China   345.0

For each year, I want to compute each country's share of the combined GDP of all three countries, and add it to the dataframe as a new column named perc.

I wrote the following code:

import pandas as pd
countrylist=['United States','UK','China']
for country in countrylist:
    for year in range (2011,2016):      
        df['perc']=(df['GDP'][(df['Country']==country) & (df['Date']==year)]).astype(float)/df['GDP'][df['Date']==year].sum()
        print (df['perc'])

My output is as follows:

    0     0.833333
    1          NaN
    2          NaN
    3          NaN
    4          NaN
    5          NaN
    6          NaN
    7          NaN
    8          NaN
    9          NaN
    10         NaN
    11         NaN
    12         NaN
    13         NaN
    14         NaN
    15         NaN
    16         NaN
    17         NaN
    0     NaN
    1     0.0
    2     NaN
    3     NaN
    4     NaN
    5     NaN
    6     NaN
    7     NaN
    8     NaN
    9     NaN
    10    NaN
    11    NaN
    12    NaN
    13    NaN
    14    NaN
    15    NaN
    16    NaN
    17    NaN
0          NaN
1          NaN
2     0.384357
3          NaN
4          NaN
5          NaN
6          NaN
7          NaN
8          NaN
9          NaN
10         NaN
11         NaN
12         NaN
13         NaN
14         NaN
15         NaN
16         NaN
17         NaN

……

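I noticed that each new loop iteration wipes out my previous result, so in the end only the last perc value remains. I think I need to give some positional information when assigning to df['perc'], for example: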

df['perc'][([(df['Country']==country) & (df['Date']==year)])]=(df['GDP'][(df['Country']==country) & (df['Date']==year)]).astype(float)/df['GDP'][df['Date']==year].sum()

But that does not work. How can I insert the values dynamically?

Ideally, I should end up with:

    Date        Country     GDP    perc
0   2011  United States   345.0    0.81
1   2012  United States     0.0    0.0
2   2013  United States   457.0    0.23
3   2014  United States   577.0    xx
4   2015  United States     0.0    xx
5   2016  United States   657.0    xx
6   2011             UK    35.0    xx
7   2012             UK    64.0    xx
8   2013             UK    54.0    xx
9   2014             UK    67.0    xx
10  2015             UK   687.0    xx
11  2016             UK     0.0    xx
12  2011          China    34.0    xx
13  2012          China    54.0    xx
14  2013          China   678.0    xx
15  2014          China   355.0    xx
16  2015          China  5678.0    xx
17  2016          China   345.0    xx

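1 Answer

Stack Overflow user

Answered on 2018-06-01 04:18:46

You can use groupby with transform('sum') here, which returns each year's total GDP aligned to the original rows: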

df.GDP/df.groupby('Date').GDP.transform('sum')
Out[161]: 
0     0.833333
1     0.000000
2     0.384357
3     0.577578
4     0.000000
5     0.655689
6     0.084541
7     0.542373
8     0.045416
9     0.067067
10    0.107934
11    0.000000
12    0.082126
13    0.457627
14    0.570227
15    0.355355
16    0.892066
17    0.344311
Name: GDP, dtype: float64
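
To get the column the question asks for, the ratio above can be assigned back to the frame. The sketch below rebuilds the question's data and stores the result as `perc`; for comparison it also shows one way the original loop could be made to work, by writing through `.loc` with a boolean mask instead of overwriting the whole column on each pass. The `perc_loop` column and the `mask` and `year_total` names are illustrative and not part of the original post.

```python
import pandas as pd

# Rebuild the data frame from the question (values copied from the post).
df = pd.DataFrame({
    "Date": [2011, 2012, 2013, 2014, 2015, 2016] * 3,
    "Country": ["United States"] * 6 + ["UK"] * 6 + ["China"] * 6,
    "GDP": [345.0, 0.0, 457.0, 577.0, 0.0, 657.0,
            35.0, 64.0, 54.0, 67.0, 687.0, 0.0,
            34.0, 54.0, 678.0, 355.0, 5678.0, 345.0],
})

# Vectorized version: divide each row's GDP by the total GDP of its year.
df["perc"] = df["GDP"] / df.groupby("Date")["GDP"].transform("sum")

# Loop version (illustrative only): create the column once, then use .loc
# with a boolean mask so each pass writes only the matching rows.
df["perc_loop"] = float("nan")
for country in df["Country"].unique():
    for year in df["Date"].unique():
        mask = (df["Country"] == country) & (df["Date"] == year)
        year_total = df.loc[df["Date"] == year, "GDP"].sum()
        df.loc[mask, "perc_loop"] = df.loc[mask, "GDP"] / year_total

print(df)
```

Both columns should hold the same values. The vectorized form is usually preferable because it needs no explicit loop and covers every year present in the data, including 2016, which `range(2011, 2016)` in the question skips.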
Original page content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/50632087
