我是熊猫的新手,在做了一些研究/经历了一些试验和errors...appreciate之后,我很难得到我想要的结果,谢谢!
假设我有一个包含一些贸易数据的数据,为了更容易地解释我想做什么,我自己编了这些数据:
| time | symbol | market | volume |
| ---------- | -------- | ------- | ------ |
| 2022-07-01 | btcusdt | Binance | 300 |
| 2022-07-01 | btcusdt | Gemini | 200 |
| 2022-07-01 | btcusdt | Huobi | 100 |
| 2022-07-02 | btcusdt | Binance | 400 |
| 2022-07-02 | btcusdt | Gemini | 350 |
| 2022-07-02 | btcusdt | Huobi | 100 |
| 2022-07-01 | ethusdt | Binance | 1000 |
| 2022-07-01 | ethusdt | Gemini | 2000 |
| 2022-07-02 | ethusdt | Binance | 1500 |
| 2022-07-02 | ethusdt | Gemini | 500 |
| 2022-07-01 | usdcusdt | Binance | 300 |
| 2022-07-01 | usdcusdt | Gemini | 1000 |
| 2022-07-01 | usdcusdt | Huobi | 2000 |
| 2022-07-02 | usdcusdt | Binance | 1500 |
| 2022-07-02 | usdcusdt | Gemini | 1500 |
| 2022-07-02 | usdcusdt | Huobi | 1500 |我想在原始数据中创建一个列"prod_vol“,它进行以下计算:
prod_vol = volume / total_volume_trade_on_that_exchange要获得每天的百分比,示例如下所示:
| time | symbol | market | volume | prod_vol |
| ---------- | -------- | ------- | ------ | ----------------------------------- |
| 2022-07-01 | btcusdt | Binance | 300 | = 300/(300+1000+300) * 100 = 18.75% |
| 2022-07-01 | btcusdt | Gemini | 200 | = 200/(200+2000+1000) * 100 = 6.25% |
| 2022-07-01 | btcusdt | Huobi | 100 | = 100/(100+2000) = 8.33% |
| 2022-07-02 | btcusdt | Binance | 400 | = 400/(400+1500+1500) = 11.76% |
| 2022-07-02 | btcusdt | Gemini | 350 | |
| 2022-07-02 | btcusdt | Huobi | 100 | |
| 2022-07-01 | ethusdt | Binance | 1000 | |
| 2022-07-01 | ethusdt | Gemini | 2000 | |
| 2022-07-02 | ethusdt | Binance | 1500 | |
| 2022-07-02 | ethusdt | Gemini | 500 | |
| 2022-07-01 | usdcusdt | Binance | 300 | |
| 2022-07-01 | usdcusdt | Gemini | 1000 | |
| 2022-07-01 | usdcusdt | Huobi | 2000 | |
| 2022-07-02 | usdcusdt | Binance | 1500 | |
| 2022-07-02 | usdcusdt | Gemini | 1500 | |
| 2022-07-02 | usdcusdt | Huobi | 1500 | |通过使用GroupBy和聚合函数创建一个新的数据框架,我能够获得分母(即每天不同交易所的总交易量),如下所示:
df_new = df.groupby(['time', 'market']).agg(volume=('volume', 'sum'))这给了我一个汇总表:
| time | market | volume |
| ---------- | ----- | -------------------------- |
| 2022-07-01 | Binance | = 300 + 1000 + 300 = 1600 |
| | Gemini | = 200 + 2000 + 1000 = 3200 |
| | Huobi | = 100 + 2000 = 2100 |
| 2022-07-02 | Binance | = 400 + 1500 + 1500 = 3400 |
| | Gemini | = 350 + 500 + 1500 = 2350 |
| | Huobi | = 100 + 1500 = 1600 |我很难使用来自新df_new的这些数据,并对原始df执行计算,以获得预期的输出,如第二个表所示.
有人知道我是怎么做到的吗?谢谢!
编辑:
print(df.to_dict())产出:
{'time': {0: '2022-07-01', 1: '2022-07-01', 2: '2022-07-01', 3: '2022-07-02', 4: '2022-07-02', 5: '2022-07-02', 6: '2022-07-01', 7: '2022-07-01', 8: '2022-07-02', 9: '2022-07-02', 10: '2022-07-01', 11: '2022-07-01', 12: '2022-07-01', 13: '2022-07-02', 14: '2022-07-02', 15: '2022-07-02'}, 'symbol': {0: 'btcusdt', 1: 'btcusdt', 2: 'btcusdt', 3: 'btcusdt', 4: 'btcusdt', 5: 'btcusdt', 6: 'ethusdt', 7: 'ethusdt', 8: 'ethusdt', 9: 'ethusdt', 10: 'usdcusdt', 11: 'usdcusdt', 12: 'usdcusdt', 13: 'usdcusdt', 14: 'usdcusdt', 15: 'usdcusdt'}, 'market': {0: 'Binance', 1: 'Gemini', 2: 'Huobi', 3: 'Binance', 4: 'Gemini', 5: 'Huobi', 6: 'Binance', 7: 'Gemini', 8: 'Binance', 9: 'Gemini', 10: 'Binance', 11: 'Gemini', 12: 'Huobi', 13: 'Binance', 14: 'Gemini', 15: 'Huobi'}, 'volume': {0: 300, 1: 200, 2: 100, 3: 400, 4: 350, 5: 100, 6: 1000, 7: 2000, 8: 1500, 9: 500, 10: 300, 11: 1000, 12: 2000, 13: 1500, 14: 1500, 15: 1500}}发布于 2022-07-16 21:00:14
IIUC
这就是你要找的吗?
df['mkt_vol'] = df.groupby(['market','time'])['volume'].transform('sum')
df['prod_vol'] = round(df['volume']/df['mkt_vol'] * 100, 2)
df或者,就一条线
df['prod_vol'] = round(df['volume']/df.groupby(['market','time'])['volume'].transform('sum') *100,2)
df time symbol market volume prod_vol
0 2022-07-01 btcusdt Binance 300 18.75
1 2022-07-01 btcusdt Gemini 200 6.25
2 2022-07-01 btcusdt Huobi 100 4.76
3 2022-07-02 btcusdt Binance 400 11.76
4 2022-07-02 btcusdt Gemini 350 14.89
5 2022-07-02 btcusdt Huobi 100 6.25
6 2022-07-01 ethusdt Binance 1000 62.50
7 2022-07-01 ethusdt Gemini 2000 62.50
8 2022-07-02 ethusdt Binance 1500 44.12
9 2022-07-02 ethusdt Gemini 500 21.28
10 2022-07-01 usdcusdt Binance 300 18.75
11 2022-07-01 usdcusdt Gemini 1000 31.25
12 2022-07-01 usdcusdt Huobi 2000 95.24
13 2022-07-02 usdcusdt Binance 1500 44.12
14 2022-07-02 usdcusdt Gemini 1500 63.83
15 2022-07-02 usdcusdt Huobi 1500 93.75https://stackoverflow.com/questions/73007115
复制相似问题