How to track the total transaction amount received by each account in the last 6 months?

Stack Overflow user
Asked 2020-09-02 14:10:40
Answers: 1, Views: 161, Followers: 0, Votes: 0

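Here is my transaction data. It shows transactions from the accounts in the from column to the accounts in the to column, along with date and amount information.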

data 

id          from    to          date        amount  
<int>       <fctr>  <fctr>      <date>      <dbl>
19521       6644    6934        2005-01-01  700.0
19524       6753    8456        2005-01-01  600.0
19523       9242    9333        2005-01-01  1000.0
…           …       …           …           …
1056317     7819    7454        2010-12-31  60.2
1056318     6164    7497        2010-12-31  107.5
1056319     7533    7492        2010-12-31  164.1

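What I want to do is this: going through the transactions row by row, for every account in the from column, I want to track the transaction amount that account received in the last six months at the time the particular transaction was made, and save this information as a new column. The new column will therefore hold the total transaction amount that the account in the from column received during the six months up to the transaction date.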

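For example:

In the first row of the data, for account 6644, I should look at the to column and check whether 6644 received any transaction between "2004-07-05" and "2005-01-01", which is the 6-month period up to 2005-01-01, the date on which 6644 made the transaction. If 6644 received such transactions, I should sum them and add that sum as the value in the new column total_trx_amount_received_in_last_6month. Similarly, I should do the same for account 6753: find the transactions it received between "2004-07-05" and "2005-01-01", add them up, and put the result in the total_trx_amount_received_in_last_6month column. I should continue through the data row by row in this way.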

So, how can I do this for the whole data?

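PS: In the date interval "2004-07-05"-"2005-01-01", "2005-01-01" is the transaction date. To get the other date, "2004-07-05", I subtracted 180 days (approximately 6 months) from the transaction date "2005-01-01".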
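The window start described in the PS can be checked with plain Date arithmetic. This is a minimal sketch, assuming the 180-day window from the question; the variable names trx_date and window_start are made up for illustration.

```r
# Subtracting an integer from a Date moves it back by that many days.
trx_date <- as.Date("2005-01-01")
window_start <- trx_date - 180

print(window_start)
# [1] "2004-07-05"
```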

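To make this clearer, I am providing the following data:

I will also show what the output should look like. Suppose we only have these transactions. Only account 5370 needs to be considered here, because the other accounts (8605, 6390, 8934) do not receive any transactions in this sample.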

id          from    to          date        amount  total_trx_amount_received_in_last_6month 
<int>       <fctr>  <fctr>      <date>      <dbl>    <dbl>
18529       5370    9356        2005-05-31  24.4     0.0
13742       5370    5605        2005-08-05  7618.0   0.0
9913        5370    8567        2005-09-12  21971.0  0.0
956         8605    5370        2005-10-05  5245.0   0.0
2557        5370    5636        2005-11-12  2921.0   5245.0    
1602        6390    5370        2005-11-26  8000.0   0.0
18669       5370    8933        2005-11-30  169.2    (5245.0+8000.0)=13245
35900       5370    8483        2006-01-31  71.5     (5245.0+8000.0)=13245
48667       8934    5370        2006-03-31  14.6     0.0
51341       5370    7626        2006-04-11  4214.0   (8000.0+14.6)=8014.6

Here is what I did. First, note that the small data set above is sorted by date in ascending order.

In the first row, for account 5370 in the from column, I looked at the past data to see whether 5370 received any transaction between "2004-12-02" and "2005-05-31". Since the first row is the first transaction, 5370 obviously received nothing before "2005-05-31", so I recorded 0.0 in the total_trx_amount_received_in_last_6month column. In the second row, account 5370 in the from column likewise received no transaction between "2005-02-06" and "2005-08-05", so I recorded 0.0 again. Similarly, I recorded 0.0 in rows 3 and 4, for accounts 5370 and 8605 respectively. In row 5, account 5370 received one transaction between "2005-05-16" and "2005-11-12": it was received on "2005-10-05" (row 4 of the data) and its amount is 5245.0, so I recorded 5245.0 in the total_trx_amount_received_in_last_6month column. In row 6, account 6390 in the from column received no transaction between "2005-05-30" and "2005-11-26", so I recorded 0.0. All remaining rows are handled the same way.
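The manual walkthrough above can be sketched in base R. This is an illustrative reconstruction, not code from the question: the data frame name sample_trx and the helper received_in_window are hypothetical, and the sketch assumes both ends of the 180-day window are included.

```r
# The 10 sample rows from the question (from/to kept as character).
sample_trx <- data.frame(
  id     = c(18529L, 13742L, 9913L, 956L, 2557L,
             1602L, 18669L, 35900L, 48667L, 51341L),
  from   = c("5370", "5370", "5370", "8605", "5370",
             "6390", "5370", "5370", "8934", "5370"),
  to     = c("9356", "5605", "8567", "5370", "5636",
             "5370", "8933", "8483", "5370", "7626"),
  date   = as.Date(c("2005-05-31", "2005-08-05", "2005-09-12", "2005-10-05",
                     "2005-11-12", "2005-11-26", "2005-11-30", "2006-01-31",
                     "2006-03-31", "2006-04-11")),
  amount = c(24.4, 7618, 21971, 5245, 2921, 8000, 169.2, 71.5, 14.6, 4214),
  stringsAsFactors = FALSE
)

# Hypothetical helper: total amount received by `account` in the
# `window_days` days up to and including `on_date`.
received_in_window <- function(trx, account, on_date, window_days = 180) {
  in_window <- trx$to == account &
    trx$date >= on_date - window_days &
    trx$date <= on_date
  sum(trx$amount[in_window])  # sum of an empty selection is 0
}

# Apply the helper row by row, using each row's sender and date.
sample_trx$total_trx_amount_received_in_last_6month <- vapply(
  seq_len(nrow(sample_trx)),
  function(i) received_in_window(sample_trx, sample_trx$from[i], sample_trx$date[i]),
  numeric(1)
)

print(sample_trx)
```

On these ten rows the new column should match the expected output listed in the question: 0 for the first four rows, 5245 in row 5, 13245 in rows 7 and 8, and 8014.6 in row 10.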

dput() output of the data:

structure(list(id = c(18529L, 13742L, 9913L, 956L, 2557L, 1602L, 
18669L, 35900L, 48667L, 51341L, 53713L, 60126L, 60545L, 65113L, 
66783L, 83324L, 87614L, 88898L, 89874L, 94765L, 100277L, 101587L, 
103444L, 108414L, 113319L, 121516L, 126607L, 130170L, 131771L, 
135002L, 149431L, 157403L, 157645L, 158831L, 162597L, 162680L, 
163901L, 165044L, 167082L, 168562L, 168940L, 172578L, 173031L, 
173267L, 177507L, 179167L, 182612L, 183499L, 188171L, 189625L, 
193940L, 198764L, 199342L, 200134L, 203328L, 203763L, 204733L, 
205651L, 209672L, 210242L, 210979L, 214532L, 214741L, 215738L, 
216709L, 220828L, 222140L, 222905L, 226133L, 226527L, 227160L, 
228193L, 231782L, 232454L, 233774L, 237836L, 237837L, 238860L, 
240223L, 245032L, 246673L, 247561L, 251611L, 251696L, 252663L, 
254410L, 255126L, 255230L, 258484L, 258485L, 259309L, 259910L, 
260542L, 262091L, 264462L, 264887L, 264888L, 266125L, 268574L, 
272959L), from = c("5370", "5370", "5370", "8605", "5370", "6390", 
"5370", "5370", "8934", "5370", "5635", "6046", "5680", "8026", 
"9037", "5370", "7816", "8046", "5492", "8756", "5370", "9254", 
"5370", "5370", "7078", "6615", "5370", "9817", "8228", "8822", 
"5735", "7058", "5370", "8667", "9315", "6053", "7990", "8247", 
"8165", "5656", "9261", "5929", "8251", "5370", "6725", "5370", 
"6004", "7022", "7442", "5370", "8679", "6491", "7078", "5370", 
"5370", "5370", "5658", "5370", "9296", "8386", "5370", "5370", 
"5370", "9535", "5370", "7541", "5370", "9621", "5370", "7158", 
"8240", "5370", "5370", "8025", "5370", "5370", "5370", "6989", 
"5370", "7059", "5370", "5370", "5370", "9121", "5608", "5370", 
"5370", "7551", "5370", "5370", "5370", "5370", "9163", "9362", 
"6072", "5370", "5370", "5370", "5370", "5370"), to = c("9356", 
"5605", "8567", "5370", "5636", "5370", "8933", "8483", "5370", 
"7626", "5370", "5370", "5370", "5370", "5370", "9676", "5370", 
"5370", "5370", "5370", "9105", "5370", "9772", "6979", "5370", 
"5370", "7564", "5370", "5370", "5370", "5370", "5370", "8744", 
"5370", "5370", "5370", "5370", "5370", "5370", "5370", "5370", 
"5370", "5370", "7318", "5370", "8433", "5370", "5370", "5370", 
"7122", "5370", "5370", "5370", "8566", "6728", "9689", "5370", 
"8342", "5370", "5370", "5614", "5596", "5953", "5370", "7336", 
"5370", "7247", "5370", "7291", "5370", "5370", "6282", "7236", 
"5370", "8866", "8613", "9247", "5370", "6767", "5370", "9273", 
"7320", "9533", "5370", "5370", "8930", "9343", "5370", "9499", 
"7693", "7830", "5392", "5370", "5370", "5370", "7497", "8516", 
"9023", "7310", "8939"), date = structure(c(12934, 13000, 13038, 
13061, 13099, 13113, 13117, 13179, 13238, 13249, 13268, 13296, 
13299, 13309, 13314, 13391, 13400, 13404, 13409, 13428, 13452, 
13452, 13460, 13482, 13493, 13518, 13526, 13537, 13542, 13544, 
13596, 13616, 13617, 13626, 13633, 13633, 13639, 13642, 13646, 
13656, 13660, 13664, 13667, 13669, 13677, 13686, 13694, 13694, 
13707, 13716, 13725, 13738, 13739, 13746, 13756, 13756, 13756, 
13761, 13769, 13770, 13776, 13786, 13786, 13786, 13791, 13799, 
13806, 13813, 13817, 13817, 13817, 13822, 13829, 13830, 13836, 
13847, 13847, 13847, 13852, 13860, 13866, 13871, 13878, 13878, 
13878, 13882, 13883, 13883, 13887, 13887, 13888, 13889, 13890, 
13891, 13895, 13896, 13896, 13899, 13905, 13909), class = "Date"), 
    amount = c(24.4, 7618, 21971, 5245, 2921, 8000, 169.2, 71.5, 
    14.6, 4214, 14.6, 13920, 14.6, 24640, 1600, 261.1, 16400, 
    3500, 2700, 19882, 182, 14.6, 16927, 25653, 3059, 2880, 9658, 
    4500, 12480, 14.6, 1000, 3679, 34430, 12600, 14.6, 19.2, 
    4900, 826, 3679, 2100, 38000, 79, 11400, 21495, 3679, 200, 
    14.6, 100.6, 3679, 5300, 108.9, 3679, 2696, 7500, 171.6, 
    14.6, 99.2, 2452, 3679, 3218, 700, 69.7, 14.6, 91.5, 2452, 
    3679, 2900, 17572, 14.6, 14.6, 90.5, 2452, 49752, 3679, 1900, 
    14.6, 870, 85.2, 2452, 3679, 1600, 540, 14.6, 14.6, 79, 210, 
    2452, 28400, 720, 180, 420, 44289, 489, 3679, 840, 2900, 
    150, 870, 420, 14.6)), row.names = c(NA, -100L), class = "data.frame")

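(I converted the from and to columns to character because they have a large number of levels; otherwise the output would take up a lot of space.)
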

1 Answer

Stack Overflow user

Accepted answer

Answered 2020-09-05 04:12:03

We can use map2_dbl to sum the amount values that fall within the 6-month range.

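library(dplyr)
library(purrr)

data %>% 
    # for each row, .x is the sender (from) and .y is the transaction date
    mutate(amt = map2_dbl(from, date,
                # sum amounts received by .x in the 180 days up to and including .y
                ~sum(amount[to == .x & between(date, .y - 180, .y)])))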
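The map2_dbl call rescans every row for each transaction, which may become slow when the table is large. One possible alternative, which is not part of the accepted answer, is to precompute cumulative sums per receiving account and look up the window bounds with base R's findInterval. The function name rolling_received and the data frame name sample_trx are hypothetical, and the sketch assumes dates are whole days with an inclusive 180-day window.

```r
# Sample rows from the question, used only to exercise the sketch.
sample_trx <- data.frame(
  from   = c("5370", "5370", "5370", "8605", "5370",
             "6390", "5370", "5370", "8934", "5370"),
  to     = c("9356", "5605", "8567", "5370", "5636",
             "5370", "8933", "8483", "5370", "7626"),
  date   = as.Date(c("2005-05-31", "2005-08-05", "2005-09-12", "2005-10-05",
                     "2005-11-12", "2005-11-26", "2005-11-30", "2006-01-31",
                     "2006-03-31", "2006-04-11")),
  amount = c(24.4, 7618, 21971, 5245, 2921, 8000, 169.2, 71.5, 14.6, 4214),
  stringsAsFactors = FALSE
)

# Hypothetical helper: for every row, total amount received by the sender
# in the `window_days` days up to and including the row's date.
rolling_received <- function(trx, window_days = 180) {
  out <- numeric(nrow(trx))                        # accounts that never receive stay 0
  day <- as.numeric(trx$date)                      # whole days since 1970-01-01
  recv_rows <- split(seq_len(nrow(trx)), trx$to)   # row indices grouped by receiver
  send_rows <- split(seq_len(nrow(trx)), trx$from) # row indices grouped by sender

  for (acct in intersect(names(send_rows), names(recv_rows))) {
    r <- recv_rows[[acct]]
    r <- r[order(day[r])]                  # findInterval needs sorted dates
    cum_amt <- c(0, cumsum(trx$amount[r])) # cum_amt[k + 1] = sum of first k receipts
    s <- send_rows[[acct]]
    hi <- findInterval(day[s], day[r])                    # receipts dated <= transaction date
    lo <- findInterval(day[s] - window_days - 1, day[r])  # receipts dated before the window
    out[s] <- cum_amt[hi + 1] - cum_amt[lo + 1]
  }
  out
}

sample_trx$amt <- rolling_received(sample_trx)
print(sample_trx)
```

Because the totals come from differences of cumulative sums, they can differ from a direct sum by floating-point rounding, so comparing results with all.equal rather than == is the safer check.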
Votes: 1
Original page content provided by Stack Overflow. Translation support provided by the Tencent Cloud Xiaowei IT-domain engine.
Original link:

https://stackoverflow.com/questions/63707404
