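I have a dataset of customer expenses by date. I want the sum and the average of the expenses over the last three months, computed for each customer from that customer's own visit dates. How can I do this in R?
Here is the dataset: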
library(tidyverse)
library(lubridate)
name <- c('Mary','Sue','Peter','Mary','Mary','John','Sue',
          'Peter','Peter','John','John','John','Mary','Mary',
          'John','Mary','Peter','Sue')
date <- c('01/04/2018','03/02/2017','01/01/2019','24/04/2017',
          '02/03/2019','31/05/2019','08/09/2019','17/12/2019',
          '02/08/2017','10/11/2017','30/12/2017','18/02/2018',
          '18/02/2018','18/10/2019','30/04/2019','18/09/2019',
          '17/11/2019','08/08/2019')
expense <- c('300','450','550','980',
             '787','300','2343','233',
             '932','44','332','432',
             '786','345','567','290','345','876')
data <- data.frame(name,
                   date = lubridate::dmy(date),
                   expense)
Answered 2020-05-15 17:16:58
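Treating 3 months as 90 days, we can subtract 90 days from each customer's max date and take the mean of expense only for the dates that fall within that range.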
library(dplyr)

# expense must already be numeric here (see the Data note below)
data %>%
  group_by(name) %>%
  summarise(last_3_month_expense = mean(expense[date > max(date) - 90], na.rm = TRUE),
            mean_expense = mean(expense, na.rm = TRUE))
Data
Read the expense data as numeric rather than as factor/character.
data$expense <- as.numeric(as.character(data$expense))
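If a calendar-based window is preferred over the 90-day approximation, one possible variant is sketched below. It is not part of the answer above: it assumes lubridate's `%m-%` operator for month arithmetic, adds a sum alongside the mean, and uses a small made-up data frame (`visits`) rather than the question's data.

```r
library(dplyr)
library(lubridate)

# Hypothetical sample data, for illustration only
visits <- data.frame(
  name    = c("Ann", "Ann", "Ann", "Bob", "Bob"),
  date    = dmy(c("01/01/2019", "15/03/2019", "31/05/2019",
                  "10/02/2019", "20/02/2019")),
  expense = c(100, 200, 300, 50, 70)
)

result <- visits %>%
  group_by(name) %>%
  summarise(
    # keep rows dated after (latest visit minus 3 calendar months);
    # %m-% rolls back to the last valid day of the month when needed
    last_3_month_sum  = sum(expense[date > max(date) %m-% months(3)], na.rm = TRUE),
    last_3_month_mean = mean(expense[date > max(date) %m-% months(3)], na.rm = TRUE)
  )

print(result)
```

The cutoff is strict (`>`), so a visit falling exactly three months before the latest one is excluded; switch to `>=` if it should count.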
Answered 2020-05-15 16:22:36
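We arrange by 'name' and 'date', convert 'expense' to numeric, and then, grouped by 'name', compute the sum of the last 3 values of 'expense' and the mean of 'expense' (assuming there is only a single data point per month).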
library(dplyr)

data %>%
  arrange(name, date) %>%
  mutate(expense = as.numeric(as.character(expense))) %>%
  group_by(name) %>%
  summarise(last_three = sum(tail(expense, 3), na.rm = TRUE),
            average_expense = mean(expense, na.rm = TRUE))
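When a customer can have more than one visit in a month, the one-data-point-per-month assumption no longer holds. A possible workaround, sketched here under that assumption, is to collapse the rows to monthly totals first and only then take the last three. It uses lubridate's `floor_date()` and a made-up data frame (`visits`); note that it picks the last three months that contain data, which are not necessarily three consecutive calendar months.

```r
library(dplyr)
library(lubridate)

# Hypothetical sample data: Ann has two visits in April
visits <- data.frame(
  name    = c("Ann", "Ann", "Ann", "Ann", "Ann", "Bob"),
  date    = dmy(c("05/01/2019", "10/02/2019", "12/03/2019",
                  "03/04/2019", "20/04/2019", "15/02/2019")),
  expense = c(100, 200, 300, 400, 50, 70)
)

monthly_last_three <- visits %>%
  mutate(month = floor_date(date, "month")) %>%    # first day of each visit's month
  group_by(name, month) %>%
  summarise(expense = sum(expense, na.rm = TRUE)) %>%  # one row per name and month, still grouped by name
  arrange(name, month) %>%
  summarise(last_three = sum(tail(expense, 3), na.rm = TRUE))

print(monthly_last_three)
```

For Ann, taking the last three raw rows would give 300 + 400 + 50, whereas the monthly version adds February, March, and the combined April total.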
https://stackoverflow.com/questions/61830314