I have data like this:
date apps long
10/22/2013 23:51 A 2
10/22/2013 23:52 B 3
10/22/2013 23:52 C 1
10/23/2013 7:03 C 5
10/23/2013 7:13 A 1
10/23/2013 7:31 B 4
10/23/2013 7:31 A 5
10/23/2013 7:31 B 2
10/24/2013 0:54 B 3
10/24/2013 1:16 C 2
10/24/2013 1:16 C 1
10/24/2013 3:27 A 2
10/24/2013 7:30 A 3
10/24/2013 7:30 A 1
My problem: I want to sum the time spent per day on each of the apps A, B, and C. So the output would look something like:
A 10/22/2013 2
A 10/23/2013 6
A 10/24/2013 6
etc...
I've tried a few approaches, but none of them worked. Thanks.
Answered 2014-07-02 06:04:24
First, I'll assume your data.frame is named dd. Here it is in copy/paste-able form.
dd <- structure(list(date = structure(c(1L, 2L, 2L, 3L, 4L, 5L, 5L,
5L, 6L, 7L, 7L, 8L, 9L, 9L), .Label = c("10/22/2013 23:51", "10/22/2013 23:52",
"10/23/2013 7:03", "10/23/2013 7:13", "10/23/2013 7:31", "10/24/2013 0:54",
"10/24/2013 1:16", "10/24/2013 3:27", "10/24/2013 7:30"), class = "factor"),
apps = structure(c(1L, 2L, 3L, 3L, 1L, 2L, 1L, 2L, 2L, 3L,
3L, 1L, 1L, 1L), .Label = c("A", "B", "C"), class = "factor"),
long = c(2L, 3L, 1L, 5L, 1L, 4L, 5L, 2L, 3L, 2L, 1L, 2L,
3L, 1L)), .Names = c("date", "apps", "long"), class = "data.frame", row.names = c(NA,
-14L))
You should convert the dates to proper date-time values.
dd$date <- as.POSIXct(as.character(dd$date), format="%m/%d/%Y %H:%M", tz="GMT")
Then you can use aggregate to build a nice summary, here using as.Date to drop the time of day.
aggregate(long ~ as.Date(date) + apps, dd, FUN=sum)
This returns
as.Date(date) apps long
1 2013-10-22 A 2
2 2013-10-23 A 6
3 2013-10-24 A 6
4 2013-10-22 B 3
5 2013-10-23 B 6
6 2013-10-24 B 3
7 2013-10-22 C 1
8 2013-10-23 C 5
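9 2013-10-24 C 3
Answered 2014-07-02 06:02:39
I'm pretty sure this is a duplicate of something, but my first three searches came up empty, so here goes:
# dat is the data frame from the question, with date still stored as text
# (character or factor) in month/day/year form.
tapply(dat$long,
       list(dt = format(as.POSIXct(as.character(dat$date),
                                   format = "%m/%d/%Y %H:%M", tz = "GMT"),
                        "%m/%d/%Y"),
            grp = dat$apps),
       sum)
Answered 2014-07-02 06:50:34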
Using the dd from the first answer with dplyr (before its date column is converted, so the dates are still text):
library(dplyr)
dd%>%
group_by(apps, date=gsub("\\s+.*","",date))%>%
summarize(long=sum(long))
# apps date long
# 1 A 10/22/2013 2
# 2 A 10/23/2013 6
# 3 A 10/24/2013 6
# 4 B 10/22/2013 3
# 5 B 10/23/2013 6
# 6 B 10/24/2013 3
# 7 C 10/22/2013 1
# 8 C 10/23/2013 5
# 9 C 10/24/2013 3
https://stackoverflow.com/questions/24523528
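One more variation, offered only as a rough sketch: if the date column is still plain text, as.Date can parse just the day part directly, since it reads only as much of each string as the format needs and ignores the trailing time. Grouping on a real Date also keeps the days in chronological order. The names dd_demo and day below are made up for illustration, and the data is a small hand-typed subset of the question's rows.

```r
# Hypothetical stand-in for the question's data (a few rows only, dates as text).
dd_demo <- data.frame(
  date = c("10/22/2013 23:51", "10/22/2013 23:52", "10/23/2013 7:13",
           "10/23/2013 7:31", "10/23/2013 7:31"),
  apps = c("A", "B", "A", "B", "A"),
  long = c(2L, 3L, 1L, 4L, 5L),
  stringsAsFactors = FALSE
)

# as.Date() processes each string only as far as the format requires,
# so the time of day after the date is ignored.
dd_demo$day <- as.Date(dd_demo$date, format = "%m/%d/%Y")

# Sum `long` for every combination of day and app.
res <- aggregate(long ~ day + apps, data = dd_demo, FUN = sum)
print(res)
```

If you need the original 10/22/2013 style in the output, format(res$day, "%m/%d/%Y") should convert it back to text after aggregating.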