我有一个包含日期、证监会、字符串和数字数据的data.table,这些数据反映了事件持续的时间跨度,我想添加一个列来反映每个事件出现的周数。当我在事件发生的每一天都要排一排的时候,这个方法就起作用了,但现在不行了。我怀疑这是因为开始/结束日期列中缺少值。
下面是我的数据的一个例子:
#Required packages
require(data.table)
require(sf)
#Data
ID<-c(1:5)
lon<-c(-86.9655633, -85.8316318, -85.8316318, -82.8316318,-81.8316318)
lat<-c(34.5967589, 33.6598257, 33.6598257, 31.6598257, 32.6598257)
Country<-c("Malaysia", "Malayisia", "Singapore", "Thailand", "Thailand")
City<-c("Penang", "Malacca", "Singapore", "Bangkok", "Chiang Mai")
area_km2<-c(1048, 277, 728, 1569, 40)
start_date<-as.Date(c("2000-01-31", "2000-09-21", "2001-03-17", NA, NA))
end_date<-as.Date(c("2000-02-14", "2000-10-12", "2001-05-27", NA, NA))
samp<-data.table(ID, lon, lat, Country, City, area_km2, start_date, end_date)
samp <- st_as_sf(samp, coords = c("lon", "lat"), na.fail=F)我想要的基本上是每周有一排的事件发生。如果事件是从2000年2月3日(Thur)到2000年2月8日(Tue),我想从1月31日开始的一周内安排两排,另一次是从2月7日开始。
samp = samp[, list(
`Week Starting` = if (!is.na(start_date) & !is.na(end_date) & start_date<=end_date)
{seq.Date(start_date, end_date, by = 'week')}
else {as.Date(NA)}),
by = list(ID, lon, lat, Country, City, area_km2)] 这有点复杂,但实际上我正试图为所有的城市和星期建立一个小组,即使在那些时间/地点没有发生任何事件。让我知道我怎样才能做得更好。谢谢!
发布于 2020-12-28 13:41:35
一种建议是在一个辅助表中创建几个星期,并使用第一个表对其进行merge。
备注:
tidyr::unnest,因为它目前是管道这一部分最好的(也是唯一的)工具。我认为data.table开发人员仍在考虑/讨论是否和如何处理(Un)嵌套,所以在此之前我将使用这个。虽然它确实产生了一个tbl而不是一个data.table,但是我们仍然可以毫无问题地使用它。这是在sf::st_as_sf.之前完成的
weeks <- tidyr::unnest(
samp[!is.na(start_date), .(date = Map(seq.Date, start_date, end_date, by = "week")), by = .(ID) ],
date
)
weeks
# # A tibble: 18 x 2
# ID date
# <int> <date>
# 1 1 2000-01-31
# 2 1 2000-02-07
# 3 1 2000-02-14
# 4 2 2000-09-21
# 5 2 2000-09-28
# 6 2 2000-10-05
# 7 2 2000-10-12
# 8 3 2001-03-17
# 9 3 2001-03-24
# 10 3 2001-03-31
# 11 3 2001-04-07
# 12 3 2001-04-14
# 13 3 2001-04-21
# 14 3 2001-04-28
# 15 3 2001-05-05
# 16 3 2001-05-12
# 17 3 2001-05-19
# 18 3 2001-05-26然后简单地左转-join/merge它们:
merge(samp, weeks, all.x = TRUE, by = "ID")
# ID lon lat Country City area_km2 start_date end_date date
# 1: 1 -86.96556 34.59676 Malaysia Penang 1048 2000-01-31 2000-02-14 2000-01-31
# 2: 1 -86.96556 34.59676 Malaysia Penang 1048 2000-01-31 2000-02-14 2000-02-07
# 3: 1 -86.96556 34.59676 Malaysia Penang 1048 2000-01-31 2000-02-14 2000-02-14
# 4: 2 -85.83163 33.65983 Malayisia Malacca 277 2000-09-21 2000-10-12 2000-09-21
# 5: 2 -85.83163 33.65983 Malayisia Malacca 277 2000-09-21 2000-10-12 2000-09-28
# 6: 2 -85.83163 33.65983 Malayisia Malacca 277 2000-09-21 2000-10-12 2000-10-05
# 7: 2 -85.83163 33.65983 Malayisia Malacca 277 2000-09-21 2000-10-12 2000-10-12
# 8: 3 -85.83163 33.65983 Singapore Singapore 728 2001-03-17 2001-05-27 2001-03-17
# 9: 3 -85.83163 33.65983 Singapore Singapore 728 2001-03-17 2001-05-27 2001-03-24
# 10: 3 -85.83163 33.65983 Singapore Singapore 728 2001-03-17 2001-05-27 2001-03-31
# 11: 3 -85.83163 33.65983 Singapore Singapore 728 2001-03-17 2001-05-27 2001-04-07
# 12: 3 -85.83163 33.65983 Singapore Singapore 728 2001-03-17 2001-05-27 2001-04-14
# 13: 3 -85.83163 33.65983 Singapore Singapore 728 2001-03-17 2001-05-27 2001-04-21
# 14: 3 -85.83163 33.65983 Singapore Singapore 728 2001-03-17 2001-05-27 2001-04-28
# 15: 3 -85.83163 33.65983 Singapore Singapore 728 2001-03-17 2001-05-27 2001-05-05
# 16: 3 -85.83163 33.65983 Singapore Singapore 728 2001-03-17 2001-05-27 2001-05-12
# 17: 3 -85.83163 33.65983 Singapore Singapore 728 2001-03-17 2001-05-27 2001-05-19
# 18: 3 -85.83163 33.65983 Singapore Singapore 728 2001-03-17 2001-05-27 2001-05-26
# 19: 4 -82.83163 31.65983 Thailand Bangkok 1569 <NA> <NA> <NA>
# 20: 5 -81.83163 32.65983 Thailand Chiang Mai 40 <NA> <NA> <NA>https://stackoverflow.com/questions/65476607
复制相似问题