
How to break a datetime or interval object into one row per minute in R

Stack Overflow user
Asked 2019-06-15 10:31:02
2 answers, 226 views, 0 followers, 0 votes

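I have a dataset with a datetime (start) column and a datetime_end column. After some data manipulation, I want to break each interval down into one row per minute. Suppose I have this interval: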

datetime                datetime_end          id   disc
2019-03-19 12:47:28     2019-03-19 12:50:37   5-3 start

I want to split it into minutes, like this:

    datetime                  id   disc
2019-03-19 12:48:00           5-3 start
2019-03-19 12:49:00           5-3 start
2019-03-19 12:50:00           5-3 start
2019-03-19 12:51:00           5-3 start

Here is some sample data:

df1 <- data.frame(stringsAsFactors=FALSE,
                  datetime = c("2019-03-19T13:26:52Z", "2019-03-19T13:26:19Z",
                               "2019-03-19T13:23:46Z", "2019-03-19T13:22:20Z",
                               "2019-03-19T13:09:56Z", "2019-03-19T13:06:04Z", "2019-03-19T13:05:21Z",
                               "2019-03-19T13:04:37Z", "2019-03-19T12:47:28Z",
                               "2019-03-19T12:46:42Z"),
                  id = c("5-3", "5-3", "5-3", "5-3", "5-3", "5-3", "5-3", "5-3", "5-3",
                         "5-3"),
                  disc = c("car", "stop", "start", "stop", "start", "stop", "start",
                           "stop", "start", "stop")
)

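I tried using lubridate::interval to create an interval object (the travel interval), but I am struggling to break it down into one row per minute (as shown above). If anyone knows a solution, I would really appreciate it.

Here is my script: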

library(tidyverse)
library(lubridate)

df <- df1 %>%
  mutate(datetime = lubridate::as_datetime(datetime)) %>%
  arrange(datetime) %>%
  mutate(datetime_end = lead(datetime),
         # Create an interval object.
         Travel_Interval =
           lubridate::interval(start = datetime, end = datetime_end)) %>%
  filter(!is.na(Travel_Interval)) %>%
  # select(-Travel_Interval)
  select(datetime, datetime_end, id, disc, Travel_Interval) %>%
  filter(disc == "start")

2 Answers

Stack Overflow user

Accepted answer

Answered 2019-06-15 11:04:44

I would use purrr::map2() for this:

library(tidyverse)
library(lubridate)

# Parse datetime, sort chronologically, and use the next row's datetime as
# this row's datetime_end; drop the last row, which has no datetime_end.
# purrr::map2() then walks each (datetime, datetime_end) pair and builds a
# sequence of timestamps in one-minute steps, from the "minute ceiling" of
# datetime to the "minute ceiling" of datetime_end. The resulting column is a
# list, so it has to be unnested.
df <- df1 %>%
  mutate(datetime = as_datetime(datetime)) %>%
  arrange(datetime) %>%
  mutate(datetime_end = lead(datetime, n = 1L)) %>%
  filter(!is.na(datetime_end)) %>%
  mutate(minute = purrr::map2(datetime, datetime_end, function(start, stop) {
    seq.POSIXt(from = ceiling_date(start, 'minute'), to = ceiling_date(stop, 'minute'), by = 'min')
  })) %>%
  # Name the list column explicitly; newer tidyr versions warn on a bare unnest().
  unnest(minute)
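
To see what the function inside map2() does for one row, here is a minimal sketch applied to the single interval from the question (12:47:28 to 12:50:37). The variable names start, end and minutes_covered are only illustrative, and the timestamps are assumed to be UTC.

```r
library(lubridate)

# The single interval from the question, parsed as UTC (assumption).
start <- ymd_hms("2019-03-19 12:47:28", tz = "UTC")
end   <- ymd_hms("2019-03-19 12:50:37", tz = "UTC")

# Round both endpoints up to the next whole minute, then step by one minute.
minutes_covered <- seq(
  from = ceiling_date(start, unit = "minute"),
  to   = ceiling_date(end, unit = "minute"),
  by   = "min"
)

format(minutes_covered, "%H:%M:%S", tz = "UTC")
# "12:48:00" "12:49:00" "12:50:00" "12:51:00"
```

These are the four rows shown in the desired output of the question.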

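Note, however, that because you are effectively truncating the timestamps to minute intervals with some form of rounding (here, the ceiling), you will have to decide how to handle boundary cases. For example, the first disc == "stop" run ends with minute == 2019-03-19 12:48:00, but the subsequent disc == "start" run also begins with minute == 2019-03-19 12:48:00: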

              datetime  id  disc        datetime_end              minute
1  2019-03-19 12:46:42 5-3  stop 2019-03-19 12:47:28 2019-03-19 12:47:00
2  2019-03-19 12:46:42 5-3  stop 2019-03-19 12:47:28 2019-03-19 12:48:00
3  2019-03-19 12:47:28 5-3 start 2019-03-19 13:04:37 2019-03-19 12:48:00
4  2019-03-19 12:47:28 5-3 start 2019-03-19 13:04:37 2019-03-19 12:49:00
5  2019-03-19 12:47:28 5-3 start 2019-03-19 13:04:37 2019-03-19 12:50:00
6  2019-03-19 12:47:28 5-3 start 2019-03-19 13:04:37 2019-03-19 12:51:00
7  2019-03-19 12:47:28 5-3 start 2019-03-19 13:04:37 2019-03-19 12:52:00
8  2019-03-19 12:47:28 5-3 start 2019-03-19 13:04:37 2019-03-19 12:53:00
9  2019-03-19 12:47:28 5-3 start 2019-03-19 13:04:37 2019-03-19 12:54:00
10 2019-03-19 12:47:28 5-3 start 2019-03-19 13:04:37 2019-03-19 12:55:00
11 2019-03-19 12:47:28 5-3 start 2019-03-19 13:04:37 2019-03-19 12:56:00
12 2019-03-19 12:47:28 5-3 start 2019-03-19 13:04:37 2019-03-19 12:57:00
13 2019-03-19 12:47:28 5-3 start 2019-03-19 13:04:37 2019-03-19 12:58:00
14 2019-03-19 12:47:28 5-3 start 2019-03-19 13:04:37 2019-03-19 12:59:00
15 2019-03-19 12:47:28 5-3 start 2019-03-19 13:04:37 2019-03-19 13:00:00
16 2019-03-19 12:47:28 5-3 start 2019-03-19 13:04:37 2019-03-19 13:01:00
17 2019-03-19 12:47:28 5-3 start 2019-03-19 13:04:37 2019-03-19 13:02:00
18 2019-03-19 12:47:28 5-3 start 2019-03-19 13:04:37 2019-03-19 13:03:00
19 2019-03-19 12:47:28 5-3 start 2019-03-19 13:04:37 2019-03-19 13:04:00
20 2019-03-19 12:47:28 5-3 start 2019-03-19 13:04:37 2019-03-19 13:05:00
21 2019-03-19 13:04:37 5-3  stop 2019-03-19 13:05:21 2019-03-19 13:05:00
22 2019-03-19 13:04:37 5-3  stop 2019-03-19 13:05:21 2019-03-19 13:06:00
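
One possible way to settle the boundary case, sketched under an assumed policy that the earlier interval keeps a shared minute, is to drop repeated (id, minute) pairs after unnesting. The names runs, per_minute and deduped are hypothetical, and the two intervals are copied from rows 1 to 20 above; a different policy (for example, giving the minute to the later interval) may suit your data better.

```r
library(dplyr)
library(tidyr)
library(purrr)
library(lubridate)

# Two adjacent intervals taken from the output above.
runs <- tibble(
  datetime     = ymd_hms(c("2019-03-19 12:46:42", "2019-03-19 12:47:28")),
  datetime_end = ymd_hms(c("2019-03-19 12:47:28", "2019-03-19 13:04:37")),
  id           = "5-3",
  disc         = c("stop", "start")
)

# Same ceiling-to-ceiling expansion as in the answer.
per_minute <- runs %>%
  mutate(minute = map2(datetime, datetime_end, function(start, stop) {
    seq(ceiling_date(start, "minute"), ceiling_date(stop, "minute"), by = "min")
  })) %>%
  unnest(minute)

# 12:48:00 appears twice: once closing the "stop" run, once opening "start".
sum(per_minute$minute == ymd_hms("2019-03-19 12:48:00"))

# Assumed policy: sort by interval start and keep the first row per (id, minute),
# so the earlier interval keeps the shared minute.
deduped <- per_minute %>%
  arrange(datetime) %>%
  distinct(id, minute, .keep_all = TRUE)
```

With this policy the 12:48:00 row stays with the "stop" run and the "start" run begins at 12:49:00.
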
Votes: 2

Stack Overflow user

Answered 2019-06-15 11:33:30

df1 %>% 
  mutate(datetime = lubridate::as_datetime(datetime)) %>% 
  arrange(datetime) %>% 
  mutate(datetime_end = lead(datetime)) %>%
  filter(!is.na(datetime_end)) %>%
  mutate_at(vars(contains("datetime")), ~ round_date(.x + seconds(30), unit = "minute")) %>%
  mutate(diff = time_length(interval(datetime, datetime_end), unit = "minutes")) %>%
  mutate(time = map2(datetime, diff, ~ .x + minutes(seq(0, .y)))) %>%
  unnest(time)

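I just wanted to post this since I was already writing it, even though there is already a good answer. It uses the lubridate functions time_length and interval to get the sequence.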
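
As a small illustration of those two calls, the sketch below measures one already-rounded interval in minutes and then rebuilds the per-minute timestamps from that length. The names start, end, n_minutes and per_min are only illustrative.

```r
library(lubridate)

# Endpoints that are already on whole minutes, as after the rounding step.
start <- ymd_hms("2019-03-19 12:48:00")
end   <- ymd_hms("2019-03-19 12:51:00")

# interval() spans the two instants; time_length() measures the span in minutes.
n_minutes <- time_length(interval(start, end), unit = "minutes")
n_minutes
# 3

# Adding 0, 1, ..., n_minutes whole minutes to the start gives one timestamp per minute.
per_min <- start + minutes(seq(0, n_minutes))
per_min
```

Because the sequence starts at 0 and includes n_minutes, it yields n_minutes + 1 timestamps, here 12:48:00 through 12:51:00.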

Votes: 1
Original page content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/56609502
