Converting time-period observations to annual observations in R

Stack Overflow user
Asked on 2020-11-03 00:35:14
Answers: 1 | Views: 39 | Followers: 0 | Votes: 0

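I have a dataset (df1) of crises in several hundred countries, where each observation is a country-level crisis event with a start year and an end year. I also have the date the crisis was announced (in yyyy-mm-dd format) and a few other crisis characteristics.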

df1 <- data.frame(cbind(eventID=c(1,2,3,4), country=c("ALB","ALB","ARG","ARG"), start=c(1994, 1998, 1998, 1991), end=c(1996,1999,1999,1993), announcement=c("1994-11-01","1998-03-01","1998-07-01","1992-01-01"), x1=c(6,2,8,7), x2=c("a","q","k","b")))

eventID   country    start    end      announcement     x1      x2 
1         ALB        1994     1996     1994-11-01       6       a
2         ALB        1998     1999     1998-03-01       2       q
3         ARG        1998     1999     1998-07-01       8       k
4         ARG        1991     1993     1992-01-01       7       b

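I need to build df2, a country panel with one observation per year, running from the earliest "start" year to the latest "end" year. I want a dummy variable "crisis" that equals 1 for the years between "start" and "end" in df1 and 0 otherwise. I want "announcement" to hold the announcement date in the year the announcement was made and NA otherwise. I want the additional crisis characteristics, x1 and x2, to appear in their corresponding crisis years and to be NA otherwise.

I also need a row for every country in years when no country had a crisis (in df2: 1997).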

df2 <- data.frame(cbind(year=c(1991,1992,1993,1994,1995,1996,1997,1998,1999,1991,1992,1993,1994,1995,1996,1997,1998,1999), country=c("ALB","ALB","ALB","ALB","ALB","ALB","ALB","ALB","ALB","ARG","ARG","ARG","ARG","ARG","ARG","ARG","ARG","ARG"),crisis=c(0,0,0,1,1,1,0,1,1,1,1,1,0,0,0,0,1,1), announcement=c(NA, NA,NA,"1994-11-01",NA,NA,NA,"1998-03-01",NA,NA,"1992-01-01",NA,NA,NA,NA,NA,"1998-07-01",NA), x1=c(NA,NA,NA,6,6,6,NA,2,2,7,7,7,NA,NA,NA,NA,8,8), x2=c(NA,NA,NA,"a","a","a",NA,"q","q","b","b","b",NA,NA,NA,NA,"k","k")))

year      country    crisis   announcement    x1       x2
1991      ALB        0        NA              NA       NA
1992      ALB        0        NA              NA       NA
1993      ALB        0        NA              NA       NA
1994      ALB        1        1994-11-01      6        a
1995      ALB        1        NA              6        a
1996      ALB        1        NA              6        a
1997      ALB        0        NA              NA       NA
1998      ALB        1        1998-03-01      2        q
1999      ALB        1        NA              2        q
1991      ARG        1        NA              7        b
1992      ARG        1        1992-01-01      7        b
1993      ARG        1        NA              7        b
1994      ARG        0        NA              NA       NA
1995      ARG        0        NA              NA       NA
1996      ARG        0        NA              NA       NA
1997      ARG        0        NA              NA       NA
1998      ARG        1        1998-07-01      8        k
1999      ARG        1        NA              8        k

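Any suggestions would be appreciated! I am stuck on how to replicate the observations for each year while including the x1 and x2 values only when my new "crisis" dummy equals 1.

Thanks!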


1 Answer

Stack Overflow user

Answered on 2020-11-03 03:16:07

This can be done with dplyr and tidyr:

library(dplyr)
library(tidyr)

df1 <- data.frame(cbind(eventID=c(1,2,3,4), country=c("ALB","ALB","ARG","ARG"), start=c(1994, 1998, 1998, 1991), end=c(1996,1999,1999,1993), announcement=c("1994-11-01","1998-03-01","1998-07-01","1992-01-01"), x1=c(6,2,8,7), x2=c("a","q","k","b")))

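df1 %>% 
  # Turn start into a factor whose levels span every year in the data,
  # so complete() can generate the years that have no row yet
  mutate(year = factor(start, levels = min(start):max(end))) %>% 
  # Add a row for every year/country combination (new rows are all NA)
  complete(year, country) %>% 
  mutate(year = as.numeric(as.character(year))) %>% 
  arrange(country, year) %>% 
  # Carry each event's values forward within its country
  group_by(country) %>% 
  fill(eventID, end, x1, x2) %>% 
  ungroup() %>% 
  # Blank out values carried past the event's end year, then flag crisis years
  mutate(across(c(eventID, end, x1, x2), ~ ifelse(end < year, NA, .)),
         crisis = as.numeric(!is.na(eventID)))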
#> # A tibble: 18 x 9
#>     year country eventID start end   announcement x1    x2    crisis
#>    <dbl> <chr>   <chr>   <chr> <chr> <chr>        <chr> <chr>  <dbl>
#>  1  1991 ALB     <NA>    <NA>  <NA>  <NA>         <NA>  <NA>       0
#>  2  1992 ALB     <NA>    <NA>  <NA>  <NA>         <NA>  <NA>       0
#>  3  1993 ALB     <NA>    <NA>  <NA>  <NA>         <NA>  <NA>       0
#>  4  1994 ALB     1       1994  1996  1994-11-01   6     a          1
#>  5  1995 ALB     1       <NA>  1996  <NA>         6     a          1
#>  6  1996 ALB     1       <NA>  1996  <NA>         6     a          1
#>  7  1997 ALB     <NA>    <NA>  <NA>  <NA>         <NA>  <NA>       0
#>  8  1998 ALB     2       1998  1999  1998-03-01   2     q          1
#>  9  1999 ALB     2       <NA>  1999  <NA>         2     q          1
#> 10  1991 ARG     4       1991  1993  1992-01-01   7     b          1
#> 11  1992 ARG     4       <NA>  1993  <NA>         7     b          1
#> 12  1993 ARG     4       <NA>  1993  <NA>         7     b          1
#> 13  1994 ARG     <NA>    <NA>  <NA>  <NA>         <NA>  <NA>       0
#> 14  1995 ARG     <NA>    <NA>  <NA>  <NA>         <NA>  <NA>       0
#> 15  1996 ARG     <NA>    <NA>  <NA>  <NA>         <NA>  <NA>       0
#> 16  1997 ARG     <NA>    <NA>  <NA>  <NA>         <NA>  <NA>       0
#> 17  1998 ARG     3       1998  1999  1998-07-01   8     k          1
#> 18  1999 ARG     3       <NA>  1999  <NA>         8     k          1
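
Two details of the output above may matter, depending on the use case. Every column except year and crisis is character, because data.frame(cbind(...)) coerces the mixed vectors to character. Also, announcement stays on the row of each event's start year (for ARG event 4 that is 1991, although the date is 1992-01-01). The following is a minimal alternative sketch, not the answerer's method. It assumes that events within a country do not overlap and that each announcement year falls between start and end; the names df1_typed, event_years and df2_sketch are made up for illustration.

```r
library(dplyr)
library(tidyr)

# Same data as df1, but built with data.frame() directly (no cbind()),
# so numeric columns stay numeric instead of being coerced to character.
df1_typed <- data.frame(
  eventID = c(1, 2, 3, 4),
  country = c("ALB", "ALB", "ARG", "ARG"),
  start = c(1994, 1998, 1998, 1991),
  end = c(1996, 1999, 1999, 1993),
  announcement = c("1994-11-01", "1998-03-01", "1998-07-01", "1992-01-01"),
  x1 = c(6, 2, 8, 7),
  x2 = c("a", "q", "k", "b"),
  stringsAsFactors = FALSE
)

# One row per event-year: expand each event over start:end.
event_years <- df1_typed %>%
  rowwise() %>%
  mutate(year = list(seq(start, end))) %>%
  ungroup() %>%
  unnest(year) %>%
  mutate(
    crisis = 1,
    # Keep the announcement date only in the year it was made.
    announcement = if_else(
      as.numeric(substr(announcement, 1, 4)) == year,
      announcement,
      NA_character_
    )
  ) %>%
  select(year, country, crisis, announcement, x1, x2)

# Full country-year grid, then attach the event-years.
# Grid rows without a matching event get crisis = 0 and NA elsewhere.
df2_sketch <- expand_grid(
  country = unique(df1_typed$country),
  year = seq(min(df1_typed$start), max(df1_typed$end))
) %>%
  left_join(event_years, by = c("country", "year")) %>%
  mutate(crisis = coalesce(crisis, 0)) %>%
  select(year, country, crisis, announcement, x1, x2)

df2_sketch
```

If events can overlap within a country, the join would return more than one row for the same country and year, so the rows would need to be aggregated first. An announcement dated outside its event's start-to-end range would be dropped by this sketch.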
Votes: 0
Original page content provided by Stack Overflow. Translation support provided by the Tencent Cloud Xiaowei IT-domain engine.
Original link:

https://stackoverflow.com/questions/64649526
