这是我想要附加数据的文件
Collection A
Jan
Feb
March
April
Collection B
Jan
Feb
March
April
Revenue A
Jan
Feb
March
April
Revenue B
Jan
Feb
March
April
我想从中提取数据的文件如下所示:
Collection Month Collection A Collection B Revenue Month Revenue A Revenue B
Collection January 1 5 Revenue January 4 8
Collection February 2 6 Revenue February 3 7
Collection March 3 7 Revenue March 2 6
Collection April 4 8 Revenue April 1 5
我希望最终输出看起来像这样:
Collection A
Jan 1
Feb 2
March 3
April 4
Collection B
Jan 5
Feb 6
March 7
April 8
Revenue A
Jan 4
Feb 3
March 2
April 1
Revenue B
Jan 8
Feb 7
March 6
April 5
我可以使用OFFSET和INDIRECT函数在excel上运行它。但我希望将其更好地自动化以备将来使用,所以我在R上尝试它。
我真的很困惑如何组合两个数据集来找到所需的输出。对我来说这似乎是一项不可能的任务。我玩了几个函数,如select,subset和Arrange by,它们都没有帮助我进步。
如果有人可以帮我解决这个问题,我将很高兴。
发布于 2019-06-19 10:21:58
这是实现该输出的一种方法。请注意,我从示例数据中的列名中删除了空格,以便更容易读入R.您没有指定输出数据帧的列名称,因为它们没有意义。
library(tidyverse)
tbl <- read_table2(
"Collection Month CollectionA CollectionB Revenue Month RevenueA RevenueB
Collection January 1 5 Revenue January 4 8
Collection February 2 6 Revenue February 3 7
Collection March 3 7 Revenue March 2 6
Collection April 4 8 Revenue April 1 5"
)
#> Warning: Duplicated column names deduplicated: 'Month' => 'Month_1' [6]
tbl %>%
select(-Collection, -Revenue, -Month_1) %>%
gather(variable, value, -Month) %>%
group_by(variable) %>%
group_modify(~ add_row(.x, Month = .y$variable, value = NA, .before = 1)) %>%
ungroup() %>%
select(-variable)
#> # A tibble: 20 x 2
#> Month value
#> <chr> <dbl>
#> 1 CollectionA NA
#> 2 January 1
#> 3 February 2
#> 4 March 3
#> 5 April 4
#> 6 CollectionB NA
#> 7 January 5
#> 8 February 6
#> 9 March 7
#> 10 April 8
#> 11 RevenueA NA
#> 12 January 4
#> 13 February 3
#> 14 March 2
#> 15 April 1
#> 16 RevenueB NA
#> 17 January 8
#> 18 February 7
#> 19 March 6
#> 20 April 5
由reprex包创建于2019-06-18 (v0.3.0)
https://stackoverflow.com/questions/-100007022
复制相似问题