
Finding row means based on a factor in a separate data frame

Stack Overflow user
Asked 2020-07-15 23:36:00
Answers: 3 · Views: 56 · Followers: 0 · Votes: 1

I have a large data frame, df1, that looks like this:

          Gene  CB_1.1 CB_10.1 CB_10.2 CB_10.3
1         Gene1     10       0       0       0
2         Gene2    871       7       9       2
3         Gene3    490       2       5       8
4         Gene4     17       5       6       1
5         Gene5     75       1       1       1
6         Gene6    308       2       6       2

> dput(head(df1[,1:5]))
structure(list(X = c("Gene1", "Gene2", "Gene3", 
"Gene4", "Gene5", "Gene6"), CB_1.1 = c(10L, 
871L, 490L, 17L, 75L, 308L), CB_10.1 = c(0L, 7L, 2L, 5L, 1L, 
2L), CB_10.2 = c(0L, 9L, 5L, 6L, 1L, 6L), CB_10.3 = c(0L, 2L, 
8L, 1L, 1L, 2L)), row.names = c(NA, 6L), class = "data.frame")

A second data frame, df2, looks like this:

  tissue_subcluster    Class_2
1            CB_1.1     Neuron
2           CB_10.1     Neuron
3           CB_10.2 Non-Neuron
4           CB_10.3 Non-Neuron

> dput(head(df2[,c(7,9)]))
structure(list(tissue_subcluster = c("CB_1.1", "CB_10.1", "CB_10.2", 
"CB_10.3", "CB_11.1", "CB_11.2"), Class_2 = c("Neuron", "Non-Neuron", 
"Non-Neuron", "Non-Neuron", "Non-Neuron", "Non-Neuron")), row.names = c("1", 
"2", "3", "4", "5", "6"), class = "data.frame")

I want to average the values in df1 according to whether each column is classified as Neuron or Non-Neuron in df2, so that the result looks like this:

          Gene Neuron_mean Non-Neuron_mean 
1         Gene1         5               0       
2         Gene2       439             5.5       
3         Gene3       246             6.5       
4         Gene4        11             3.5       
5         Gene5        38               1       
6         Gene6       155               4       

How can I do this? Any help is appreciated!


3 Answers

Stack Overflow user

Accepted answer

Posted 2020-07-15 23:51:41

Using the reshape library:

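library(reshape)

# Melt df1 to long form, then attach each column's class from df2
out <- merge(melt(df1), df2, by.x = "variable", by.y = "tissue_subcluster")
# Average the values for each Gene within each class
cast(out, Gene ~ Class_2, mean)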

gives:

   Gene Neuron Non-Neuron
1 Gene1      5        0.0
2 Gene2    439        5.5
3 Gene3    246        6.5
4 Gene4     11        3.5
5 Gene5     38        1.0
6 Gene6    155        4.0
Votes: 2

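Stack Overflow user

Posted 2020-07-16 01:46:50

Here is an option with base R. Match the column names of 'df1' against the 'tissue_subcluster' column of 'df2' to get the corresponding 'Class_2' values, use those values to split 'df1' into a list of data.frames, loop over the list with sapply, and take the rowMeans of each element.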

data.frame(Gene = df1$X, sapply(split.default(df1[-1], with(df2, 
   Class_2[match(names(df1)[-1], tissue_subcluster)])), rowMeans))
#   Gene Neuron Non.Neuron
#1 Gene1      5        0.0
#2 Gene2    439        5.5
#3 Gene3    246        6.5
#4 Gene4     11        3.5
#5 Gene5     38        1.0
#6 Gene6    155        4.0

Data

df1 <- structure(list(X = c("Gene1", "Gene2", "Gene3", "Gene4", "Gene5", 
"Gene6"), CB_1.1 = c(10L, 871L, 490L, 17L, 75L, 308L), CB_10.1 = c(0L, 
7L, 2L, 5L, 1L, 2L), CB_10.2 = c(0L, 9L, 5L, 6L, 1L, 6L), CB_10.3 = c(0L, 
2L, 8L, 1L, 1L, 2L)), row.names = c(NA, 6L), class = "data.frame")

df2 <- structure(list(tissue_subcluster = c("CB_1.1", "CB_10.1", "CB_10.2", 
"CB_10.3", "CB_11.1", "CB_11.2"), Class_2 = c("Neuron", "Neuron", 
"Non-Neuron", "Non-Neuron", "Non-Neuron", "Non-Neuron")), row.names = c("1", 
"2", "3", "4", "5", "6"), class = "data.frame")
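
The one-liner above can be unpacked into separate steps, which may make it easier to inspect each intermediate result. This is only a sketch using the same data as this answer; the names `counts`, `groups`, `parts`, `means` and `result` are illustrative and not part of the original code, and `check.names = FALSE` is an added choice to keep the `Non-Neuron` column name unchanged.

```r
# Same data as in the answer, rebuilt with data.frame() so the block is self-contained
df1 <- data.frame(
  X = paste0("Gene", 1:6),
  CB_1.1 = c(10L, 871L, 490L, 17L, 75L, 308L),
  CB_10.1 = c(0L, 7L, 2L, 5L, 1L, 2L),
  CB_10.2 = c(0L, 9L, 5L, 6L, 1L, 6L),
  CB_10.3 = c(0L, 2L, 8L, 1L, 1L, 2L),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  tissue_subcluster = c("CB_1.1", "CB_10.1", "CB_10.2",
                        "CB_10.3", "CB_11.1", "CB_11.2"),
  Class_2 = c("Neuron", "Neuron", "Non-Neuron",
              "Non-Neuron", "Non-Neuron", "Non-Neuron"),
  stringsAsFactors = FALSE
)

# Step 1: keep only the numeric count columns (drop the gene name column)
counts <- df1[-1]

# Step 2: look up the class label of each count column in df2
groups <- df2$Class_2[match(names(counts), df2$tissue_subcluster)]

# Step 3: split the columns (not the rows) into one data.frame per class
parts <- split.default(counts, groups)

# Step 4: row means within each class, collected into a matrix
means <- sapply(parts, rowMeans)

# Step 5: reattach the gene names; check.names = FALSE keeps "Non-Neuron" as is
result <- data.frame(Gene = df1$X, means, check.names = FALSE)
print(result)
```

Columns of 'df1' that have no match in 'df2' would get an `NA` group in step 2 and be dropped by `split.default`, so it may be worth checking `anyNA(groups)` on real data.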
Votes: 3

Stack Overflow user

Posted 2020-07-15 23:49:14

This may not be the best approach for large datasets, but you could use tidyr and dplyr:

library(dplyr)
library(tidyr)

df1 %>%
  # long form: one row per Gene / tissue_subcluster pair
  pivot_longer(cols=-Gene, names_to="tissue_subcluster") %>%
  left_join(df2, by="tissue_subcluster") %>%
  group_by(Gene, Class_2) %>%
  summarise(mean=mean(value)) %>%
  pivot_wider(names_from="Class_2", names_glue="{Class_2}_mean", values_from="mean")

It returns:

# A tibble: 6 x 3
  Gene          Neuron_mean `Non-Neuron_mean`
  <chr>               <dbl>             <dbl>
1 0610005C13Rik           5               0  
2 0610007P14Rik         439               5.5
3 0610009B22Rik         246               6.5
4 0610009E02Rik          11               3.5
5 0610009L18Rik          38               1  
6 0610009O20Rik         155               4
Votes: 2

Original page content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/62918537
