首页
学习
活动
专区
圈层
工具
发布
首页
学习
活动
专区
圈层
工具
MCP广场
社区首页 >问答首页 >优化网络范围函数

优化网络范围函数
EN

Stack Overflow用户
提问于 2022-02-08 07:48:40
回答 1查看 104关注 0票数 2

我发现了Richard在R中计算网络范围函数的博客帖子 (Burt1981;Reagans和McEvily 2003)。该函数根据每个网络节点的接触数和这些节点的互连性为其分配值。网络范围可以通过节点的子组(例如,女性和男性节点)来计算。它们作为顶点的属性存储。

作者的例子非常具有说明性,但它基于一个相对较小的网络(大约100个节点)。我有一个大约有200000个节点的网络,这意味着这个函数的性能不适合我的分析。

我给你们举一个例子,根据Erdos-Renyi模型,随机图的大小可以被操纵,以计时函数的性能。

我不熟悉优化R码,但我认为需要更有效地存储邻接矩阵(例如,稀疏矩阵)。到目前为止,我的努力没有成功地发挥工作作用。

代码语言:javascript
运行
复制
rm(list=ls())

library(igraph)    
library(statnet)
library(intergraph)
library(tictoc)

set.seed(42)



## Source: https://ramorel.github.io/network-range/
## Function to find network range for each node in a network
## Arguments:
##  net = adjacency matrix, igraph graph, or network object
##  attr = Vector of attributes associated with each node in net
##  directed = boolean indicated if the network is directed or not


netrange <- function(net, attr, directed = TRUE){
  require(reshape2)
  if (class(net) == "igraph") {
    net <- as_adjacency_matrix(net, sparse = F)
  }
  else {
    if(class(net) == "network") {
      net <- as.matrix.network(net)
    }
    else {
      net <- as.matrix(net)
    }
  }
  if(nrow(net) != length(attr)) {
    stop("Number of nodes must match length of attributes vector")
  }
  else {
    if (directed == TRUE){
      ns <- colnames(net)
      el <- melt(net, varnames=c("ego", "alter"), value.name = "weight")
      df <- cbind(rownames(net), attr)
      el$ego_grp <- df[match(el[,1], df[,1]), 2]
      el$alter_grp <- df[match(el[,2], df[,1]), 2]
      
      #FINDING p_k, the strength of ties within each group
      # z_iq = sum of strength of ties from nodes in group _k_ to all other alters
      # z_ij = sum of strength of ties from nodes in group _k_ to alters in group _k_
      
      z_iq <- sapply(unique(attr), function(x) {
        sum(el[which(el$ego_grp==x), "weight"])
      })
      z_ij <- sapply(unique(attr), function(x) {
        sum(el[which(el$ego_grp==x & el$alter_grp==x), "weight"])
      })
      p_k <- z_ij / z_iq
      p_k[is.na(p_k)] <- 0
      
      #FINDING p_ik, the strength of connection from person i to group k
      # x_iq = sum of strength of ties for _i_ to alters in group _k_
      # x_ij = sum of strength of ties for _i_ to all alters
      
      x_ij <- sapply(colnames(net), function(x) {
        sum(el[which(el$ego==x), "weight"])
      }
      )
      x_iq <- list(NULL)
      for(i in colnames(net)) {
        x_iq[[i]] <- sapply(unique(attr), function(x) {
          sum(el[which(el$ego==i & el$alter_grp==x), "weight"])
        }
        )
      }
      x_iq <- x_iq[-c(1)] #x_iq is now a list where each elements is a vector of node _i_ summed strength of tie to group _k_
      
      p_ik <- lapply(1:length(x_iq), 
                     function(x) x_iq[[x]] / x_ij[x])
      
      # FINDING nd_i, the network diversity score for node _i_
      
      nd_i <- sapply(1:length(p_ik), 
                     function(x) 1 - sum(p_k*p_ik[[x]]^2, na.rm = F)
      )
    }
    else {
      ns <- colnames(net)
      el <- melt(net, varnames=c("ego", "alter"), value.name = "weight")
      dup <- data.frame(t(apply(el[,1:2],1,sort)))
      
      el <- el[!duplicated(dup),]
      df <- cbind(rownames(net), attr)
      el$ego_grp <- df[match(el[,1], df[,1]), 2]
      el$alter_grp <- df[match(el[,2], df[,1]), 2]
      
      #FINDING p_k, the strength of ties within each group
      # z_iq = sum of strength of ties from nodes in group _k_ to all other alters
      # z_ij = sum of strength of ties from nodes in group _k_ to alters in group _k_
      
      z_iq <- sapply(unique(attr), function(x) {
        sum(el[which(el$ego_grp==x | el$alter_grp==x), "weight"])
      })
      z_ij <- sapply(unique(attr), function(x) {
        sum(el[which(el$ego_grp==x & el$alter_grp==x), "weight"])
      })
      p_k <- z_ij / z_iq
      p_k[is.na(p_k)] <- 0
      
      #FINDING p_ik, the strength of connection from person i to group k
      # x_iq = sum of strength of ties for _i_ to alters in group _k_
      # x_ij = sum of strength of ties for _i_ to all alters
      
      x_ij <- sapply(colnames(net), function(x) {
        sum(el[which(el$ego==x | el$alter==x), "weight"])
      }
      )
      x_iq <- list(NULL)
      for(i in colnames(net)) {
        x_iq[[i]] <- sapply(unique(attr), function(x) {
          sum(el[which(el$ego==i & el$alter_grp==x), "weight"],
              el[which(el$alter==i & el$ego_grp==x), "weight"])
        }
        )
      }
      x_iq <- x_iq[-c(1)] #x_iq is now a list where each elements is a vector of node _i_ summed strength of tie to group _k_
      
      p_ik <- lapply(1:length(x_iq), 
                     function(x) x_iq[[x]] / x_ij[x])
      
      
      # FINDING nd_i, the network diversity score for node _i_
      
      nd_i <- sapply(1:length(p_ik), 
                     function(x) 1 - sum(p_k*p_ik[[x]]^2, na.rm = F)
      )
    }
    return(nd_i)
  }
}


# Generate exemplary network

g <- igraph::erdos.renyi.game(1000, 150, type = "gnm")

## Add categorical (binary) vertex feature: female

V(g)$female <- sample(c(0,1), replace=TRUE, size=length(V(g)))
V(g)$female


## transform igraph to statnet 

net <- intergraph::asNetwork(g)


## Apply network function 

tic()
range_female <- netrange(net, 
                         net %v% "female",
                         directed = T)
seq_time <- toc()
EN

回答 1

Stack Overflow用户

发布于 2022-02-08 12:10:44

最好的优化是使用profvis分析,它显示了你的瓶颈在哪里。不过,我也参加了你的工作。

在那里,我们看到x_ij <- sapply(colnames(net), function(x) { sum(el[which(el$ego==x | el$alter==x), "weight"]) })花了很长时间。

我更喜欢data.table自己,所以在这里提供代码来加速这个部分,它只需几秒钟。然后尝试同样的方法来创建x_iq。

代码语言:javascript
运行
复制
xx_ij <- el[, .(xxij = lapply(.SD, sum)), by = c("ego"), .SDcols = c("weight")]
xx_ij2 <- xx_ij$xxij
names(xx_ij2) <- xx_ij$ego

identical(x_ij, unlist(xx_ij2))
# TRUE

**profiling

代码语言:javascript
运行
复制
library(profvis)
p <- profvis({
  ## code here
})

f <- paste0("profile_", as.Date(now()), ".html")
htmlwidgets::saveWidget(p, f)
browseURL(f)
票数 2
EN
页面原文内容由Stack Overflow提供。腾讯云小微IT领域专用引擎提供翻译支持
原文链接:

https://stackoverflow.com/questions/71030187

复制
相关文章

相似问题

领券
问题归档专栏文章快讯文章归档关键词归档开发者手册归档开发者手册 Section 归档