我有一个关于大图数据的问题。假设我们有一个有近1亿条边和大约500万个节点的大图,在这种情况下,您所知道的最好的图挖掘平台是什么,它可以给出任意两个给定节点之间的所有简单的长度路径<=k ( k=3,4,5)。主要关注的是获得这些路径的速度。另一件事是,图是有向的,但是我们希望程序在计算路径时忽略方向,但当它发现这些路径时仍然返回实际有向边。
例如:
-> c <- d -> b是长度为3的节点'a‘和'b’之间的有效路径。
提前谢谢。
发布于 2015-01-31 08:28:31
所以这是一种在networkx中实现它的方法。它大致是基于我给这里的解决方案。我假设a->b
和a<-b
是您想要的两条不同的路径。我将把它作为列表返回。每个子列表都是路径的(有序)边。
import networkx as nx
import itertools
def getPaths(G,source,target, maxLength, excludeSet=None):
#print source, target, maxLength, excludeSet
if excludeSet== None:
excludeSet = set([source])
else:
excludeSet.add(source)# won't allow a path starting at source to go through source again.
if maxLength == 0:
excludeSet.remove(source)
return []
else:
if G.has_edge(source,target):
paths=[[(source,target)]]
else:
paths = []
if G.has_edge(target,source):
paths.append([(target,source)])
#neighbors_iter is a big iterator that will give (neighbor,edge) for each successor of source and then for each predecessor of source.
neighbors_iter = itertools.chain(((neighbor,(source,neighbor)) for neighbor in G.successors_iter(source) if neighbor != target),((neighbor,(neighbor,source)) for neighbor in G.predecessors_iter(source) if neighbor != target))
#note that if a neighbor is both a predecessor and a successor, it shows up twice in this iteration.
paths.extend( [[edge] + path for (neighbor,edge) in neighbors_iter if neighbor not in excludeSet for path in getPaths(G,neighbor,target,maxLength-1,excludeSet)] )
excludeSet.remove(source) #when we move back up the recursion, don't want to exclude this source any more
return paths
G=nx.DiGraph()
G.add_edges_from([(1,2),(2,3),(1,3),(1,4),(3,4),(4,3)])
print getPaths(G,1,3,2)
>[[(1, 3)], [(1, 2), (2, 3)], [(1, 4), (4, 3)], [(1, 4), (3, 4)]]
我希望通过修改networkx中的dijkstra算法,您将得到一个更有效的算法(注意,dijkstra算法有一个截止点,但默认情况下它只返回最短路径,并且它将沿着边缘方向前进)。
下面是整个paths.extend的另一个版本: paths.extend( neighbors_iter中的[edge + path (邻居,edge) ),如果excludeSet中没有邻居,则为getPaths中的路径(G,邻居,目标,maxLength-1,excludeSet),如果len( path )>0 ])
发布于 2015-01-30 16:34:03
我建议您使用易于操作和学习的伤寒。
但是,如果您找到了它,Neo4j将通过一点编码来满足您的需求。
https://stackoverflow.com/questions/28240325
复制相似问题