Combining lists with overlapping elements

Stack Overflow user
Asked 2019-06-13 00:57:28
Answers: 3, Views: 399, Following: 0, Votes: 0

I have a collection of lists, some of which have overlapping elements:

coll = [['aaaa', 'aaab', 'abaa'],
        ['bbbb', 'bbbb'], 
        ['aaaa', 'bbbb'], 
        ['dddd', 'dddd'],
        ['bbbb', 'bbbb', 'cccc','aaaa'],
        ['eeee','eeef','gggg','gggi'],
        ['gggg','hhhh','iiii']]

I just want to pool the overlapping lists together, which would produce

pooled = [['aaaa', 'aaab', 'abaa','bbbb','cccc'], 
          ['eeee','eeef','gggg','gggi','hhhh','iiii'],
          ['dddd', 'dddd']]

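(In case it is not clear: the first and second lists both overlap with the third list, so all three should be merged together, even though the first two share no elements with each other.)

"Overlapping" means that two lists have at least one element in common. "Merging" means combining two lists into a single flat list or a single flat set.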
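To pin these two definitions down, here is a minimal sketch. The helper names `overlaps` and `merge` are hypothetical and do not appear in the question; `merge` keeps duplicates, which the question says is acceptable.

```python
def overlaps(a, b):
    """Return True when the two lists share at least one element."""
    return not set(a).isdisjoint(b)


def merge(a, b):
    """Return one flat list holding the items of both lists (duplicates kept)."""
    return a + b


x = ['aaaa', 'aaab', 'abaa']
y = ['bbbb', 'bbbb']
z = ['aaaa', 'bbbb']

print(overlaps(x, y))                  # False: x and y share nothing directly
print(overlaps(x, z), overlaps(y, z))  # True True: both overlap z
print(merge(merge(x, y), z))           # so all three belong in one pooled list
```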

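There may be several clusters: for example, x, y and z overlap with each other and v and w overlap with each other, but x+y+z does not overlap with v+w. There may also be lists that do not overlap with anything.

(An analogy is families. Join all the Montagues together and join all the Capulets together, but no Montague has married a Capulet, so the two clusters stay distinct.)

I do not care whether duplicate items end up included more than once.

What is a simple and reasonably fast way to do this in Python?

Edit: This does not appear to be a duplicate of "Yet another merging list of lists, but most pythonic way", because that question does not seem to consider groups that overlap only through a third set. The solutions I tried from that question did not give the answer I am looking for here.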


3 Answers

Stack Overflow user

Accepted answer

Posted 2019-06-15 04:55:24

Following alkasm's suggestion in the comments, I used networkx:

import networkx as nx

coll = [['aaaa', 'aaab', 'abaa'],
        ['bbbb', 'bbbb'], 
        ['aaaa', 'bbbb'], 
        ['dddd', 'dddd'],
        ['bbbb', 'bbbb', 'cccc','aaaa'],
        ['eeee','eeef','gggg','gggi'],
        ['gggg','hhhh','iiii']]

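# Convert each list to a set once, so every pairwise overlap test is a set operation.
sets = [set(sublist) for sublist in coll]

# Link each pair of lists that shares at least one element.
# The graph is undirected, so checking j > i covers every pair once.
edges = []
for i in range(len(sets)):
    for j in range(i + 1, len(sets)):
        if sets[i] & sets[j]:
            edges.append((i, j))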

G = nx.Graph()
G.add_nodes_from(range(len(coll)))
G.add_edges_from(edges)

for c in nx.connected_components(G):
    combined_lists = [coll[i] for i in c]
    flat_list = [item for sublist in combined_lists for item in sublist]
    print(set(flat_list))

Output:

{'cccc', 'bbbb', 'aaab', 'aaaa', 'abaa'}
{'dddd'}
{'eeef', 'eeee', 'hhhh', 'gggg', 'gggi', 'iiii'}

No doubt this can be optimized, but it seems to solve my problem for now.
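One direction such an optimization might take, sketched here without networkx: treat the elements themselves as nodes and union each list's elements with a small union-find (disjoint-set) structure, which avoids comparing every pair of lists. This swaps in a different technique from the code above, and the function name `pool_lists` is hypothetical.

```python
def pool_lists(coll):
    """Group elements linked, directly or transitively, by sharing a list."""
    parent = {}  # maps each element to its parent in the union-find forest

    def find(x):
        # Walk up to the root, shortening the path as we go (path halving).
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for sublist in coll:
        for item in sublist:
            parent.setdefault(item, item)  # register unseen elements as roots
        if not sublist:
            continue
        # Attach the root of every other element to the root of the first one.
        first_root = find(sublist[0])
        for item in sublist[1:]:
            root = find(item)
            if root != first_root:
                parent[root] = first_root

    # Collect the elements under their final roots.
    groups = {}
    for item in parent:
        groups.setdefault(find(item), set()).add(item)
    return list(groups.values())


coll = [['aaaa', 'aaab', 'abaa'],
        ['bbbb', 'bbbb'],
        ['aaaa', 'bbbb'],
        ['dddd', 'dddd'],
        ['bbbb', 'bbbb', 'cccc', 'aaaa'],
        ['eeee', 'eeef', 'gggg', 'gggi'],
        ['gggg', 'hhhh', 'iiii']]

for group in pool_lists(coll):
    print(sorted(group))
```

Like the networkx version, this returns sets, so duplicate items within a cluster are dropped.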

Votes 0

Stack Overflow user

Posted 2019-06-13 01:15:44

Here is one way to do it (assuming you want unique elements in the overlapping result):

def over(coll):
    print('Input is:\n', coll)
    # gather the lists that share at least one element with some other list
    overlapping = [x for x in coll if any(x_element in [y for k in coll if k != x for y in k] for x_element in x)]
    # flatten and get unique
    overlapping = sorted(list(set([z for x in overlapping for z in x])))
    # get the rest
    non_overlapping = [x for x in coll if all(y not in overlapping for y in x)]
    # use the line below only if merged non-overlapping elements are desired
    # non_overlapping = sorted([y for x in non_overlapping for y in x])
    print('Output is"\n',[overlapping, non_overlapping])

coll = [['aaaa', 'aaab', 'abaa'],
        ['bbbb', 'bbbb'], 
        ['aaaa', 'bbbb'], 
        ['dddd', 'dddd'],
        ['bbbb', 'bbbb', 'cccc','aaaa']]
over(coll)
coll = [['aaaa', 'aaaa'], ['bbbb', 'bbbb']]
over(coll)

Output:

$ python3 over.py
Input is:
 [['aaaa', 'aaab', 'abaa'], ['bbbb', 'bbbb'], ['aaaa', 'bbbb'], ['dddd', 'dddd'], ['bbbb', 'bbbb', 'cccc', 'aaaa']]
Output is"
 [['aaaa', 'aaab', 'abaa', 'bbbb', 'cccc'], [['dddd', 'dddd']]]
Input is:
 [['aaaa', 'aaaa'], ['bbbb', 'bbbb']]
Output is"
 [[], [['aaaa', 'aaaa'], ['bbbb', 'bbbb']]]
Votes 1

Stack Overflow user

Posted 2019-06-18 11:24:04

You can do this with sets, using a successive-merging approach:

coll = [['aaaa', 'aaab', 'abaa'],
        ['bbbb', 'bbbb'], 
        ['aaaa', 'bbbb'], 
        ['dddd', 'dddd'],
        ['bbbb', 'bbbb', 'cccc','aaaa'],
        ['eeee','eeef','gggg','gggi'],
        ['gggg','hhhh','iiii']]

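# Start with one set per input list.
pooled = [set(subList) for subList in coll]
merging = True
while merging:
    # Repeat full passes until a pass performs no merge.
    merging = False
    for i, group in enumerate(pooled):
        # First later group that shares an element with this one, if any.
        merged = next((g for g in pooled[i+1:] if g.intersection(group)), None)
        if merged is None:
            continue
        group.update(merged)
        pooled.remove(merged)
        merging = True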

print(pooled)
# [{'aaaa', 'abaa', 'aaab', 'cccc', 'bbbb'}, {'dddd'}, {'gggg', 'eeef', 'eeee', 'hhhh', 'gggi', 'iiii'}]
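A single-pass variant of the same merging idea might look like the sketch below. It is not the code above: each incoming list absorbs every existing group it intersects, so the groups stay pairwise disjoint and no repeated passes are needed. The function name `pool_single_pass` is hypothetical.

```python
def pool_single_pass(coll):
    """Merge overlapping lists into disjoint sets in one pass over the input."""
    groups = []  # invariant: the sets in this list never share an element
    for sublist in coll:
        current = set(sublist)
        untouched = []
        for group in groups:
            if group & current:
                current |= group         # absorb every group the new list touches
            else:
                untouched.append(group)  # keep groups it does not touch
        untouched.append(current)
        groups = untouched
    return groups


coll = [['aaaa', 'aaab', 'abaa'],
        ['bbbb', 'bbbb'],
        ['aaaa', 'bbbb'],
        ['dddd', 'dddd'],
        ['bbbb', 'bbbb', 'cccc', 'aaaa'],
        ['eeee', 'eeef', 'gggg', 'gggi'],
        ['gggg', 'hhhh', 'iiii']]

for group in pool_single_pass(coll):
    print(sorted(group))
```

Because an absorbed group was disjoint from every group already set aside, the growing `current` set can never start overlapping a group that was skipped earlier in the same pass.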
Votes 1

Original page content provided by Stack Overflow. Translation support provided by the Tencent Cloud Xiaowei IT-domain engine.
Original link:

https://stackoverflow.com/questions/56567089
