How to group all the links in a dictionary using the first occurrence of a key as the starting point?

Stack Overflow user
Asked 2022-04-27 22:09:25
2 answers · 56 views · 0 followers · 2 votes

I have a dictionary like this:

matches = {'2-8-7 Yaesu, Chuo-Ku': ['Chuo Ward, Yaesu 2-8-7'],
           'Chuo Ward, Yaesu 2-8-7': ['2-8-7 Yaesu, Chuo-Ku'],
           'Fukuoka Bldg 10Th Floor': ['Fukuoka Building, 9Th -10Th Flr.'],
           'Fukuoka Bldg. 8-7 Yaesu Chome': ['2-8-7 Yaesu, Chuo-Ku', 
                                             'Fukuoka Building, 8-7, Yaesu 2 Chome, Chuo-Ku'],
           'Fukuoka Bldg. 9Th Fl': ['Fukuoka Building 9Th Floor'],
           'Fukuoka Building 9Th Floor': ['Fukuoka Bldg. 9Th Fl', 
                                          'Fukuoka Building, 9Th -10Th Flr.']}

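I want to group the entries by following the links between them (through either keys or values). The key of each group can be anything: the first key you encounter is the starting point.

This is the output I expect: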

{'2-8-7 Yaesu, Chuo-Ku': ['Chuo Ward, Yaesu 2-8-7',
                          '2-8-7 Yaesu, Chuo-Ku',
                          'Fukuoka Building, 8-7, Yaesu 2 Chome, Chuo-Ku',
                          'Fukuoka Bldg. 8-7 Yaesu Chome'],
 'Fukuoka Bldg 10Th Floor': ['Fukuoka Building, 9Th -10Th Flr.',
                             'Fukuoka Bldg. 9Th Fl',
                             'Fukuoka Building 9Th Floor',
                             'Fukuoka Bldg. 9Th Fl']}

I tried this:

unique_lst = set()
merged_matches = dict()
for key, values in matches.items():
    if key not in unique_lst:
        values_lst = []
        for v in values:
            output = matches.get(v)
            for subkeys, subvals in matches.items():
                if key != subkeys and v != subkeys:
                    keyvals = [subkeys] + list(subvals)
                    if v in keyvals:
                        values_lst.extend(keyvals)
            if output:
                values_lst.extend(output)
            values_lst.append(v)

        values_lst = [i for i in values_lst if i != key]
        values_lst = values_lst + [key]
        for v in values_lst:
            unique_lst.add(v)
            
        merged_matches[key] = values_lst

This is my output:

# print(merged_matches)

{'Fukuoka Bldg. 9Th Fl': ['Fukuoka Building, 9Th -10Th Flr.',
                          'Fukuoka Building 9Th Floor',
                          'Fukuoka Bldg. 9Th Fl'],
 'Fukuoka Bldg. 8-7 Yaesu Chome': ['Chuo Ward, Yaesu 2-8-7',
                                   '2-8-7 Yaesu, Chuo-Ku',
                                   'Chuo Ward, Yaesu 2-8-7',
                                   '2-8-7 Yaesu, Chuo-Ku',
                                   'Fukuoka Building, 8-7, Yaesu 2 Chome, Chuo-Ku',
                                   'Fukuoka Bldg. 8-7 Yaesu Chome'],
 'Fukuoka Bldg 10Th Floor': ['Fukuoka Building 9Th Floor',
                             'Fukuoka Bldg. 9Th Fl',
                             'Fukuoka Building, 9Th -10Th Flr.',
                             'Fukuoka Building, 9Th -10Th Flr.',
                             'Fukuoka Bldg 10Th Floor']}

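2 Answers

Stack Overflow user

Accepted answer

Answered 2022-04-27 22:37:58

IMO, the problem boils down to finding the connected components of the graph derived from the dictionary. One way to do this is to use a UnionFind data structure to obtain the list of disjoint sets built from the keys and values.

Then construct a dictionary from the merged sets by choosing one element of each set as the key and the remaining elements as the values.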

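from networkx.utils.union_find import UnionFind

# Merge each key with all of its values into one disjoint set.
c = UnionFind()
for k, lst in matches.items():
    c.union(k, *lst)

# Each merged set becomes one entry: the first element taken from the set is
# the key and the rest are the values (sets are unordered, so the key is arbitrary).
out = {k: v for k, *v in map(list, c.to_sets())}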

Output:

{'Chuo Ward, Yaesu 2-8-7': ['2-8-7 Yaesu, Chuo-Ku',
  'Fukuoka Bldg. 8-7 Yaesu Chome',
  'Fukuoka Building, 8-7, Yaesu 2 Chome, Chuo-Ku'],
 'Fukuoka Building 9Th Floor': ['Fukuoka Bldg 10Th Floor',
  'Fukuoka Bldg. 9Th Fl',
  'Fukuoka Building, 9Th -10Th Flr.']}
5 votes

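Stack Overflow user

Answered 2022-04-27 22:35:50

Trying to be creative with networkx, a network graph package. The idea is that your problem can be restated as "sort the addresses into groups such that there are no links between groups."

So networkx sounds like a handy solution here.

To understand the problem: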

Denote the following:
1 '2-8-7 Yaesu, Chuo-Ku',
2 'Chuo Ward, Yaesu 2-8-7',
3 'Fukuoka Bldg 10Th Floor',
4 'Fukuoka Bldg. 8-7 Yaesu Chome',
5 'Fukuoka Bldg. 9Th Fl',
6 'Fukuoka Building 9Th Floor',
7 'Fukuoka Building, 8-7, Yaesu 2 Chome, Chuo-Ku',
8 'Fukuoka Building, 9Th -10Th Flr.'
---------
Reading from your original dictionary of lists, the addresses have the following relationships:

1 -> 2
2 -> 1
3 -> 8
4 -> 1
4 -> 7
5 -> 6
6 -> 5
6 -> 8

The final output is two groups:
(1, 2, 7, 4) and (3, 5, 8, 6).
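The numbering and the arrows above can also be derived mechanically. Here is a small sketch under one assumption: that the labels are numbered in alphabetical order, which appears to match the list above. The names `labels`, `number`, and `edges` are illustrative only.

```python
matches = {
    '2-8-7 Yaesu, Chuo-Ku': ['Chuo Ward, Yaesu 2-8-7'],
    'Chuo Ward, Yaesu 2-8-7': ['2-8-7 Yaesu, Chuo-Ku'],
    'Fukuoka Bldg 10Th Floor': ['Fukuoka Building, 9Th -10Th Flr.'],
    'Fukuoka Bldg. 8-7 Yaesu Chome': ['2-8-7 Yaesu, Chuo-Ku',
                                      'Fukuoka Building, 8-7, Yaesu 2 Chome, Chuo-Ku'],
    'Fukuoka Bldg. 9Th Fl': ['Fukuoka Building 9Th Floor'],
    'Fukuoka Building 9Th Floor': ['Fukuoka Bldg. 9Th Fl',
                                   'Fukuoka Building, 9Th -10Th Flr.'],
}

# Every distinct address, whether it appears as a key or as a value.
labels = sorted(set(matches) | {v for values in matches.values() for v in values})

# Assumption: addresses are numbered 1..8 in alphabetical order.
number = {label: i for i, label in enumerate(labels, start=1)}

# One directed edge per (key, value) pair, in dictionary order.
edges = [(number[k], number[v]) for k, values in matches.items() for v in values]

for src, dst in edges:
    print(f"{src} -> {dst}")
```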

(1, 2, 7, 4) and (3, 5, 8, 6) are what we call connected components. So, run the following:

matches = {'2-8-7 Yaesu, Chuo-Ku': ['Chuo Ward, Yaesu 2-8-7'],
 'Chuo Ward, Yaesu 2-8-7': ['2-8-7 Yaesu, Chuo-Ku'],
 'Fukuoka Bldg 10Th Floor': ['Fukuoka Building, 9Th -10Th Flr.'],
 'Fukuoka Bldg. 8-7 Yaesu Chome': ['2-8-7 Yaesu, Chuo-Ku',
                                   'Fukuoka Building, 8-7, Yaesu 2 Chome, Chuo-Ku'],
 'Fukuoka Bldg. 9Th Fl': ['Fukuoka Building 9Th Floor'],
 'Fukuoka Building 9Th Floor': ['Fukuoka Bldg. 9Th Fl',
                                'Fukuoka Building, 9Th -10Th Flr.']}

import networkx as nx

G = nx.Graph()
for k,v in matches.items():
    G.add_node(k)
    for i in v:
        G.add_node(i)
        G.add_edge(k, i)

groups = list(nx.connected_components(G))

The final result, groups, is a list of sets, where each set is an isolated group.
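To get from a list of sets to the dictionary shape the question asks for, each set still has to be keyed by the first key of `matches` that falls inside it. The sketch below is one possible way to do that, not part of the original answer. So that it runs without networkx, it uses a plain breadth-first search, `connected_components`, as a stand-in for `nx.connected_components`. Both function names are hypothetical, and the values are sorted alphabetically only to make the result deterministic.

```python
from collections import deque


def connected_components(matches):
    """Yield each connected component of the undirected graph as a set."""
    adjacency = {}
    for key, values in matches.items():
        adjacency.setdefault(key, set())
        for value in values:
            adjacency[key].add(value)
            adjacency.setdefault(value, set()).add(key)

    seen = set()
    for start in adjacency:
        if start in seen:
            continue
        seen.add(start)
        component = {start}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    component.add(neighbour)
                    queue.append(neighbour)
        yield component


def key_by_first_seen(matches, groups):
    """Key each group by the first key of `matches` that belongs to it."""
    remaining = [set(group) for group in groups]
    out = {}
    for key in matches:  # dictionary order decides which key comes first
        for group in remaining:
            if key in group:
                out[key] = sorted(group - {key})
                remaining.remove(group)
                break
    return out


matches = {
    '2-8-7 Yaesu, Chuo-Ku': ['Chuo Ward, Yaesu 2-8-7'],
    'Chuo Ward, Yaesu 2-8-7': ['2-8-7 Yaesu, Chuo-Ku'],
    'Fukuoka Bldg 10Th Floor': ['Fukuoka Building, 9Th -10Th Flr.'],
    'Fukuoka Bldg. 8-7 Yaesu Chome': ['2-8-7 Yaesu, Chuo-Ku',
                                      'Fukuoka Building, 8-7, Yaesu 2 Chome, Chuo-Ku'],
    'Fukuoka Bldg. 9Th Fl': ['Fukuoka Building 9Th Floor'],
    'Fukuoka Building 9Th Floor': ['Fukuoka Bldg. 9Th Fl',
                                   'Fukuoka Building, 9Th -10Th Flr.'],
}

groups = list(connected_components(matches))
result = key_by_first_seen(matches, groups)
```

With networkx installed, `key_by_first_seen(matches, nx.connected_components(G))` should work the same way, since it only needs an iterable of sets.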

2 votes
Original page content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/72035835
