I have two lists. The first one is:
a = [[('e', 18.019570565395412), ('n', 9.987254997438297),
('a', 7.7558209941102), ('r', 6.6337526622659),
('i', 6.600725745328597), ('t', 6.501644994516685),
('o', 6.348366226166633), ('d', 5.847034563938841),
('s', 4.446777970199559), ('l', 3.72314975166299)],
[('e', 12.106222485002089), ('t', 8.957697044103082),
('a', 8.370584890241286), ('n', 7.607979302319451),
('o', 7.490104957124618), ('i', 7.3906837841807365),
('s', 6.619604800837547), ('r', 6.519995330217634),
('h', 4.5520963180272425), ('l', 4.174559477586928)],
[('e', 17.355137469595004), ('s', 8.143220837795097),
('a', 7.8767560437690145), ('n', 7.549126676263891),
('i', 7.163346641798983), ('t', 7.009814697935651),
('r', 6.939253827661279), ('l', 5.823753838298597),
('u', 5.566685341067845), ('o', 5.494351584605674)],
[('e', 11.726720365453488), ('i', 11.143857839435189),
('a', 10.481789164283027), ('o', 8.879509290276063),
('n', 7.433536567715994), ('l', 6.861989205859677),
('t', 6.660947684470068), ('r', 6.473326474063275),
('s', 5.332336897171472), ('c', 4.1076677341515335)],
[('e', 16.01393585408341), ('n', 10.010042012501282),
('i', 7.874987191310585), ('r', 7.499538887181063),
('a', 6.538374833487037), ('s', 6.393687877856339),
('t', 6.1842401885439084), ('d', 5.152577108310278),
('u', 4.3455272056563174), ('l', 3.962701096423814)],
[('e', 13.02338360095244), ('a', 11.820318700775383),
('o', 9.20172171683253), ('s', 7.635081506807498),
('n', 7.547469320471335), ('i', 7.219915745772025),
('r', 6.704927040722877), ('l', 5.650833384211491),
('d', 5.098296599303987), ('t', 4.7109103119848585)]]
This is a list containing 6 sublists. The other "list" (actually a dictionary) is:
b = {'e': 1636, 'a': 930, 'd': 581, 'i': 507, 'g': 298, 'h': 222, 'c': 145, 'b': 117, 'j': 104, 'f': 74}
I need some code that does the following:
Find the sublist(s) of a that have the largest intersection with b.
Return the index(es) of those sublists.
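Important note: I only care about the keys (the strings) in each list; the values do not have to match. So by "intersection" I mean how many keys b has in common with each sublist of a.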
Answered on 2022-04-12 21:22:22
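You can convert each sublist in a to a dictionary (b already is one, so the dict(b) call below is harmless but unnecessary), then apply the & operator to the dictionary key views, which performs a set intersection, and take the length of the result to get the size of each intersection: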
intersection_sizes = [len(dict(s).keys() & dict(b).keys()) for s in a]
Output:
[4, 4, 3, 4, 4, 4]
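As a minimal sketch of why this works: dictionary key views behave like sets, so & returns the keys the two dictionaries share. The names sublist and b_small and their contents are made up for illustration; they are not the data from the question.

```python
# Toy data (hypothetical, much smaller than the lists in the question).
sublist = [('e', 18.0), ('n', 9.9), ('a', 7.7)]
b_small = {'e': 1636, 'a': 930, 'd': 581}

# dict(sublist) maps each letter to its value; only the keys matter here.
# The & operator on two key views returns a plain set of shared keys.
shared = dict(sublist).keys() & b_small.keys()

print(sorted(shared))  # ['a', 'e']
print(len(shared))     # 2
```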
Then use another list comprehension to pick out the indices of the sublists with the largest intersection:
index_of_the_subsets = [i for i, x in enumerate(intersection_sizes) if x == max(intersection_sizes)]
Output:
[0, 1, 3, 4, 5]
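The two steps can also be wrapped in one helper that computes the maximum only once instead of once per element. This is a sketch under assumptions: the function name largest_intersection_indices and the toy data are hypothetical, and it builds the key sets directly from the tuples rather than going through dict().

```python
def largest_intersection_indices(sublists, reference):
    """Return (indices, size) for the sublists sharing the most keys with reference."""
    ref_keys = set(reference)  # iterating a dict yields its keys
    # Count the shared keys per sublist; the tuple values are ignored.
    sizes = [len({key for key, _ in sub} & ref_keys) for sub in sublists]
    best = max(sizes)  # computed once; raises ValueError if sublists is empty
    return [i for i, size in enumerate(sizes) if size == best], best


# Toy data (hypothetical, not the data from the question).
a_small = [
    [('e', 1.0), ('n', 2.0), ('a', 3.0)],
    [('x', 1.0), ('y', 2.0), ('e', 3.0)],
    [('a', 1.0), ('d', 2.0), ('z', 3.0)],
]
b_small = {'e': 1636, 'a': 930, 'd': 581}

indices, best = largest_intersection_indices(a_small, b_small)
print(indices, best)  # [0, 2] 2
```

Returning the size alongside the indices makes ties easy to inspect: here sublists 0 and 2 both share two keys with b_small.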
https://stackoverflow.com/questions/71849187