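I want to parse a JSON file and get a complete list of all the paths needed to access its keys. If we use the keys method, we get a list of the individual keys, not the full list of hierarchical keys needed to reach the data.
So given data like this: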
data = {
    "glossary": {
        "title": "example glossary",
        "GlossDiv": {
            "title": "S",
            "GlossList": {
                "GlossEntry": {
                    "ID": "SGML",
                    "SortAs": "SGML",
                    "GlossTerm": "Standard Generalized Markup Language",
                    "Acronym": "SGML",
                    "Abbrev": "ISO 8879:1986",
                    "GlossDef": {
                        "para": "A meta-markup language, used to create markup languages such as DocBook.",
                        "GlossSeeAlso": ["GML", "XML"]
                    },
                    "GlossSee": "markup"
                }
            }
        }
    }
}
I would like to return a list like the one below, containing the full path to every key.
[['glossary']['title'],['glossary']['GlossDiv']...]
I am using ChainMap because it makes it easier to convert the JSON to a dictionary and access the keys.
The code is as follows:
import json
from collections import ChainMap
from functools import reduce
import operator

myDataChained = ChainMap(data)

def getFromDict(data):
    return reduce(operator.getitem, data)

Json_Paths = getFromDict(myDataChained)
print(Json_Paths)
Answered 2018-07-24 16:13:02
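You cannot use the technique from the linked answer in reverse: the functools.reduce()/operator.getitem() combination needs to know the path up front, and the path is exactly the information you are trying to obtain. What you are really asking for is a way to normalize/flatten your dictionary structure.
To do that, you have to walk the whole structure and collect every path that exists in your data, something like: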
import collections.abc  # Python 3.3+; on Python 2.x the ABCs live directly in collections

def get_paths(source):
    paths = []
    if isinstance(source, collections.abc.MutableMapping):  # found a dict-like structure...
        for k, v in source.items():  # iterate over it; Python 2.x: source.iteritems()
            paths.append([k])  # add the current child path
            paths += [[k] + x for x in get_paths(v)]  # get sub-paths, extend with the current
    # else, check if a list-like structure, remove if you don't want list paths included
    elif isinstance(source, collections.abc.Sequence) and not isinstance(source, str):
        # Python 2.x: use basestring instead of str ^
        for i, v in enumerate(source):
            paths.append([i])
            paths += [[i] + x for x in get_paths(v)]  # get sub-paths, extend with the current
    return paths
Now if you run your data through it:
data = {
    "glossary": {
        "title": "example glossary",
        "GlossDiv": {
            "title": "S",
            "GlossList": {
                "GlossEntry": {
                    "ID": "SGML",
                    "SortAs": "SGML",
                    "GlossTerm": "Standard Generalized Markup Language",
                    "Acronym": "SGML",
                    "Abbrev": "ISO 8879:1986",
                    "GlossDef": {
                        "para": "A meta-markup language, used to create markup languages...",
                        "GlossSeeAlso": ["GML", "XML"]
                    },
                    "GlossSee": "markup"
                }
            }
        }
    }
}
paths = get_paths(data)
paths will contain:
[['glossary'],
['glossary', 'title'],
['glossary', 'GlossDiv'],
['glossary', 'GlossDiv', 'title'],
['glossary', 'GlossDiv', 'GlossList'],
['glossary', 'GlossDiv', 'GlossList', 'GlossEntry'],
['glossary', 'GlossDiv', 'GlossList', 'GlossEntry', 'ID'],
['glossary', 'GlossDiv', 'GlossList', 'GlossEntry', 'SortAs'],
['glossary', 'GlossDiv', 'GlossList', 'GlossEntry', 'GlossTerm'],
['glossary', 'GlossDiv', 'GlossList', 'GlossEntry', 'Acronym'],
['glossary', 'GlossDiv', 'GlossList', 'GlossEntry', 'Abbrev'],
['glossary', 'GlossDiv', 'GlossList', 'GlossEntry', 'GlossDef'],
['glossary', 'GlossDiv', 'GlossList', 'GlossEntry', 'GlossDef', 'para'],
['glossary', 'GlossDiv', 'GlossList', 'GlossEntry', 'GlossDef', 'GlossSeeAlso'],
['glossary', 'GlossDiv', 'GlossList', 'GlossEntry', 'GlossDef', 'GlossSeeAlso', 0],
['glossary', 'GlossDiv', 'GlossList', 'GlossEntry', 'GlossDef', 'GlossSeeAlso', 1],
['glossary', 'GlossDiv', 'GlossList', 'GlossEntry', 'GlossSee']]
You can feed any of these paths to the functools.reduce()/operator.getitem() combination to get the value it points to.
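As a rough illustration of that last step, the sketch below looks up one value from a path. The helper name get_by_path and the trimmed-down sample dictionary are made up for this example and are not part of the original answer; the only real calls used are functools.reduce() and operator.getitem().

```python
from functools import reduce
import operator

def get_by_path(root, path):
    """Follow each key or index in `path`, starting from `root` (hypothetical helper)."""
    # The third argument to reduce() is the starting value, so the lookup
    # begins at the root object; an empty path returns the root itself.
    return reduce(operator.getitem, path, root)

# Trimmed-down stand-in for the glossary data, just enough for one lookup.
sample = {
    "glossary": {
        "GlossDiv": {
            "GlossList": {
                "GlossEntry": {
                    "GlossDef": {"GlossSeeAlso": ["GML", "XML"]}
                }
            }
        }
    }
}

path = ['glossary', 'GlossDiv', 'GlossList', 'GlossEntry', 'GlossDef', 'GlossSeeAlso', 1]
print(get_by_path(sample, path))  # XML
```

Passing the root object as the initial value to reduce() is what was missing from the attempt in the question, which handed reduce() only the data and no path to walk.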
https://stackoverflow.com/questions/-100005684