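I have been looking at the answers to this question: How can I select deeply nested key:values from dictionary in python
My problem, however, is not finding a single key in a deeply nested data structure, but finding every occurrence of a particular key.
For example, suppose we modify the data structure from the first example there: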
[ "stats":{ "success": true, "payload": { "tag": { "slug": "python", "name": "Python", "postCount": 10590, "virtuals": { "isFollowing": false } }, "metadata": { "followerCount": 18053, "postCount": 10590, "coverImage": { "id": "1*O3-jbieSsxcQFkrTLp-1zw.gif", "originalWidth": 550, "originalHeight": 300 } } } }, "stats": { "success": true, "payload": { "tag": { "slug": "python", "name": "Python", "postCount": 10590, "virtuals": { "isFollowing": false } }, "metadata": { "followerCount": 18053, "postCount": 10590, "coverImage": { "id": "1*O3-jbieSsxcQFkrTLp-1zw.gif", "originalWidth": 550, "originalHeight": 300 } } } } ]
How can I get every occurrence of "metadata" here?
Answered 2018-02-10 10:23:54
How about a recursive approach?
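def extractVals(obj, key, resList):
    # Append every value stored under `key`, at any nesting depth, to resList.
    if isinstance(obj, dict):
        if key in obj:
            resList.append(obj[key])
        # Keep descending: the key may also occur deeper inside the values.
        for k, v in obj.items():
            extractVals(v, key, resList)
    elif isinstance(obj, list):
        for item in obj:
            extractVals(item, key, resList)

# `dat` is the corrected data structure from the question.
resultList1 = []
extractVals(dat, 'metadata', resultList1)
print(resultList1)

This yields: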
[{'coverImage': {'id': '1*O3-jbieSsxcQFkrTLp-1zw.gif',
                 'originalHeight': 300,
                 'originalWidth': 550},
  'followerCount': 18053,
  'postCount': 10590},
 {'coverImage': {'id': '1*O3-jbieSsxcQFkrTLp-1zw.gif',
                 'originalHeight': 300,
                 'originalWidth': 550},
  'followerCount': 18053,
  'postCount': 10590}]

I also had to modify the dataset above slightly to make it a valid Python structure: true -> True, false -> False, and I removed the "stats" keys from the top-level list.
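As a rough sketch of the same idea, the data can instead be repaired as JSON and parsed with `json.loads`, which takes care of the true/false conversion. The trimmed sample data, the choice to wrap each "stats" entry in its own object, and the helper name `iter_key_values` are all assumptions made for illustration, not part of the original answer.

```python
import json

# Assumed repair of the question's snippet: each "stats" entry is wrapped
# in its own object. The data is trimmed for brevity.
raw = """
[
  {"stats": {"success": true, "payload": {
      "tag": {"slug": "python", "postCount": 10590},
      "metadata": {"followerCount": 18053, "postCount": 10590}}}},
  {"stats": {"success": true, "payload": {
      "tag": {"slug": "python", "postCount": 10590},
      "metadata": {"followerCount": 18053, "postCount": 10590}}}}
]
"""

dat = json.loads(raw)  # JSON true/false become Python True/False


def iter_key_values(obj, key):
    """Yield every value stored under `key`, at any depth (hypothetical helper)."""
    if isinstance(obj, dict):
        if key in obj:
            yield obj[key]
        for value in obj.values():
            yield from iter_key_values(value, key)
    elif isinstance(obj, list):
        for item in obj:
            yield from iter_key_values(item, key)


found = list(iter_key_values(dat, "metadata"))
print(found)
```

A generator avoids passing a result list around, and the caller can stop early if only the first few matches are needed.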
Answered 2018-02-10 10:53:47
You can use a custom class like this:
class DeepDict:
    def __init__(self, data):
        self.data = data

    @classmethod
    def _deep_find(cls, data, key, root, response):
        # Builds dotted paths such as "0.stats.payload.metadata".
        if root:
            root += "."
        if isinstance(data, list):
            for i, item in enumerate(data):
                cls._deep_find(item, key, root + str(i), response)
        elif isinstance(data, dict):
            if key in data:
                response.append(root + key)
            for data_key, value in data.items():
                cls._deep_find(value, key, root + data_key, response)
        return response

    def deep_find(self, key):
        """ Returns all occurrences of `key` with a dotted path leading to each.
        Use `deep_get` to retrieve the value for a given occurrence, or
        `get_all` to iterate over the values for each occurrence of the key.
        """
        return self._deep_find(self.data, key, root="", response=[])

    @classmethod
    def _deep_get(cls, data, path):
        if not path:
            return data
        index = path.pop(0)
        # All-digit path components are treated as list indexes.
        if index.isdigit():
            index = int(index)
        return cls._deep_get(data[index], path)

    def deep_get(self, path):
        if isinstance(path, str):
            path = path.split(".")
        # Pass a copy so a caller-supplied list is not consumed.
        return self._deep_get(self.data, list(path))

    def get_all(self, key):
        for path in self.deep_find(key):
            yield self.deep_get(path)

    def __getitem__(self, key):
        # Accept both integer indexes and digit strings such as "0".
        if isinstance(key, str) and key.isdigit():
            key = int(key)
        return self.data[key]

(Note that although I named it "DeepDict", it is really a generic JSON container that can have either a list or a dict as its outermost element. By the way, the JSON snippet in your question is broken: both "stats": keys should each be wrapped in an extra { }.)
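So these three custom methods let you find the exact "path" to each occurrence of a key, or you can use the get_all method as an iterator to simply get the contents of every key with that name in the structure.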
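Dotted string paths can become ambiguous when a dictionary key itself contains a "." or consists only of digits. One possible variation, sketched here under that assumption, represents each path as a tuple of keys and indexes instead. The names `iter_paths` and `get_by_path` and the sample data are hypothetical and not part of the class in this answer.

```python
def iter_paths(obj, key, path=()):
    """Yield (path, value) for every occurrence of `key` (hypothetical helper).

    Each path is a tuple of dict keys and list indexes, so keys that contain
    dots or digits need no special handling.
    """
    if isinstance(obj, dict):
        for k, v in obj.items():
            if k == key:
                yield path + (k,), v
            yield from iter_paths(v, key, path + (k,))
    elif isinstance(obj, list):
        for i, item in enumerate(obj):
            yield from iter_paths(item, key, path + (i,))


def get_by_path(obj, path):
    """Follow a tuple path produced by iter_paths back to its value."""
    for step in path:
        obj = obj[step]
    return obj


# Trimmed sample data, assumed for illustration only.
sample = [
    {"stats": {"payload": {"metadata": {"followerCount": 18053}}}},
    {"stats": {"payload": {"metadata": {"followerCount": 18053}}}},
]

paths = [p for p, _ in iter_paths(sample, "metadata")]
print(paths)
print(get_by_path(sample, paths[0]))
```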
Using the DeepDict class, after fixing your data, I did:
data = DeepDict(<data structure above (fixed)>)
list(data.get_all("metadata"))

and got as output:
[{'coverImage': {'id': '1*O3-jbieSsxcQFkrTLp-1zw.gif',
                 'originalHeight': 300,
                 'originalWidth': 550},
  'followerCount': 18053,
  'postCount': 10590},
 {'coverImage': {'id': '1*O3-jbieSsxcQFkrTLp-1zw.gif',
                 'originalHeight': 300,
                 'originalWidth': 550},
  'followerCount': 18053,
  'postCount': 10590}]

https://stackoverflow.com/questions/48716506