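I'm trying to download a JSON file from an API and convert it to a CSV file, but the script throws the error below when parsing the JSON file.
After every 100 records, the JSON file closes with "]" and starts another "[". This is not accepted as valid JSON. Can you suggest how I can parse the "]" and "[" that appear every 100 records in an efficient way? The code works fine for fewer than 100 records, where there are no extra [] brackets.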
Error message:
raise JSONDecodeError("Extra data", s, end)
json.decoder.JSONDecodeError: Extra data
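The error can be reproduced with a very small input. The following is a minimal sketch, assuming the file really is two arrays written back to back; the `sample` string is made up for illustration and is not the real API response.

```python
import json

# Two JSON arrays back to back, like a file that closes "]" and
# opens a new "[" after every 100 records (illustrative data only).
sample = '[{"A": "5"}][{"A": "6"}]'

try:
    json.loads(sample)
    message = "parsed without error"
except json.JSONDecodeError as exc:
    # The parser finishes the first array, then finds more text after it.
    message = exc.msg

print(message)
```

The parser accepts exactly one top-level value, so anything after the first closing "]" is reported as extra data.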
JSON file format:
**[**
{
"A": "5",
"B": "811",
"C": [
{ "C1": 1,
"C2": "sa",
"C3": 3
}
],
"D": "HH",
"E": 0,
"F": 6
},
{
"A": "5",
"B": "811",
"C": [
{ "C1": 1,
"C2": "fa",
"C3": 3
}
],
"D": "HH",
"E": 0,
"F": 6
}
**]**
**[**
{
"A": "5",
"B": "811",
"C": [
{ "C1": 1,
"C2": "da",
"C3": 3
}
],
"D": "HH",
"E": 0,
"F": 6
}
**]**
Code:
import json
import pandas as pd
from flatten_json import flatten

def json2excel():
    file_path = r"<local file path>"
    json_list = json.load(open(file_path + '.json', 'r', encoding='utf-8', errors='ignore'))
    key_list = ['A', 'B']
    json_list = [{k: d[k] for k in key_list} for d in json_list]
    # Flatten and convert to a data frame
    json_list_flattened = (flatten(d, '.') for d in json_list)
    df = pd.DataFrame(json_list_flattened)
    # Export to CSV in the same directory with the original file name
    export_csv = df.to_csv(file_path + r'.csv', sep=',', encoding='utf-8', index=None, header=True)

def main():
    json2excel()
Posted on 2021-02-17 08:25:57
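I suggest preprocessing the data received from the API first. The preprocessed data can then be fed to the JSON parser.
I came up with some simple Python code that is just a small tweak to the solution for the parentheses-matching problem. Here is my working code, which you can use to preprocess the data.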
def build_json_items(custom_json):
    open_tup = tuple('({[')
    close_tup = tuple(')}]')
    pairs = dict(zip(open_tup, close_tup))  # opening bracket -> expected closing bracket
    queue = []  # used as a stack of expected closing brackets
    json_items = []
    temp = ""
    for i in custom_json:
        if i in open_tup:
            queue.append(pairs[i])
        elif i in close_tup:
            if not queue or i != queue.pop():
                # Unbalanced input: return the same (flag, items) shape as below
                return False, json_items
        elif not queue:
            # Skip whitespace or other characters between top-level items
            continue
        temp = temp + i
        if not queue:
            # We have reached a point where everything so far is balanced.
            # This is the point where we can separate out the expression
            json_items.append(temp)
            temp = ""  # Re-initialize
    # Note: brackets inside string values are not handled by this simple matcher
    # The flag is True if the provided string is balanced
    return not queue, json_items
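The build_json_items function takes your custom JSON payload and splits it into individual valid JSON items, based on the information you provided in the question. Below is an example of how to call this function. You can use the following code.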
input_data = "[{\"A\":\"5\",\"B\":\"811\",\"C\":[{\"C1\":1,\"C2\":\"sa\",\"C3\":3}],\"D\":\"HH\",\"E\":0,\"F\":6},{\"A\":\"5\",\"B\":\"811\",\"C\":[{\"C1\":1,\"C2\":\"fa\",\"C3\":3}],\"D\":\"HH\",\"E\":0,\"F\":6}][{\"A\":\"5\",\"B\":\"811\",\"C\":[{\"C1\":1,\"C2\":\"da\",\"C3\":3}],\"D\":\"HH\",\"E\":0,\"F\":6}]"
is_balanced, json_items = build_json_items(input_data)
print(f"Available JSON items: {len(json_items)}")
print("JSON items are the following")
for i in json_items:
    print(i)
Here is the output of the print statements.
Available JSON items: 2
JSON items are the following
[{"A":"5","B":"811","C":[{"C1":1,"C2":"sa","C3":3}],"D":"HH","E":0,"F":6},{"A":"5","B":"811","C":[{"C1":1,"C2":"fa","C3":3}],"D":"HH","E":0,"F":6}]
[{"A":"5","B":"811","C":[{"C1":1,"C2":"da","C3":3}],"D":"HH","E":0,"F":6}]
You can run this code directly to see the output.
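As an alternative to matching brackets by hand, the standard library's `json.JSONDecoder.raw_decode` can decode one JSON value and report where it ended, which also copes with brackets inside string values. This is a different technique from the bracket matcher above, shown only as a sketch; the helper name `split_concatenated_json` and the sample text are assumptions for illustration.

```python
import json

def split_concatenated_json(text):
    """Decode back-to-back JSON documents and return them as a list."""
    decoder = json.JSONDecoder()
    values = []
    pos = 0
    length = len(text)
    while pos < length:
        # raw_decode does not accept leading whitespace, so skip it here.
        while pos < length and text[pos].isspace():
            pos += 1
        if pos >= length:
            break
        # Returns the decoded value and the index just past it.
        value, pos = decoder.raw_decode(text, pos)
        values.append(value)
    return values

# Illustrative data: two arrays separated by a newline, with a "]" inside a string.
chunks = split_concatenated_json('[{"C2": "s]a"}, {"C2": "fa"}]\n[{"C2": "da"}]')
records = [record for chunk in chunks for record in chunk]
print(len(chunks), len(records))
```

Because the values come back already parsed, the separate parsing step is not needed with this approach.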
Once the payloads have been separated into valid JSON structures, you can feed them to the JSON parser.
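One way that last step might look is sketched below. It assumes the items are strings like the ones printed above, keeps only the "A" and "B" keys as in the question, and uses the standard library's `csv` module in place of pandas and flatten_json so that it runs without extra packages. The sample data is shortened for illustration.

```python
import csv
import io
import json

# Strings as produced by the splitting step (shortened, illustrative data).
json_items = [
    '[{"A": "5", "B": "811", "D": "HH"}, {"A": "5", "B": "811", "D": "HH"}]',
    '[{"A": "5", "B": "811", "D": "HH"}]',
]
key_list = ["A", "B"]  # columns to keep, as in the question

# Parse each chunk and merge all records into one flat list.
records = [record for item in json_items for record in json.loads(item)]
rows = [{k: record[k] for k in key_list} for record in records]

# Write to an in-memory buffer; for a real file, use open(path, "w", newline="").
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=key_list, lineterminator="\n")
writer.writeheader()
writer.writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)
```

The same merged `records` list could instead be passed to the flattening and `pd.DataFrame` steps from the question.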
https://stackoverflow.com/questions/66180343