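I am trying to convert a CSV file to JSON, but I can't quite figure out how to handle the special letters of the German alphabet correctly.
This is the result: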
[
    {
        "\ufeffMeter-id": "W000001",
        "Address": "Groninger Stra\u00dfe 22 , 13347 Berlin",
        "January": "",
        "February": "",
        "March": "",
        "April": "",
        "May": "",
        "June": "",
        "July": "",
        "August": "",
        "September": "",
        "October": "",
        "November": "",
        "December": ""
    },
    {
        "\ufeffMeter-id": "G000002",
        "Address": "Oraniendamm 10-6 , 13469 Berlin",
        "January": "767,410.80",
        "February": "784,932.700",
        "March": "797,636.90",
        "April": "812,111.000",
        "May": "819,512.30",
        "June": "820,482.200",
        "July": "820,482.20",
        "August": "820,482.200",
        "September": "820,869.80",
        "October": "826,243.900",
        "November": "834,028.20",
        "December": ""
    },...
It is generated from this CSV:
Meter-id,Address,January,February,March,April,May,June,July,August,September,October,November,December
W000001,"Groninger Straße 22 , 13347 Berlin",,,,,,,,,,,,
G000002,"Oraniendamm 10-6 , 13469 Berlin","767,410.80","784,932.700","797,636.90","812,111.000","819,512.30","820,482.200","820,482.20","820,482.200","820,869.80","826,243.900","834,028.20",
My parsing code looks like this:
import csv
import json

csvfile = '../csv_files/metering-data.csv'
jsonfile = '../json_files/metering-data.json'
jsonArray = []

# convert csv to dict
with open(csvfile, encoding='utf-8') as csvf:
    csvReader = csv.DictReader(csvf)
    for row in csvReader:
        jsonArray.append(row)

# convert dict to json file
with open(jsonfile, 'w', encoding='utf-8') as jsonf:
    jsonString = json.dumps(jsonArray, indent=4)
    jsonf.write(jsonString)
What am I doing wrong here?
Posted on 2022-01-04 16:18:35
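The Unicode character U+00DF is LATIN SMALL LETTER SHARP S: ß. It is correctly represented as \u00df in the JSON file, and any JSON parser will decode that escape back to ß. Your only problem is that the CSV file contains a UTF-8 byte order mark (BOM), which is why the first field name starts with \ufeff. You should use the special utf_8_sig encoding, which strips it automatically: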
...
with open(csvfile, encoding='utf_8_sig') as csvf:
    csvReader = csv.DictReader(csvf)
...
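To make the effect of the two encodings visible without touching the real files, here is a minimal, self-contained sketch. The sample bytes and the helper name read_rows are made up for illustration and are not part of the original code. It also shows json.dumps with ensure_ascii=False, an optional extra that the answer above does not require, in case you would rather see a literal ß in the output file than the equivalent \u00df escape.

```python
import csv
import io
import json

# Hypothetical sample data: a UTF-8 encoded CSV that begins with a BOM (EF BB BF).
raw = b'\xef\xbb\xbfMeter-id,Address\r\nW000001,"Groninger Stra\xc3\x9fe 22 , 13347 Berlin"\r\n'


def read_rows(data: bytes, encoding: str) -> list:
    """Decode the bytes with the given encoding and parse them with DictReader."""
    text = io.StringIO(data.decode(encoding), newline='')
    return list(csv.DictReader(text))


# Plain 'utf-8' keeps the BOM, so it ends up glued to the first header name.
rows_plain = read_rows(raw, 'utf-8')

# 'utf_8_sig' drops a leading BOM if one is present.
rows_sig = read_rows(raw, 'utf_8_sig')

# Default json.dumps escapes non-ASCII characters; both forms are valid JSON
# and decode to the same string.
escaped = json.dumps(rows_sig, indent=4)
readable = json.dumps(rows_sig, indent=4, ensure_ascii=False)

print(list(rows_plain[0].keys()))
print(list(rows_sig[0].keys()))
print(readable)
```

If you go with ensure_ascii=False, keep encoding='utf-8' on the output file as in the question's code, so that the ß is written as UTF-8 regardless of the platform default.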
https://stackoverflow.com/questions/70581717