I am unmarshalling into a struct that has a time.Time field named Foo:
type AStructWithTime struct {
Foo time.Time `json:"foo"`
}
My expectation is that, after unmarshalling, I get something like this:
var expectedStruct = AStructWithTime{
Foo: time.Date(2022, 9, 26, 21, 0, 0, 0, time.UTC),
}
Working example 1: Unmarshalling JSON objects into structs
When using plain JSON strings, it works fine:
func Test_Unmarshalling_DateTime_From_String(t *testing.T) {
jsonStrings := []string{
"{\"foo\": \"2022-09-26T21:00:00Z\"}", // trailing Z = UTC offset
"{\"foo\": \"2022-09-26T21:00:00+00:00\"}", // explicit zero offset
"{\"foo\": \"2022-09-26T21:00:00\u002b00:00\"}", // \u002b is an escaped '+'
}
for _, jsonString := range jsonStrings {
var deserializedStruct AStructWithTime
err := json.Unmarshal([]byte(jsonString), &deserializedStruct)
if err != nil {
t.Fatalf("Could not unmarshal '%s': %v", jsonString, err) // doesn't happen
}
if deserializedStruct.Foo.Unix() != expectedStruct.Foo.Unix() {
t.Fatal("Unmarshalling is erroneous") // doesn't happen
}
// works; no errors
}
}
Working example 2: Unmarshalling a JSON array into a slice
If I unmarshal the same objects from a JSON array into a slice, it also works:
func Test_Unmarshalling_DateTime_From_Array(t *testing.T) {
// these are just the same objects as above, just all in one array instead of as single objects/dicts
jsonArrayString := "[{\"foo\": \"2022-09-26T21:00:00Z\"},{\"foo\": \"2022-09-26T21:00:00+00:00\"},{\"foo\": \"2022-09-26T21:00:00\u002b00:00\"}]"
var slice []AStructWithTime // and now I need to unmarshal into a slice
unmarshalErr := json.Unmarshal([]byte(jsonArrayString), &slice)
if unmarshalErr != nil {
t.Fatalf("Could not unmarshal array: %v", unmarshalErr)
}
for index, instance := range slice {
if instance.Foo.Unix() != expectedStruct.Foo.Unix() {
t.Fatalf("Unmarshalling failed for index %v: Expected %v but got %v", index, expectedStruct.Foo, instance.Foo)
}
}
// works; no errors
}
Non-working example
Now I do the same unmarshalling with JSON read from a file "test.json". Its content is the array from the working example above:
[
{
"foo": "2022-09-26T21:00:00Z"
},
{
"foo": "2022-09-26T21:00:00+00:00"
},
{
"foo": "2022-09-26T21:00:00\u002b00:00"
}
]
The code is:
func Test_Unmarshalling_DateTime_From_File(t *testing.T) {
fileName := "test.json"
fileContent, readErr := os.ReadFile(filepath.FromSlash(fileName))
if readErr != nil {
t.Fatalf("Could not read file %s: %v", fileName, readErr)
}
if fileContent == nil {
t.Fatalf("File %s must not be empty", fileName)
}
var slice []AStructWithTime
unmarshalErr := json.Unmarshal(fileContent, &slice)
if unmarshalErr != nil {
// ERROR HAPPENS HERE
// Could not unmarshal file content test.json: parsing time "\"2022-09-26T21:00:00\\u002b00:00\"" as "\"2006-01-02T15:04:05Z07:00\"": cannot parse "\\u002b00:00\"" as "Z07:00"
t.Fatalf("Could not unmarshal file content %s: %v", fileName, unmarshalErr)
}
for index, instance := range slice {
if instance.Foo.Unix() != expectedStruct.Foo.Unix() {
t.Fatalf("Unmarshalling failed for index %v in file %s. Expected %v but got %v", index, fileName, expectedStruct.Foo, instance.Foo)
}
}
}
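It fails because of the escaped '+':
parsing time "\"2022-09-26T21:00:00\\u002b00:00\"" as "\"2006-01-02T15:04:05Z07:00\"": cannot parse "\\u002b00:00\"" as "Z07:00"
Question: Why does unmarshalling the time.Time field fail when the JSON is read from a file, but succeed when the same JSON is read from an identical string?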
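Posted on 2022-09-27 01:52:45
I believe this is a bug in encoding/json.
Both the JSON grammar at https://www.json.org and the IETF definition of JSON in RFC 8259, Section 7: Strings, provide that a JSON string may contain Unicode escape sequences: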
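7. Strings

The representation of strings is similar to conventions used in the C family of programming languages. A string begins and ends with quotation marks. All Unicode characters may be placed within the quotation marks, except for the characters that MUST be escaped: quotation mark, reverse solidus, and the control characters (U+0000 through U+001F).

Any character may be escaped. If the character is in the Basic Multilingual Plane (U+0000 through U+FFFF), then it may be represented as a six-character sequence: a reverse solidus, followed by the lowercase letter u, followed by four hexadecimal digits that encode the character's code point. The hexadecimal letters A through F can be uppercase or lowercase. So, for example, a string containing only a single reverse solidus character may be represented as "\u005C".

...

To escape an extended character that is not in the Basic Multilingual Plane, the character is represented as a 12-character sequence, encoding the UTF-16 surrogate pair. So, for example, a string containing only the G clef character (U+1D11E) may be represented as "\uD834\uDD1E".

    string = quotation-mark *char quotation-mark

    char = unescaped /
        escape (
            %x22 /          ; "    quotation mark  U+0022
            %x5C /          ; \    reverse solidus U+005C
            %x2F /          ; /    solidus         U+002F
            %x62 /          ; b    backspace       U+0008
            %x66 /          ; f    form feed       U+000C
            %x6E /          ; n    line feed       U+000A
            %x72 /          ; r    carriage return U+000D
            %x74 /          ; t    tab             U+0009
            %x75 4HEXDIG )  ; uXXXX                U+XXXX

    escape = %x5C              ; \

    quotation-mark = %x22      ; "

    unescaped = %x20-21 / %x23-5B / %x5D-10FFFF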
The JSON document from the original post
{
"foo": "2022-09-26T21:00:00\u002b00:00"
}
parses and deserializes without any problem in Node.js using JSON.parse().
Here is an example that demonstrates the bug:
package main
import (
"encoding/json"
"fmt"
"time"
)
var document []byte = []byte(`
{
"value": "2022-09-26T21:00:00\u002b00:00"
}
`)
func main() {
deserializeJsonAsTime()
deserializeJsonAsString()
}
func deserializeJsonAsTime() {
fmt.Println("")
fmt.Println("Deserializing JSON as time.Time ...")
type Widget struct {
Value time.Time `json:"value"` // no space after "json:", otherwise the tag is malformed
}
expected := Widget{
Value: time.Date(2022, 9, 26, 21, 0, 0, 0, time.UTC),
}
actual := Widget{}
err := json.Unmarshal(document, &actual)
switch {
case err != nil:
fmt.Println("Error deserializing JSON as time.Time")
fmt.Println(err)
case !actual.Value.Equal(expected.Value): // Equal compares instants and ignores the Location
fmt.Printf("Unmarshalling failed: expected %v but got %v\n", expected.Value, actual.Value)
default:
fmt.Println("Sucess")
}
}
func deserializeJsonAsString() {
fmt.Println("")
fmt.Println("Deserializing JSON as string ...")
type Widget struct {
Value string `json:"value"` // no space after "json:", otherwise the tag is malformed
}
expected := Widget{
Value: "2022-09-26T21:00:00+00:00",
}
actual := Widget{}
err := json.Unmarshal(document, &actual)
switch {
case err != nil:
fmt.Println("Error deserializing JSON as string")
fmt.Println(err)
case actual.Value != expected.Value:
fmt.Printf("Unmarshalling failed: expected %v but got %v\n", expected.Value, actual.Value)
default:
fmt.Println("Sucess")
}
}
When run (see https://goplay.tools/snippet/fHQQVJ8GfPp), we get:
Deserializing JSON as time.Time ...
Error deserializing JSON as time.Time
parsing time "\"2022-09-26T21:00:00\\u002b00:00\"" as "\"2006-01-02T15:04:05Z07:00\"": cannot parse "\\u002b00:00\"" as "Z07:00"
Deserializing JSON as string ...
Sucess
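Deserializing a JSON string that contains a Unicode escape sequence into a string yields the correct, expected result: the escape sequence is converted to the expected rune/byte sequence. The problem therefore appears to lie in the code that handles deserialization into time.Time. It does not seem to decode the JSON string into a Go string first and then parse that string value as a time.Time.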
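This also suggests why the in-code examples in the question passed while the file did not. In a double-quoted (interpreted) Go string literal, the Go compiler itself resolves \u002b to '+', so the JSON handed to json.Unmarshal never contains an escape. A back-quoted (raw) literal, like the bytes read from test.json, keeps the escape verbatim. The sketch below illustrates the difference; the names interpreted, raw, hasJSONEscape and decodeFoo are made up for this illustration and are not from the original posts.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// interpreted is a double-quoted Go literal: the compiler turns \u002b into
// '+' at compile time, so the resulting JSON text contains no escape.
const interpreted = "{\"foo\": \"2022-09-26T21:00:00\u002b00:00\"}"

// raw is a back-quoted Go literal: the six characters \u002b are preserved,
// which matches the bytes os.ReadFile would return for test.json.
const raw = `{"foo": "2022-09-26T21:00:00\u002b00:00"}`

// hasJSONEscape reports whether the JSON text still contains the literal
// escape sequence \u002b.
func hasJSONEscape(doc string) bool {
	return strings.Contains(doc, `\u002b`)
}

// decodeFoo unmarshals the "foo" member into a plain string, a path on
// which encoding/json resolves JSON escapes.
func decodeFoo(doc string) (string, error) {
	var v struct {
		Foo string `json:"foo"`
	}
	err := json.Unmarshal([]byte(doc), &v)
	return v.Foo, err
}

func main() {
	fmt.Println("interpreted literal still escaped:", hasJSONEscape(interpreted))
	fmt.Println("raw literal still escaped:        ", hasJSONEscape(raw))

	for _, doc := range []string{interpreted, raw} {
		foo, err := decodeFoo(doc)
		fmt.Println(foo, err)
	}
}
```

To reproduce the file behaviour inside a test without a file, the JSON would have to be written as a raw literal, or the backslash doubled in an interpreted literal.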
Posted on 2022-09-29 11:37:26
As Brits pointed out, this is the issue "time: UnmarshalJSON does not respect escaped unicode characters". When calling json.Unmarshal on the string {"value": "2022-09-26T21:00:00\u002b00:00"}, there are two errors to resolve:
JSON fails when escaping '+' as '\u002b'
- Solution: Converting escaped unicode to utf8 through `strconv.Unquote`
cannot parse "\\u002b00:00\"" as "Z07:00"
- Solution: parse time with this format `"2006-01-02T15:04:05-07:00"`
- [`stdNumColonTZ // "-07:00"`](https://github.com/golang/go/blob/master/src/time/format.go#L157) from `src/time/format.go`
- If you want to parse TimeZone from it, `time.ParseInLocation` could be used.
To make this work with json.Unmarshal, we can define a new type, utf8Time:
type utf8Time struct {
time.Time
}
func (t *utf8Time) UnmarshalJSON(data []byte) error {
str, err := strconv.Unquote(string(data))
if err != nil {
return err
}
tmpT, err := time.Parse("2006-01-02T15:04:05-07:00", str)
if err != nil {
return err
}
*t = utf8Time{tmpT}
return nil
}
func (t utf8Time) String() string {
return t.Format("2006-01-02 15:04:05.999999999 -0700 MST")
}
Then call json.Unmarshal:
type MyDoc struct {
Value utf8Time `json:"value"`
}
var document = []byte(`{"value": "2022-09-26T21:00:00\u002b00:00"}`)
func main() {
var mydoc MyDoc
err := json.Unmarshal(document, &mydoc)
if err != nil {
fmt.Println(err)
}
fmt.Println(mydoc.Value)
}
Output:
2022-09-26 21:00:00 +0000 +0000
https://stackoverflow.com/questions/73860458