A TCP sliding-window bug: messages piling up on the server

Problem:

The client could not push data to the server.

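Investigation:

  • ping to the server IP and telnet to the port both succeeded, so basic connectivity checks revealed nothing.
  • A packet capture inspected in Wireshark showed something odd: the advertised window was not fixed, and although it fluctuated, the overall trend was downward until it reached 0. The packets seen on the server side: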
15:41:29.680256 IP 110.89.84.123.1950 > 110.89.84.126.52021: Flags [.], ack 107925, win 38, options [nop,nop,TS val 1604471956 ecr 1606303303], length 0
	0x0000:  0022 462c a12f d067 e50f e893 0800 4500  ."F,./.g......E.
	0x0010:  0034 a79b 4000 4006 d417 0b0c 547b 0b0c  .4..@.@.....T{..
	0x0020:  547e 079e cb35 0c6f 535c 531b 640c 8010  T~...5.oS\S.d...
	0x0030:  0026 8383 0000 0101 080a 5fa2 4c94 5fbe  .&........_.L._.
	0x0040:  3e47                                     >G
15:41:29.719474 IP 110.89.84.123.1950 > 110.89.84.126.52021: Flags [.], ack 112269, win 5, options [nop,nop,TS val 1604471996 ecr 1606303303], length 0
	0x0000:  0022 462c a12f d067 e50f e893 0800 4500  ."F,./.g......E.
	0x0010:  0034 a79c 4000 4006 d416 0b0c 547b 0b0c  .4..@.@.....T{..
	0x0020:  547e 079e cb35 0c6f 535c 531b 7504 8010  T~...5.oS\S.u...
	0x0030:  0005 7284 0000 0101 080a 5fa2 4cbc 5fbe  ..r......._.L._.
	0x0040:  3e47                                     >G
15:41:29.934875 IP 110.89.84.126.52021 > 110.89.84.123.1950: Flags [P.], seq 112269:112909, ack 88, win 115, options [nop,nop,TS val 1606303559 ecr 1604471996], length 640
	0x0000:  d067 e50f e893 0022 462c a12f 0800 4500  .g....."F,./..E.
	0x0010:  02b4 5a89 4000 4006 1eaa 0b0c 547e 0b0c  ..Z.@.@.....T~..
	0x0020:  547b cb35 079e 531b 7504 0c6f 535c 8018  T{.5..S.u..oS\..
	0x0030:  0073 c1b7 0000 0101 080a 5fbe 3f47 5fa2  .s........_.?G_.
15:41:29.975487 IP 110.89.84.123.1950 > 110.89.84.126.52021: Flags [.], ack 116613, win 10, options [nop,nop,TS val 1604472252 ecr 1606303559], length 0
	0x0000:  0022 462c a12f d067 e50f e893 0800 4500  ."F,./.g......E.
	0x0010:  0034 a79e 4000 4006 d414 0b0c 547b 0b0c  .4..@.@.....T{..
	0x0020:  547e 079e cb35 0c6f 535c 531b 85fc 8010  T~...5.oS\S.....
	0x0030:  000a 5f87 0000 0101 080a 5fa2 4dbc 5fbe  .._......._.M._.
	0x0040:  3f47                                     ?G
15:41:30.191875 IP 110.89.84.126.52021 > 110.89.84.123.1950: Flags [P.], seq 116613:117893, ack 88, win 115, options [nop,nop,TS val 1606303816 ecr 1604472252], length 1280
	0x0000:  d067 e50f e893 0022 462c a12f 0800 4500  .g....."F,./..E.
	0x0010:  0534 5a8d 4000 4006 1c26 0b0c 547e 0b0c  .4Z.@.@..&..T~..
	0x0020:  547b cb35 079e 531b 85fc 0c6f 535c 8018  T{.5..S....oS\..
	0x0030:  0073 c437 0000 0101 080a 5fbe 4048 5fa2  .s.7......_.@H_.
	0x0040:  4dbc 2037 3435 6634 3361 3238 3334 6534  M..745f43a2834e4
	0x0050:  6465 3462 3561 3862 6630 3031 3333 6564  de4b5a8bf00133ed
	0x0060:  6462 3401 0d01 0400 0000 5308 0b10 0000  db4.......S.....
15:41:30.192523 IP 110.89.84.123.1950 > 110.89.84.126.52021: Flags [.], ack 117893, win 0, options [nop,nop,TS val 1604472469 ecr 1606303816], length 0
	0x0000:  0022 462c a12f d067 e50f e893 0800 4500  ."F,./.g......E.
	0x0010:  0034 a79f 4000 4006 d413 0b0c 547b 0b0c  .4..@.@.....T{..
	0x0020:  547e 079e cb35 0c6f 535c 531b 8afc 8010  T~...5.oS\S.....
	0x0030:  0000 58b7 0000 0101 080a 5fa2 4e95 5fbe  ..X......._.N._.
	0x0040:  4048                                     @H
15:41:30.406872 IP 110.89.84.126.52021 > 110.89.84.123.1950: Flags [.], ack 88, win 115, options [nop,nop,TS val 1606304031 ecr 1604472469], length 0
	0x0000:  d067 e50f e893 0022 462c a12f 0800 4500  .g....."F,./..E.
	0x0010:  0034 5a8e 4000 4006 2125 0b0c 547e 0b0c  .4Z.@.@.!%..T~..
	0x0020:  547b cb35 079e 531b 8afb 0c6f 535c 8010  T{.5..S....oS\..
	0x0030:  0073 bf37 0000 0101 080a 5fbe 411f 5fa2  .s.7......_.A._.
	0x0040:  4e95                                     N.
15:41:30.407143 IP 110.89.84.123.1950 > 110.89.84.126.52021: Flags [.], ack 117893, win 0, options [nop,nop,TS val 1604472683 ecr 1606303816], length 0
	0x0000:  0022 462c a12f d067 e50f e893 0800 4500  ."F,./.g......E.
	0x0010:  0034 a7a0 4000 4006 d412 0b0c 547b 0b0c  .4..@.@.....T{..
	0x0020:  547e 079e cb35 0c6f 535c 531b 8afc 8010  T~...5.oS\S.....
	0x0030:  0000 57e1 0000 0101 080a 5fa2 4f6b 5fbe  ..W......._.Ok_.
	0x0040:  4048                                     @H
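  • From this point on the server kept advertising a receive window of 0, so the client could not send any more data to the server.
  • Checking the server's kernel TCP send and receive queues with netstat -ant showed that the connection's receive queue (Recv-Q) held 115005 bytes that had arrived but were not being read by the application.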
[root@xdja tomcat]# netstat -ant
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address               Foreign Address             State 
tcp        0      0 110.89.84.123:14468          110.89.84.33:1950            ESTABLISHED 
tcp        0      0 :::1950                     :::*                        LISTEN      
tcp   115005      0 ::ffff:110.89.84.123:1950    ::ffff:110.89.84.126:52021   ESTABLISHED 
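To make the trend in the capture easier to see, the window values the server advertised can be pulled out of the packet header lines. The following is a minimal sketch, not part of the original investigation: the function name `advertised_windows` and the regular expression are illustrative, and `CAPTURE` simply repeats the header lines from the dump above. The `win` field is treated as the raw value printed in the dump; the real byte count also depends on the window scale negotiated in the handshake, which this excerpt does not show.

```python
import re

# Header lines copied from the capture above (hex payload lines omitted).
CAPTURE = """\
15:41:29.680256 IP 110.89.84.123.1950 > 110.89.84.126.52021: Flags [.], ack 107925, win 38, options [nop,nop,TS val 1604471956 ecr 1606303303], length 0
15:41:29.719474 IP 110.89.84.123.1950 > 110.89.84.126.52021: Flags [.], ack 112269, win 5, options [nop,nop,TS val 1604471996 ecr 1606303303], length 0
15:41:29.934875 IP 110.89.84.126.52021 > 110.89.84.123.1950: Flags [P.], seq 112269:112909, ack 88, win 115, options [nop,nop,TS val 1606303559 ecr 1604471996], length 640
15:41:29.975487 IP 110.89.84.123.1950 > 110.89.84.126.52021: Flags [.], ack 116613, win 10, options [nop,nop,TS val 1604472252 ecr 1606303559], length 0
15:41:30.191875 IP 110.89.84.126.52021 > 110.89.84.123.1950: Flags [P.], seq 116613:117893, ack 88, win 115, options [nop,nop,TS val 1606303816 ecr 1604472252], length 1280
15:41:30.192523 IP 110.89.84.123.1950 > 110.89.84.126.52021: Flags [.], ack 117893, win 0, options [nop,nop,TS val 1604472469 ecr 1606303816], length 0
15:41:30.406872 IP 110.89.84.126.52021 > 110.89.84.123.1950: Flags [.], ack 88, win 115, options [nop,nop,TS val 1606304031 ecr 1604472469], length 0
15:41:30.407143 IP 110.89.84.123.1950 > 110.89.84.126.52021: Flags [.], ack 117893, win 0, options [nop,nop,TS val 1604472683 ecr 1606303816], length 0
"""

# timestamp, source endpoint, destination endpoint, raw window field
HEADER = re.compile(r"^(\S+) IP (\S+) > (\S+): .*\bwin (\d+),")


def advertised_windows(capture, sender):
    """Return (timestamp, window) pairs for packets sent by `sender`."""
    result = []
    for line in capture.splitlines():
        match = HEADER.match(line)  # hex dump lines start with a tab and do not match
        if match and match.group(2) == sender:
            result.append((match.group(1), int(match.group(4))))
    return result


if __name__ == "__main__":
    for timestamp, window in advertised_windows(CAPTURE, "110.89.84.123.1950"):
        print(timestamp, window)
```

Filtering on the server endpoint (110.89.84.123.1950) leaves only the windows the server advertised, while the client's own window stays at 115 throughout.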

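Conclusion:

The client kept sending data, but the server processed it more slowly than the client sent it, so unread data backed up on the server until the receive window closed.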
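The same effect can be reproduced on a single machine. The sketch below is an illustration under assumptions, not the original system: it opens a loopback connection, accepts it on the "server" side but never reads from it, and lets a non-blocking client send until the kernel refuses more data. The function name `fill_until_blocked` and its parameters are hypothetical, and the number of bytes accepted before blocking depends on the operating system's buffer sizes.

```python
import socket


def fill_until_blocked(chunk_size=65536, max_bytes=256 * 1024 * 1024):
    """Send to a peer that never reads; return (bytes_sent, blocked)."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    client = socket.create_connection(listener.getsockname())
    server_side, _ = listener.accept()  # accepted, but never read from

    client.setblocking(False)
    chunk = b"x" * chunk_size
    sent = 0
    blocked = False
    try:
        while sent < max_bytes:
            try:
                sent += client.send(chunk)
            except BlockingIOError:
                # The peer's receive buffer and our send buffer are full,
                # so the kernel will not accept more data right now.
                blocked = True
                break
    finally:
        client.close()
        server_side.close()
        listener.close()
    return sent, blocked


if __name__ == "__main__":
    sent, blocked = fill_until_blocked()
    print("bytes accepted before blocking:", sent, "blocked:", blocked)
```

A blocking client would simply hang in `send` at the same point, which matches the symptom of a client that cannot push data.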

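Solution:

Change the backend to process messages asynchronously: when a TCP message arrives, buffer it in the application first, then let a separate thread consume it.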
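The idea can be sketched as a reader loop that only moves bytes from the socket into an in-memory queue, plus a worker thread that does the slow processing. This is a minimal sketch under assumptions, not the author's actual code: `serve_connection`, `handle`, and `max_pending` are hypothetical names, and message framing is left out, so each queue item is just a raw chunk returned by `recv`.

```python
import queue
import socket
import threading


def serve_connection(conn, handle, max_pending=1000):
    """Drain `conn` quickly and run `handle` on a separate worker thread."""
    pending = queue.Queue(maxsize=max_pending)

    def worker():
        while True:
            data = pending.get()
            if data is None:  # sentinel: the reader has finished
                break
            handle(data)  # slow business logic runs off the reader loop

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()

    while True:
        data = conn.recv(4096)
        if not data:  # peer closed the connection
            break
        pending.put(data)  # blocks only when max_pending chunks are waiting

    pending.put(None)
    thread.join()


if __name__ == "__main__":
    left, right = socket.socketpair()
    received = []
    left.sendall(b"hello")
    left.close()
    serve_connection(right, received.append)
    right.close()
    print(b"".join(received))
```

The queue here is bounded on purpose. An unbounded queue keeps the kernel buffer empty but moves the backlog into application memory, while a bounded one lets the window close again once the worker falls too far behind. Which trade-off fits depends on the workload.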

