This has been cross-posted to https://forum.rocketboards.org/t/high-network-latency-after-incoming-udp-bursts/1266
I have an application that sends UDP packets over a raw socket (building the ETH, IP, and UDP headers manually) at up to 1 Gbit/s on a gigabit interface (PHY ID 01410dd1). Target system: Linux 4.11.
The same application receives UDP packets on a UDP socket, a few hundred packets at a time, then another few hundred after some delay, and so on.
This works fine for smaller bursts, but for some reason receiving too many UDP packets at once puts the network interface into a strange state, and I hope someone can explain it to me.
All network connections to the target suddenly have very high latency, and packets appear to be sent only when another packet is received. Pinging the target shows the following:
ping -i 1 <target>
64 bytes from <target>: icmp_seq=143 ttl=64 time=1000 ms
ping -i 3 <target>
64 bytes from <target>: icmp_seq=143 ttl=64 time=3000 ms
In Wireshark:
0.00:00.000 ICMP Echo (ping) request, seq=1/1024, ttl=64
0.00:01.000 ICMP Echo (ping) request, seq=2/1024, ttl=64
0.00:01.001 ICMP Echo (ping) reply, seq=1/1024, ttl=64
0.00:02.000 ICMP Echo (ping) request, seq=3/1024, ttl=64
0.00:02.001 ICMP Echo (ping) reply, seq=2/1024, ttl=64
..
With a ping interval of 3 s:
0.00:00.000 ICMP Echo (ping) request, seq=1/1024, ttl=64
0.00:03.000 ICMP Echo (ping) request, seq=2/1024, ttl=64
0.00:03.001 ICMP Echo (ping) reply, seq=1/1024, ttl=64
0.00:06.000 ICMP Echo (ping) request, seq=3/1024, ttl=64
0.00:06.001 ICMP Echo (ping) reply, seq=2/1024, ttl=64
..
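The reply latency tracks the ping interval exactly (1 s interval gives 1000 ms, 3 s gives 3000 ms), which suggests queued replies are only flushed when the next packet arrives. A quick way to confirm this, assuming the pattern holds for other intervals, is to compare several of them:
# if TX really is gated by RX, the measured RTT should follow the send interval
ping -c 5 -i 0.2 <target>   # expect time around 200 ms per reply
ping -c 5 -i 1 <target>     # expect time around 1000 ms per reply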
Shutting down the application has no effect, but taking the network interface down and back up (ifconfig eth0 down; ifconfig eth0 up) fixes the problem and brings the latency back to normal.
This affects all traffic, so logging in over ssh takes a long time, as does sending files with scp: the transfer takes a while to start, but once running its throughput is the same as usual.
top shows nothing unusual; the system is idle and behaves normally apart from the high network latency.
ifconfig before the high latency:
eth0 Link encap:Ethernet HWaddr xx:xx:xx:xx:xx:xx
inet addr:xxx.xxx.xxx.xxx Bcast:xxx.xxx.xxx.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:3800 Metric:1
RX packets:28450933 errors:0 dropped:0 overruns:0 frame:0
TX packets:312497021 errors:134 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:1991541985 (1.8 GiB) TX bytes:3038468710 (2.8 GiB)
Interrupt:27 Base address:0x6000
ifconfig during the high latency:
eth0 Link encap:Ethernet HWaddr xx:xx:xx:xx:xx:xx
inet addr:xxx.xxx.xxx.xxx Bcast:xxx.xxx.xxx.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:3800 Metric:1
RX packets:28476996 errors:0 dropped:0 overruns:0 frame:0
TX packets:312589025 errors:134 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:1993344757 (1.8 GiB) TX bytes:3333157248 (3.1 GiB)
Interrupt:27 Base address:0x6000
ethtool statistics during the high latency:
mmc_tx_octetcount_gb: 295080166
mmc_tx_framecount_gb: 92109
mmc_tx_broadcastframe_g: 2
mmc_tx_multicastframe_g: 0
mmc_tx_64_octets_gb: 1459
mmc_tx_65_to_127_octets_gb: 313
mmc_tx_128_to_255_octets_gb: 94
mmc_tx_256_to_511_octets_gb: 31
mmc_tx_512_to_1023_octets_gb: 8807
mmc_tx_1024_to_max_octets_gb: 0
mmc_tx_unicast_gb: 92107
mmc_tx_multicast_gb: 0
mmc_tx_broadcast_gb: 2
mmc_tx_underflow_error: 0
mmc_tx_singlecol_g: 0
mmc_tx_multicol_g: 0
mmc_tx_deferred: 0
mmc_tx_latecol: 0
mmc_tx_exesscol: 0
mmc_tx_carrier_error: 0
mmc_tx_octetcount_g: 295080166
mmc_tx_framecount_g: 92109
mmc_tx_excessdef: 0
mmc_tx_pause_frame: 0
mmc_tx_vlan_frame_g: 0
mmc_rx_framecount_gb: 31891
mmc_rx_octetcount_gb: 2341252
mmc_rx_octetcount_g: 2341252
mmc_rx_broadcastframe_g: 26
mmc_rx_multicastframe_g: 0
mmc_rx_crc_error: 0
mmc_rx_align_error: 0
mmc_rx_run_error: 0
mmc_rx_jabber_error: 0
mmc_rx_undersize_g: 0
mmc_rx_oversize_g: 0
mmc_rx_64_octets_gb: 2243
mmc_rx_65_to_127_octets_gb: 29640
mmc_rx_128_to_255_octets_gb: 2
mmc_rx_256_to_511_octets_gb: 6
mmc_rx_512_to_1023_octets_gb: 0
mmc_rx_1024_to_max_octets_gb: 0
mmc_rx_unicast_g: 31865
mmc_rx_length_error: 0
mmc_rx_autofrangetype: 0
mmc_rx_pause_frames: 0
mmc_rx_fifo_overflow: 0
mmc_rx_vlan_frames_gb: 0
mmc_rx_watchdog_error: 0
mmc_rx_ipc_intr_mask: 1073692671
mmc_rx_ipc_intr: 0
mmc_rx_ipv4_gd: 31861
mmc_rx_ipv4_hderr: 0
mmc_rx_ipv4_nopay: 0
mmc_rx_ipv4_frag: 0
mmc_rx_ipv4_udsbl: 0
mmc_rx_ipv4_gd_octets: 1757432
mmc_rx_ipv4_hderr_octets: 0
mmc_rx_ipv4_nopay_octets: 0
mmc_rx_ipv4_frag_octets: 0
mmc_rx_ipv4_udsbl_octets: 0
mmc_rx_ipv6_gd_octets: 0
mmc_rx_ipv6_hderr_octets: 0
mmc_rx_ipv6_nopay_octets: 0
mmc_rx_ipv6_gd: 0
mmc_rx_ipv6_hderr: 0
mmc_rx_ipv6_nopay: 0
mmc_rx_udp_gd: 31788
mmc_rx_udp_err: 0
mmc_rx_tcp_gd: 0
mmc_rx_tcp_err: 0
mmc_rx_icmp_gd: 73
mmc_rx_icmp_err: 0
mmc_rx_udp_gd_octets: 1114188
mmc_rx_udp_err_octets: 0
mmc_rx_tcp_gd_octets: 0
mmc_rx_tcp_err_octets: 0
mmc_rx_icmp_gd_octets: 6024
mmc_rx_icmp_err_octets: 0
tx_underflow: 0
tx_carrier: 0
tx_losscarrier: 0
vlan_tag: 0
tx_deferred: 0
tx_vlan: 0
tx_jabber: 0
tx_frame_flushed: 0
tx_payload_error: 0
tx_ip_header_error: 0
rx_desc: 0
sa_filter_fail: 0
overflow_error: 0
ipc_csum_error: 0
rx_collision: 0
rx_crc_errors: 0
dribbling_bit: 0
rx_length: 0
rx_mii: 0
rx_multicast: 0
rx_gmac_overflow: 0
rx_watchdog: 0
da_rx_filter_fail: 0
sa_rx_filter_fail: 0
rx_missed_cntr: 0
rx_overflow_cntr: 0
rx_vlan: 0
tx_undeflow_irq: 0
tx_process_stopped_irq: 0
tx_jabber_irq: 0
rx_overflow_irq: 0
rx_buf_unav_irq: 0
rx_process_stopped_irq: 0
rx_watchdog_irq: 0
tx_early_irq: 0
fatal_bus_error_irq: 0
rx_early_irq: 19
threshold: 320
tx_pkt_n: 92109
rx_pkt_n: 26175
normal_irq_n: 3860
rx_normal_irq_n: 2426
napi_poll: 4202
tx_normal_irq_n: 1438
tx_clean: 4494
tx_set_ic_bit: 1439
irq_receive_pmt_irq_n: 0
mmc_tx_irq_n: 0
mmc_rx_irq_n: 0
mmc_rx_csum_offload_irq_n: 0
irq_tx_path_in_lpi_mode_n: 0
irq_tx_path_exit_lpi_mode_n: 0
irq_rx_path_in_lpi_mode_n: 0
irq_rx_path_exit_lpi_mode_n: 0
phy_eee_wakeup_error_n: 0
ip_hdr_err: 0
ip_payload_err: 0
ip_csum_bypassed: 0
ipv4_pkt_rcvd: 26145
ipv6_pkt_rcvd: 0
no_ptp_rx_msg_type_ext: 26145
ptp_rx_msg_type_sync: 0
ptp_rx_msg_type_follow_up: 0
ptp_rx_msg_type_delay_req: 0
ptp_rx_msg_type_delay_resp: 0
ptp_rx_msg_type_pdelay_req: 0
ptp_rx_msg_type_pdelay_resp: 0
ptp_rx_msg_type_pdelay_follow_up: 0
ptp_rx_msg_type_announce: 0
ptp_rx_msg_type_management: 0
ptp_rx_msg_pkt_reserved_type: 0
ptp_frame_type: 0
ptp_ver: 0
timestamp_dropped: 0
av_pkt_rcvd: 0
av_tagged_pkt_rcvd: 0
vlan_tag_priority_val: 0
l3_filter_match: 0
l4_filter_match: 0
l3_l4_filter_no_match: 0
irq_pcs_ane_n: 0
irq_pcs_link_n: 0
irq_rgmii_n: 0
mtl_tx_status_fifo_full: 0
mtl_tx_fifo_not_empty: 0
mmtl_fifo_ctrl: 0
mtl_tx_fifo_read_ctrl_write: 0
mtl_tx_fifo_read_ctrl_wait: 0
mtl_tx_fifo_read_ctrl_read: 0
mtl_tx_fifo_read_ctrl_idle: 0
mac_tx_in_pause: 0
mac_tx_frame_ctrl_xfer: 0
mac_tx_frame_ctrl_idle: 0
mac_tx_frame_ctrl_wait: 0
mac_tx_frame_ctrl_pause: 0
mac_gmii_tx_proto_engine: 0
mtl_rx_fifo_fill_level_full: 0
mtl_rx_fifo_fill_above_thresh: 0
mtl_rx_fifo_fill_below_thresh: 0
mtl_rx_fifo_fill_level_empty: 1
mtl_rx_fifo_read_ctrl_flush: 0
mtl_rx_fifo_read_ctrl_read_data: 0
mtl_rx_fifo_read_ctrl_status: 1
mtl_rx_fifo_read_ctrl_idle: 0
mtl_rx_fifo_ctrl_active: 0
mac_rx_frame_ctrl_fifo: 0
mac_gmii_rx_proto_engine: 0
tx_tso_frames: 0
tx_tso_nfrags: 0
ethtool features:
rx-checksumming: on
tx-checksumming: off
tx-checksum-ipv4: off [requested on]
tx-checksum-ip-generic: off [fixed]
tx-checksum-ipv6: off [requested on]
tx-checksum-fcoe-crc: off [fixed]
tx-checksum-sctp: off [fixed]
scatter-gather: on
tx-scatter-gather: on
tx-scatter-gather-fraglist: off [fixed]
tcp-segmentation-offload: off
tx-tcp-segmentation: off [fixed]
tx-tcp-ecn-segmentation: off [fixed]
tx-tcp-mangleid-segmentation: off [fixed]
tx-tcp6-segmentation: off [fixed]
udp-fragmentation-offload: off [fixed]
generic-segmentation-offload: on
generic-receive-offload: on
large-receive-offload: off [fixed]
rx-vlan-offload: off [fixed]
tx-vlan-offload: off [fixed]
ntuple-filters: off [fixed]
receive-hashing: off [fixed]
highdma: on [fixed]
rx-vlan-filter: off [fixed]
vlan-challenged: off [fixed]
tx-lockless: off [fixed]
netns-local: off [fixed]
tx-gso-robust: off [fixed]
tx-fcoe-segmentation: off [fixed]
tx-gre-segmentation: off [fixed]
tx-gre-csum-segmentation: off [fixed]
tx-ipxip4-segmentation: off [fixed]
tx-ipxip6-segmentation: off [fixed]
tx-udp_tnl-segmentation: off [fixed]
tx-udp_tnl-csum-segmentation: off [fixed]
tx-gso-partial: off [fixed]
tx-sctp-segmentation: off [fixed]
fcoe-mtu: off [fixed]
tx-nocache-copy: off
loopback: off [fixed]
rx-fcs: off [fixed]
rx-all: off [fixed]
tx-vlan-stag-hw-insert: off [fixed]
rx-vlan-stag-hw-parse: off [fixed]
rx-vlan-stag-filter: off [fixed]
l2-fwd-offload: off [fixed]
hw-tc-offload: off [fixed]
ethtool -g eth0 gives me "Cannot get device ring settings: Operation not supported".
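The ring settings are not exposed, but the driver behind the interface can still be identified and its drop counters re-checked; that it is the DWMAC (stmmac) driver is an assumption based on the driver change mentioned below:
# identify the driver and firmware behind eth0
ethtool -i eth0
# re-check the overflow/drop counters from the statistics above
ethtool -S eth0 | grep -i -E 'overflow|drop|error'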
Can anyone point me at possible causes or things to check?
I am using the built-in interface because this is an embedded system and I cannot use an external NIC. After some testing I found that ss (netstat is deprecated in favor of ss) reports a high number for udpInErrors (described in RFC 1213 as the number of received UDP datagrams that could not be delivered for reasons other than the lack of an application at the destination port), and the number of UdpRcvbufErrors is also high (I could not find a description for that one).
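For reference, UdpRcvbufErrors counts datagrams dropped because a socket's receive buffer was full, and both counters can be read straight from the kernel:
# read the UDP drop counters (nstat is part of iproute2; netstat -su shows the same data)
nstat -az UdpInErrors UdpRcvbufErrors
awk '/^Udp:/' /proc/net/snmp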
Limiting, or even dropping, all incoming traffic on that port with iptables (whether in the INPUT or the PREROUTING chain) does not fix the problem, whereas configuring the device as a bridge and then limiting the bridged traffic with ebtables does.
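A minimal sketch of that workaround; br0 and UDP port 5000 are assumed names for illustration, not taken from the post:
# bridge the interface so frames pass through ebtables before reaching the IP stack
brctl addbr br0; brctl addif br0 eth0; ifconfig br0 up
# rate-limit the bursty UDP port at layer 2 and drop the excess
ebtables -A INPUT -p IPv4 --ip-protocol udp --ip-destination-port 5000 --limit 500/s --limit-burst 100 -j ACCEPT
ebtables -A INPUT -p IPv4 --ip-protocol udp --ip-destination-port 5000 -j DROP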
What does that tell me? I assume there is no hardware-related problem and that the issue is introduced between OSI layers 2 and 3. Where can I find more information or methods to analyze this, and where is there a detailed description of how the Linux kernel handles data between those two layers?
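One way to watch that boundary directly is function tracing: the receive path runs from the driver's NAPI poll into netif_receive_skb() and on to ip_rcv()/udp_rcv(), and ftrace can show where packets stall. A sketch, assuming the DWMAC (stmmac) driver and a kernel built with CONFIG_FUNCTION_GRAPH_TRACER:
# trace the driver receive path and the L2-to-L3 handoff
cd /sys/kernel/debug/tracing
echo 'stmmac_* netif_receive_skb ip_rcv udp_rcv' > set_ftrace_filter
echo function_graph > current_tracer
echo 1 > tracing_on
head -n 50 trace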
The problem appears to be solved by switching to kernel 4.13, most likely due to changes in the DWMAC driver.
Posted on 2017-12-20 17:28:02
Are you using the built-in interface? 1 Gbit/s of traffic is the maximum for a gigabit interface, and built-in interfaces do not handle such heavy traffic gracefully; a dedicated external NIC would be a better choice. Also, pay attention to the UDP buffer sizes: are there any packet receive errors or buffer errors in the UDP section of
netstat -anus
If so, you need to increase those sizes according to your needs.
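A sketch of how the receive-buffer limits can be inspected and raised; the values here are illustrative, not a recommendation:
# current defaults and ceilings for socket receive buffers
sysctl net.core.rmem_default net.core.rmem_max
# raise them; the application can then request a larger buffer via SO_RCVBUF
sysctl -w net.core.rmem_default=8388608
sysctl -w net.core.rmem_max=8388608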
https://unix.stackexchange.com/questions/411996