learning:vrrp vmac config

Author: dpdk-vpp源码解读
Published: 2023-01-04 12:45:51
Part of the column: DPDK VPP源码分析

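While testing the VRRP feature recently, I ran into a problem: with the master and the backup both online, pinging the virtual gateway produces the following output: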

root@learningvpp2:~# ping 192.168.90.1
PING 192.168.90.1 (192.168.90.1) 56(84) bytes of data.
64 bytes from 192.168.90.1: icmp_seq=1 ttl=64 time=0.697 ms
64 bytes from 192.168.90.1: icmp_seq=1 ttl=63 time=3.90 ms (DUP!)
64 bytes from 192.168.90.1: icmp_seq=1 ttl=64 time=3.90 ms (DUP!)
64 bytes from 192.168.90.1: icmp_seq=1 ttl=63 time=3.90 ms (DUP!)
64 bytes from 192.168.90.1: icmp_seq=2 ttl=64 time=6.56 ms
64 bytes from 192.168.90.1: icmp_seq=2 ttl=64 time=6.56 ms (DUP!)
64 bytes from 192.168.90.1: icmp_seq=2 ttl=63 time=6.56 ms (DUP!)
64 bytes from 192.168.90.1: icmp_seq=2 ttl=63 time=6.56 ms (DUP!)

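I looked up the cause online; the explanation given is that duplicate ICMP replies are being returned. A packet capture confirmed that multiple ICMP reply packets were indeed being sent back.

I suspect the cause is that the VMware virtual machines in the test environment are connected through a LAN segment, but I am not sure.

Another possible cause is promiscuous mode. In this VRRP environment, VPP cannot receive the ICMP request packets unless promiscuous mode is enabled. According to the VRRP plugin author's feature description, the interface may need promiscuous mode; but with promiscuous mode on, the ping DUP problem appears. The feature description reads: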

VRRP virtual MAC address support:
 - DPDK interfaces with PMD support for multiple MAC addresses 
   rte_eth_dev_mac_addr_add(), 
   rte_eth_dev_mac_addr_del()
 - Other interfaces which are set in promiscuous mode may work

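The VPP CLI command that corresponds to DPDK's multiple-MAC-address support is 'set interface secondary-mac-address'. It adds or deletes additional MAC addresses on the specified interface without changing the default MAC address. This allows packets sent to those MAC addresses to be received without putting the interface into promiscuous mode. Not all interfaces support this operation; it is mainly hardware NICs that do, although virtio does as well.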
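To make the difference between a secondary MAC address and promiscuous mode concrete, here is a minimal C sketch of the receive-side filtering decision. It is a simplified model, not VPP or DPDK code: `example_filter_t`, `example_filter_add_secondary` and `example_filter_accepts` are hypothetical names invented for illustration, and multicast filtering is left out.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define EXAMPLE_MAC_LEN 6
#define EXAMPLE_MAX_SECONDARY 4

/* Hypothetical model of a port's unicast receive filter (not a VPP type). */
typedef struct
{
  uint8_t primary[EXAMPLE_MAC_LEN];
  uint8_t secondary[EXAMPLE_MAX_SECONDARY][EXAMPLE_MAC_LEN];
  int n_secondary;
  bool promiscuous;
} example_filter_t;

/* Add one secondary address; returns false when the table is full. */
bool
example_filter_add_secondary (example_filter_t *f, const uint8_t *mac)
{
  if (f->n_secondary >= EXAMPLE_MAX_SECONDARY)
    return false;
  memcpy (f->secondary[f->n_secondary++], mac, EXAMPLE_MAC_LEN);
  return true;
}

/* Would a frame addressed to dst be delivered to the host?
   Multicast filtering is not modelled here. */
bool
example_filter_accepts (const example_filter_t *f, const uint8_t *dst)
{
  static const uint8_t bcast[EXAMPLE_MAC_LEN] =
    { 0xff, 0xff, 0xff, 0xff, 0xff, 0xff };

  if (f->promiscuous)
    return true;		/* everything on the wire is accepted */
  if (!memcmp (dst, bcast, EXAMPLE_MAC_LEN)
      || !memcmp (dst, f->primary, EXAMPLE_MAC_LEN))
    return true;
  for (int i = 0; i < f->n_secondary; i++)
    if (!memcmp (dst, f->secondary[i], EXAMPLE_MAC_LEN))
      return true;
  return false;
}
```

In this model a frame addressed to the virtual MAC is rejected until that MAC is added as a secondary address, while a promiscuous port accepts it regardless of VRRP state. Whether that is what produces the DUP replies in this environment is not confirmed.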

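The VRRP plugin configures its virtual MAC through the mechanism behind this command. However, on VMware virtual machines the VPP interface could not receive packets this way; it only worked after promiscuous mode was enabled. I do not know whether hardware NICs have the same problem.

The current test environment shows one more problem: when the master device switches from Backup state back to Master state, it sends a gratuitous ARP request, and the Backup device, on receiving it, still answers with an ARP reply.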
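For reference, the virtual MAC that gets programmed is fixed by RFC 5798 section 7.3: 00-00-5E-00-01-{VRID} for IPv4 and 00-00-5E-00-02-{VRID} for IPv6. A small sketch of that derivation follows; `example_vrrp_virtual_mac` is a hypothetical helper name, not a function from the VPP plugin.

```c
#include <stdint.h>

/* Build the VRRP virtual router MAC defined in RFC 5798 section 7.3.
   Hypothetical helper for illustration only. */
void
example_vrrp_virtual_mac (uint8_t vr_id, int is_ipv6, uint8_t mac[6])
{
  mac[0] = 0x00;
  mac[1] = 0x00;
  mac[2] = 0x5e;
  mac[3] = 0x00;
  mac[4] = is_ipv6 ? 0x02 : 0x01;	/* address family selector */
  mac[5] = vr_id;			/* last octet is the VRID */
}
```

For VR ID 1 over IPv4 this gives 00:00:5e:00:01:01, the address that appears as the Ethernet source of the VRRP advertisement.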

04:27:59:333693: dpdk-input
  GigabitEthernet2/4/0 rx queue 0
  buffer 0x9721e: current data 0, length 60, buffer-pool 0, ref-count 1, trace handle 0x93
                  ext-hdr-valid 
  PKT MBUF: port 2, nb_segs 1, pkt_len 60
    buf_len 2176, data_len 60, ol_flags 0x0, data_off 128, phys_addr 0x367c8800
    packet_type 0x0 l2_len 0 l3_len 0 outer_l2_len 0 outer_l3_len 0 
    rss 0x0 fdir.hi 0x0 fdir.lo 0x0
  IP4: 00:00:5e:00:01:01 -> 01:00:5e:00:00:12
  VRRP: 192.168.90.100 -> 224.0.0.18
    tos 0x00, ttl 255, length 32, checksum 0xc04e dscp CS0 ecn NON_ECN
    fragment id 0x0000
04:27:59:333722: ethernet-input
  frame: flags 0x3, hw-if-index 3, sw-if-index 3
  IP4: 00:00:5e:00:01:01 -> 01:00:5e:00:00:12
04:27:59:333734: ip4-input-no-checksum
  VRRP: 192.168.90.100 -> 224.0.0.18
    tos 0x00, ttl 255, length 32, checksum 0xc04e dscp CS0 ecn NON_ECN
    fragment id 0x0000
04:27:59:333741: vrrp4-accept-owner-input
  IPv4 sw_if_index 3 192.168.90.100 -> 224.0.0.18
04:27:59:333758: vrrp4-input
  VRRP: sw_if_index 3 IPv4
    ver 3, type 1, VRID 1, prio 200, n_addrs 1, interval 100cs, csum 0x52f0
    addresses: 192.168.90.1 
04:27:59:334451: error-drop
  rx:GigabitEthernet2/4/0
04:27:59:334471: drop
  vrrp4-input: VRRP packets processed

Packet 149

04:27:59:333693: dpdk-input
  GigabitEthernet2/4/0 rx queue 0
  buffer 0x97245: current data 0, length 60, buffer-pool 0, ref-count 1, trace handle 0x94
                  ext-hdr-valid 
  PKT MBUF: port 2, nb_segs 1, pkt_len 60
    buf_len 2176, data_len 60, ol_flags 0x0, data_off 128, phys_addr 0x367c91c0
    packet_type 0x0 l2_len 0 l3_len 0 outer_l2_len 0 outer_l3_len 0 
    rss 0x0 fdir.hi 0x0 fdir.lo 0x0
  ARP: 00:0c:29:a2:43:f5 -> ff:ff:ff:ff:ff:ff
  request, type ethernet/IP4, address size 6/4
  00:00:5e:00:01:01/192.168.90.1 -> 00:00:00:00:00:00/192.168.90.1
04:27:59:333722: ethernet-input
  frame: flags 0x3, hw-if-index 3, sw-if-index 3
  ARP: 00:0c:29:a2:43:f5 -> ff:ff:ff:ff:ff:ff
04:27:59:333737: arp-input
  request, type ethernet/IP4, address size 6/4
  00:00:5e:00:01:01/192.168.90.1 -> 00:00:00:00:00:00/192.168.90.1
04:27:59:333751: vrrp4-arp-input
  address 0.1.8.0: vr_index 0 vr_id 1
04:27:59:334467: GigabitEthernet2/4/0-output
  GigabitEthernet2/4/0 
  ARP: 00:0c:29:07:6f:b8 -> 00:0c:29:a2:43:f5
  reply, type ethernet/IP4, address size 6/4
  00:00:5e:00:01:01/192.168.90.1 -> 00:00:5e:00:01:01/192.168.90.1
04:27:59:334474: GigabitEthernet2/4/0-tx
  GigabitEthernet2/4/0 tx queue 0
  buffer 0x97245: current data 0, length 60, buffer-pool 0, ref-count 1, trace handle 0x94
                  ext-hdr-valid 
                  l2-hdr-offset 0 l3-hdr-offset 14 
  PKT MBUF: port 2, nb_segs 1, pkt_len 60
    buf_len 2176, data_len 60, ol_flags 0x0, data_off 128, phys_addr 0x367c91c0
    packet_type 0x0 l2_len 0 l3_len 0 outer_l2_len 0 outer_l3_len 0 
    rss 0x0 fdir.hi 0x0 fdir.lo 0x0
  ARP: 00:0c:29:07:6f:b8 -> 00:0c:29:a2:43:f5
  reply, type ethernet/IP4, address size 6/4
  00:00:5e:00:01:01/192.168.90.1 -> 00:00:5e:00:01:01/192.168.90.1

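Because the Master device is configured with preempt mode, once its interface comes up it sends a VRRP advertisement and a gratuitous ARP request. On receiving them, the Backup device switches from Master state back to Backup state. In the code, this is done by calling vl_api_rpc_call_main_thread to hand the message to the main core for processing. So it is possible that the Backup device receives the gratuitous ARP request before it has finished switching from Master to Backup state, and therefore answers with an ARP reply, as the trace above shows.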
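What makes the ARP request in the trace "gratuitous" is that its sender IP and target IP are the same (192.168.90.1 on both sides). A minimal sketch of that check, and of a reply decision that takes it into account, could look like this. The struct and function names are hypothetical and the ARP layout is simplified to host byte order; this is not the plugin's code.

```c
#include <stdbool.h>
#include <stdint.h>

#define EXAMPLE_IP4(a, b, c, d) \
  (((uint32_t) (a) << 24) | ((uint32_t) (b) << 16) | \
   ((uint32_t) (c) << 8) | (uint32_t) (d))

#define EXAMPLE_ARP_REQUEST 1

/* Hypothetical, simplified ARP payload in host byte order. */
typedef struct
{
  uint16_t opcode;		/* 1 = request, 2 = reply */
  uint8_t sender_mac[6];
  uint32_t sender_ip;
  uint8_t target_mac[6];
  uint32_t target_ip;
} example_arp_t;

/* A gratuitous ARP announces an address: sender IP equals target IP. */
bool
example_arp_is_gratuitous (const example_arp_t *a)
{
  return a->sender_ip == a->target_ip;
}

/* Reply only to ordinary requests, and only while in master state. */
bool
example_master_should_reply (bool is_master, const example_arp_t *a)
{
  return is_master && a->opcode == EXAMPLE_ARP_REQUEST
    && !example_arp_is_gratuitous (a);
}
```

With a check like this, a router that still believes it is master would stay silent when the preempting peer announces the virtual address.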

      /* signal main thread to process contents of packet */
      args0.vr_index = vr0 - vmp->vrs;
      args0.pkt = vrrp0;	/* possibly problematic */

      vl_api_rpc_call_main_thread (vrrp_input_process, (u8 *) &args0,
                   sizeof (args0));

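Personally, I think this is still a problem in multi-core mode: vrrp0 is a pointer into the data area of a vlib_buffer_t. By the time the RPC message is processed on the main thread, the worker may already have freed that buffer.

VPP also provides another command, set interface mac address, which changes the interface's MAC address. Using this MAC-address change facility solves the problems above without configuring promiscuous mode. The patch is as follows: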
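One way to avoid this kind of lifetime problem is to carry the header bytes inside the RPC arguments instead of a pointer to them. The sketch below is a standalone illustration under stated assumptions: it assumes the RPC call copies only the argument bytes it is handed, and all type and function names are hypothetical. The 8-byte layout follows the fixed part of the VRRPv3 header in RFC 5798; the address list that follows it is not copied here.

```c
#include <stdint.h>
#include <string.h>

/* Fixed part of a VRRPv3 header (8 bytes), hypothetical stand-in type. */
typedef struct
{
  uint8_t version_and_type;
  uint8_t vr_id;
  uint8_t priority;
  uint8_t n_addrs;
  uint16_t rsvd_and_max_adv_int;	/* network byte order */
  uint16_t checksum;		/* network byte order */
} example_vrrp_hdr_t;

/* Risky shape: only the pointer travels, the pointee stays in the
   packet buffer, which may be freed before the receiver runs. */
typedef struct
{
  uint32_t vr_index;
  const example_vrrp_hdr_t *pkt;
} example_args_by_ptr_t;

/* Safer shape: the header bytes themselves travel with the message. */
typedef struct
{
  uint32_t vr_index;
  example_vrrp_hdr_t pkt;
} example_args_by_value_t;

/* Snapshot the header out of the packet data into the arguments. */
void
example_fill_args (example_args_by_value_t *args, uint32_t vr_index,
		   const void *pkt_data)
{
  args->vr_index = vr_index;
  memcpy (&args->pkt, pkt_data, sizeof (args->pkt));
}
```

After `example_fill_args` returns, the receiver no longer depends on the packet buffer staying alive. A real handler that also needs the advertised addresses would have to copy those as well.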

diff --git a/src/plugins/vrrp/node.c b/src/plugins/vrrp/node.c
index 7ba18c4f7..5ed52b4ad 100644
--- a/src/plugins/vrrp/node.c
+++ b/src/plugins/vrrp/node.c
@@ -343,6 +343,17 @@ vrrp_arp_nd_next (vlib_buffer_t * b, u32 * next_index, u32 * vr_index,
       *next_index = VRRP_ARP_INPUT_NEXT_DROP;
       return;
     }
+  if (!vr || vr->runtime.state == VRRP_VR_STATE_MASTER)
+    {
+      /* gratuitous ARP (sender IP == target IP): drop, do not reply */
+      if (arp->ip4_over_ethernet[0].ip4.as_u32 ==
+          arp->ip4_over_ethernet[1].ip4.as_u32)
+        {
+          clib_warning ("vrrp may be switching to backup, not replying.");
+          *next_index = VRRP_ARP_INPUT_NEXT_DROP;
+          return;
+        }
+    }

   /* RFC 5798 section 6.4.3: Master "MUST respond" to ARP/ND. */
   eth = ethernet_buffer_get_header (b);
diff --git a/src/plugins/vrrp/vrrp.c b/src/plugins/vrrp/vrrp.c
index 8461798e0..e4f0a94ec 100644
--- a/src/plugins/vrrp/vrrp.c
+++ b/src/plugins/vrrp/vrrp.c
@@ -117,13 +117,22 @@ vrrp_vr_transition_vmac (vrrp_vr_t * vr, vrrp_vr_state_t new_state)
   /* enable only if current master vrs is 0, disable only if 0 or 1 */
   if ((enable && !n_master_vrs) || (!enable && (n_master_vrs < 2)))
     {
-      clib_warning ("%s virtual MAC address %U on hardware interface %u",
+      clib_warning ("%s virtual MAC address %U %U on hardware interface %u",
                    (enable) ? "Adding" : "Deleting",
                    format_ethernet_address, vr->runtime.mac.bytes,
+                   format_ethernet_address, vr->runtime.hmac.bytes,
                    hw->hw_if_index);
-
-      error = vnet_hw_interface_add_del_mac_address
-       (vnm, hw->hw_if_index, vr->runtime.mac.bytes, enable);
+       if (enable)
+        {
+            memcpy_s(vr->runtime.hmac.bytes, 6, hw->hw_address, 6);
+            error = vnet_hw_interface_change_mac_address
+                    (vnm, hw->hw_if_index, vr->runtime.mac.bytes);
+        }
+       else
+        {
+            error = vnet_hw_interface_change_mac_address
+                    (vnm, hw->hw_if_index, vr->runtime.hmac.bytes);
+        }
     }

   if (error)
diff --git a/src/plugins/vrrp/vrrp.h b/src/plugins/vrrp/vrrp.h
index c93259219..3ab8beca6 100644
--- a/src/plugins/vrrp/vrrp.h
+++ b/src/plugins/vrrp/vrrp.h
@@ -98,6 +98,7 @@ typedef struct vrrp_vr_runtime
   u16 skew;
   u16 master_down_int;
   mac_address_t mac;
+  mac_address_t hmac;
   f64 last_sent;
   u32 timer_index;
 } vrrp_vr_runtime_t;
diff --git a/src/plugins/vrrp/vrrp_format.c b/src/plugins/vrrp/vrrp_format.c
index df9bf930b..521146eea 100644
--- a/src/plugins/vrrp/vrrp_format.c
+++ b/src/plugins/vrrp/vrrp_format.c
@@ -107,6 +107,8 @@ format_vrrp_vr (u8 * s, va_list * args)

   s = format (s, "   virtual MAC %U\n", format_ethernet_address,
              &vr->runtime.mac);
+  s = format (s, "   hw MAC %U\n", format_ethernet_address,
+             &vr->runtime.hmac);

   s = format (s, "   addresses %U\n", format_vrrp_vr_addrs,
              (vr->config.flags & VRRP_VR_IPV6) != 0, vr->config.vr_addrs);
diff --git a/src/plugins/vrrp/vrrp_packet.c b/src/plugins/vrrp/vrrp_packet.c
index 89a6ede60..5a79792a0 100644
--- a/src/plugins/vrrp/vrrp_packet.c
+++ b/src/plugins/vrrp/vrrp_packet.c
@@ -467,7 +467,7 @@ vrrp4_garp_pkt_build (vrrp_vr_t * vr, vlib_buffer_t * b, ip4_address_t * ip4)
   arp->opcode = clib_host_to_net_u16 (ETHERNET_ARP_OPCODE_request);
   arp->ip4_over_ethernet[0].mac = vr->runtime.mac;
   arp->ip4_over_ethernet[0].ip4 = *ip4;
-  arp->ip4_over_ethernet[1].mac = broadcast_mac;
+ // arp->ip4_over_ethernet[1].mac = broadcast_mac;
   arp->ip4_over_ethernet[1].ip4 = *ip4;
 }

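When the VR is master, the hw MAC address of the VRRP interface is changed to the VRRP virtual MAC. When it switches to backup, the original address is restored.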
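The save-and-restore behaviour can be modelled in a few lines of standalone C. This is only a sketch of the idea with hypothetical types (`example_port_t`, `example_vr_t`); it does not call any VPP API. The guard against repeated transitions is a choice made in this sketch, not something taken from the patch.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical port: just the MAC currently programmed on it. */
typedef struct
{
  uint8_t hw_mac[6];
} example_port_t;

/* Hypothetical virtual router state. */
typedef struct
{
  uint8_t vmac[6];		/* VRRP virtual MAC */
  uint8_t saved_hw_mac[6];	/* port MAC saved while master */
  bool is_master;
} example_vr_t;

/* Swap the port MAC on a master/backup transition. */
void
example_vr_set_master (example_vr_t *vr, example_port_t *port, bool master)
{
  if (master == vr->is_master)
    return;			/* no change: never save the vmac as hw MAC */

  if (master)
    {
      memcpy (vr->saved_hw_mac, port->hw_mac, 6);
      memcpy (port->hw_mac, vr->vmac, 6);
    }
  else
    memcpy (port->hw_mac, vr->saved_hw_mac, 6);

  vr->is_master = master;
}
```

Without the early return, entering master state twice in a row would overwrite the saved address with the virtual MAC, and the original hardware address could no longer be restored.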

DBGvpp# show vrrp vr 
[0] sw_if_index 3 VR ID 1 IPv4
   state Master flags: preempt yes accept yes unicast no
   priority: configured 200 adjusted 200
   timers: adv interval 100 master adv 100 skew 21 master down 321
   virtual MAC 00:00:5e:00:01:01
   hw MAC 00:0c:29:a2:43:f5
   addresses 192.168.90.1 
   peer addresses 
   tracked interfaces 
DBGvpp# show hardware-interfaces GigabitEthernet2/4/0
              Name                Idx   Link  Hardware
GigabitEthernet2/4/0               3     up   GigabitEthernet2/4/0
  Link speed: 1 Gbps
  RX Queues:
    queue thread         mode      
    0     main (0)       polling   
  Ethernet address 00:00:5e:00:01:01
  Intel 82540EM (e1000)
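As a side note, the timer values in the show vrrp vr output (skew 21, master down 321, in centiseconds) can be checked against the formulas in RFC 5798 section 6.1. The helper names below are hypothetical; only the two formulas come from the RFC.

```c
#include <stdint.h>

/* Skew_Time = ((256 - priority) * Master_Adver_Interval) / 256 */
uint32_t
example_vrrp_skew_cs (uint8_t priority, uint32_t master_adv_cs)
{
  return ((256u - priority) * master_adv_cs) / 256u;
}

/* Master_Down_Interval = 3 * Master_Adver_Interval + Skew_Time */
uint32_t
example_vrrp_master_down_cs (uint8_t priority, uint32_t master_adv_cs)
{
  return 3u * master_adv_cs + example_vrrp_skew_cs (priority, master_adv_cs);
}
```

With priority 200 and an advertisement interval of 100 cs, the skew is (56 * 100) / 256 = 21 (integer division) and the master down interval is 300 + 21 = 321, matching the output.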

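However, this patch has a drawback: a physical port can now be configured with only one VRRP instance. A change like this is not what the VRRP plugin's author intended. If anyone has a better solution, you are welcome to discuss it.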

Originally published on 2022-01-16 on the WeChat official account DPDK VPP源码分析.
