# OVS Hardware Offload

The software-based OVS datapath is CPU intensive, which affects system performance
and prevents full utilization of the available bandwidth. OVS 2.8 and above support
a feature called OVS Hardware Offload, which improves performance significantly.
This feature offloads the OVS data plane to the NIC while keeping the OVS
control plane unmodified. It uses SR-IOV technology together with VF representor
host net-devices. The VF representor plays the same role as a TAP device
in a Para-Virtual (PV) setup. A packet sent through the VF representor on the host
arrives at the VF, and a packet sent through the VF is received by its representor.

## Supported Ethernet controllers

The following Ethernet controllers are known to work:

- Mellanox ConnectX-5 and above

## Instructions for Mellanox ConnectX-5

### Prerequisites

- Open vSwitch installed
- NetworkManager installed
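
A quick way to check these prerequisites on a node is to look for the command-line tools that each package normally ships. The sketch below assumes that `ovs-vsctl` (Open vSwitch) and `nmcli` (NetworkManager) being on `PATH` is a good enough sign that the packages are installed. The helper name `missing_tools` is hypothetical.

```bash
# Print the name of every given tool that is not found on PATH.
# Prints nothing when all tools are available.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}
```

Running `missing_tools ovs-vsctl nmcli` on the node should print nothing when both tools are present.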

### Deploy SriovNetworkNodePolicy

```yaml
apiVersion: sriovnetwork.openshift.io/v1
kind: SriovNetworkNodePolicy
metadata:
  name: ovs-hw-offload
  namespace: sriov-network-operator
spec:
  deviceType: netdevice
  nicSelector:
    deviceID: "1017"
    rootDevices:
    - 0000:02:00.0
    - 0000:02:00.1
    vendor: "15b3"
  nodeSelector:
    feature.node.kubernetes.io/network-sriov.capable: "true"
  numVfs: 8
  priority: 10
  resourceName: cx5_sriov_switchdev
  isRdma: true
  eSwitchMode: switchdev
  linkType: eth
```
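
Once the policy has been applied, the e-switch mode of each PF can be inspected on the node with `devlink dev eswitch show` from iproute2. The helper below is a minimal sketch that extracts the mode from that command's output. It assumes the output has the usual `pci/<address>: mode <mode> ...` layout, which may differ between iproute2 versions. The name `eswitch_mode` is hypothetical.

```bash
# Read `devlink dev eswitch show` output on stdin and print the e-switch mode,
# i.e. the word that follows the standalone token "mode".
# Tokens such as "inline-mode" do not match, because the comparison is exact.
eswitch_mode() {
    awk '{ for (i = 1; i < NF; i++) if ($i == "mode") { print $(i + 1); exit } }'
}
```

For the first root device in the policy above, `devlink dev eswitch show pci/0000:02:00.0 | eswitch_mode` is expected to print `switchdev` when the configuration succeeded.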

### Create NetworkAttachmentDefinition CR with OVS CNI config

```yaml
apiVersion: "k8s.cni.cncf.io/v1"
kind: NetworkAttachmentDefinition
metadata:
  name: ovs-net
  annotations:
    k8s.v1.cni.cncf.io/resourceName: openshift.io/cx5_sriov_switchdev
spec:
  config: '{
      "cniVersion": "0.3.1",
      "type": "ovs",
      "bridge": "br-sriov0",
      "vlan": 10
    }'
```

### Deploy Pod with OVS hardware offload

Create a Pod spec that requests a VF:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: ovs-offload-pod1
  annotations:
    k8s.v1.cni.cncf.io/networks: ovs-net
spec:
  containers:
  - name: ovs-offload
    image: networkstatic/iperf3
    resources:
      requests:
        openshift.io/cx5_sriov_switchdev: '1'
      limits:
        openshift.io/cx5_sriov_switchdev: '1'
    command:
    - sh
    - -c
    - |
      ls -l /dev/infiniband /sys/class/net
      sleep 1000000
```
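
The verification steps use a second Pod named `ovs-offload-pod2`. One possible way to create it, sketched below, is to reuse the first manifest and change only the Pod name. The helper name `make_pod2` and the file name `pod1.yaml` are hypothetical.

```bash
# Derive a second Pod manifest from the first by renaming the Pod.
# Reads the pod1 manifest on stdin and writes the pod2 manifest to stdout.
make_pod2() {
    sed 's/ovs-offload-pod1/ovs-offload-pod2/g'
}
```

Assuming the spec above was saved as `pod1.yaml`, the second Pod could then be created with `make_pod2 < pod1.yaml | kubectl apply -f -`.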

## Verify Hardware Offload is Working

Run an iperf3 server on Pod 1:

```bash
kubectl exec -it ovs-offload-pod1 -- iperf3 -s
```

Run an iperf3 client on Pod 2, targeting the address of Pod 1 (`192.168.1.17` in this example):

```bash
kubectl exec -it ovs-offload-pod2 -- iperf3 -c 192.168.1.17 -t 100
```

Check traffic on the VF representor port. Only the TCP connection establishment should appear;
once the flows are offloaded, the data packets are handled by the NIC and no longer pass through the representor:

```text
tcpdump -i enp3s0f0_3 tcp
listening on enp3s0f0_3, link-type EN10MB (Ethernet), capture size 262144 bytes
22:24:44.969516 IP 192.168.1.16.43558 > 192.168.1.17.targus-getdata1: Flags [S], seq 89800743, win 64860, options [mss 1410,sackOK,TS val 491087056 ecr 0,nop,wscale 7], length 0
22:24:44.969773 IP 192.168.1.17.targus-getdata1 > 192.168.1.16.43558: Flags [S.], seq 1312764151, ack 89800744, win 64308, options [mss 1410,sackOK,TS val 4095895608 ecr 491087056,nop,wscale 7], length 0
22:24:45.085558 IP 192.168.1.16.43558 > 192.168.1.17.targus-getdata1: Flags [.], ack 1, win 507, options [nop,nop,TS val 491087222 ecr 4095895608], length 0
22:24:45.085592 IP 192.168.1.16.43558 > 192.168.1.17.targus-getdata1: Flags [P.], seq 1:38, ack 1, win 507, options [nop,nop,TS val 491087222 ecr 4095895608], length 37
22:24:45.086311 IP 192.168.1.16.43560 > 192.168.1.17.targus-getdata1: Flags [S], seq 3802331506, win 64860, options [mss 1410,sackOK,TS val 491087279 ecr 0,nop,wscale 7], length 0
22:24:45.086462 IP 192.168.1.17.targus-getdata1 > 192.168.1.16.43560: Flags [S.], seq 441940709, ack 3802331507, win 64308, options [mss 1410,sackOK,TS val 4095895725 ecr 491087279,nop,wscale 7], length 0
22:24:45.086624 IP 192.168.1.16.43560 > 192.168.1.17.targus-getdata1: Flags [.], ack 1, win 507, options [nop,nop,TS val 491087279 ecr 4095895725], length 0
22:24:45.086654 IP 192.168.1.16.43560 > 192.168.1.17.targus-getdata1: Flags [P.], seq 1:38, ack 1, win 507, options [nop,nop,TS val 491087279 ecr 4095895725], length 37
22:24:45.086715 IP 192.168.1.17.targus-getdata1 > 192.168.1.16.43560: Flags [.], ack 38, win 503, options [nop,nop,TS val 4095895725 ecr 491087279], length 0
```
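
Reading a long capture by eye is error-prone, so the check can be reduced to counting the packets that actually carry a payload. The sketch below assumes the default tcpdump text format, where each packet line ends with `length N`. The helper name `payload_packets` is hypothetical.

```bash
# Read tcpdump text output on stdin and print how many packets carry a
# payload, i.e. lines that end with "length N" where N is greater than 0.
# Lines with fewer than two fields (such as blank lines) are skipped.
payload_packets() {
    awk 'NF >= 2 && $(NF - 1) == "length" && $NF > 0 { n++ } END { print n + 0 }'
}
```

A possible invocation is `timeout 10 tcpdump -l -nn -i enp3s0f0_3 tcp 2>/dev/null | payload_packets`. In the capture shown above only two lines carry a payload (37 bytes each), so with offload working the count should stay small even while iperf3 is transferring data.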

Check that the datapath rules are offloaded:

```text
ovs-appctl dpctl/dump-flows --names type=offloaded
recirc_id(0),in_port(eth0),eth(src=16:fd:c6:0b:60:52),eth_type(0x0800),ipv4(src=192.168.1.17,frag=no), packets:2235857, bytes:147599302, used:0.550s, actions:ct(zone=65520),recirc(0x18)
ct_state(+est+trk),ct_mark(0),recirc_id(0x18),in_port(eth0),eth(dst=42:66:d7:45:0d:7e),eth_type(0x0800),ipv4(dst=192.168.1.0/255.255.255.0,frag=no), packets:2235857, bytes:147599302, used:0.550s, actions:eth1
recirc_id(0),in_port(eth1),eth(src=42:66:d7:45:0d:7e),eth_type(0x0800),ipv4(src=192.168.1.16,frag=no), packets:133410141, bytes:195255745684, used:0.550s, actions:ct(zone=65520),recirc(0x16)
ct_state(+est+trk),ct_mark(0),recirc_id(0x16),in_port(eth1),eth(dst=16:fd:c6:0b:60:52),eth_type(0x0800),ipv4(dst=192.168.1.0/255.255.255.0,frag=no), packets:133410138, bytes:195255745483, used:0.550s, actions:eth0
```
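
To turn this dump into a quick pass/fail signal, the flows and their packet counters can be summarized. The sketch below assumes each flow line contains a `packets:<n>` field, as in the output above. The helper name `flow_stats` is hypothetical.

```bash
# Read `ovs-appctl dpctl/dump-flows` output on stdin and print
# "<flow count> <total packets>" for the flow lines it contains.
# Lines without a "packets:<n>" field (such as the command itself) are ignored.
flow_stats() {
    awk '
        match($0, /packets:[0-9]+/) {
            flows++
            # Skip the 8-character "packets:" prefix to get the counter.
            total += substr($0, RSTART + 8, RLENGTH - 8)
        }
        END { print flows + 0, total + 0 }
    '
}
```

For example, `ovs-appctl dpctl/dump-flows --names type=offloaded | flow_stats` prints the number of offloaded flows followed by the sum of their packet counters. A non-zero flow count whose packet total grows while iperf3 is running suggests that traffic is being handled in hardware.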