Configuring SRIOV Intel E810 NIC with GitOps Zero Touch Provisioning.⌗
In this tutorial, we walk through the steps to configure Intel E810 NICs with GitOps Zero Touch Provisioning (ZTP). First of all, we provision the server as a Single Node OpenShift (SNO). SNO is specially designed for Telcos to deploy their RAN workloads, and these servers make use of these special cards for high network performance. The tutorial covers:
- Deploying a Single Node OpenShift with ZTP
- Checking your available E810 cards and ports
- Automatically configuring your SNO, especially the E810 cards, with ZTP
- Testing the E810 card
The SNO is an HPE DL380 server with different Intel E810 cards.
We deploy the SNO using Red Hat ZTP (Zero Touch Provisioning), which is based on ACM (Advanced Cluster Management), ArgoCD (to sync a Git repository with an OpenShift/Kubernetes cluster) and a set of tools called ZTP. These tools provide two new CRs, SiteConfig and PolicyGenTemplate, in charge of defining deployments and configurations respectively.
Finally, with the cluster deployed and the SRIOV cards configured, we will run some tests.
The tutorial is based on previous work done by Alberto Losada and this Red Hat tutorial. Here, the main differences are the hardware used and the GitOps ZTP workflow for deploying and configuring.
Deploy an SNO using ZTP SiteConfig⌗
If you are familiar with ZTP, you will see that the SiteConfig is pretty standard. It just contains the basic configuration to deploy an SNO.
If you are not familiar with ZTP, you will find more info here. Basically, the SiteConfig defines your cluster deployment; ArgoCD and ACM then take care of starting it.
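For reference, a GitOps ZTP repository typically keeps the SiteConfigs and the PolicyGenTemplates in separate directories, each one synced by its own ArgoCD application. A minimal sketch of such a layout (the file names are just an assumption matching this example):
ztp-gitops-repo/
├── siteconfigs/
│   ├── kustomization.yaml
│   └── intel-1-sno-1.yaml             # the SiteConfig below
└── policygentemplates/
    ├── kustomization.yaml
    ├── common-rangen-4.10.yaml        # common policies
    └── intel-1-sno-1-test-dpdk.yaml   # SRIOV/DPDK policies for this site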
---
apiVersion: ran.openshift.io/v1
kind: SiteConfig
metadata:
  name: "intel-1-sno-1"
  namespace: "intel-1-sno-1"
spec:
  baseDomain: "hubcluster-1.lab.eng.cert.redhat.com"
  pullSecretRef:
    name: "assisted-deployment-pull-secret"
  clusterImageSetNameRef: "img4.10.30-x86-64-appsub"
  sshPublicKey: "ssh-rsa AAAAB3....gs= jgato@provisioner.el8k.hpecloud.org"
  clusters:
    - clusterName: "intel-1-sno-1"
      clusterLabels:
        common: "true"
        du-profile-4.10: ""
        sites: "intel-1-sno-1"
      networkType: "OVNKubernetes"
      clusterNetwork:
        - cidr: 10.136.0.0/14
          hostPrefix: 23
      machineNetwork:
        - cidr: 192.168.24.0/25
      serviceNetwork:
        - 172.31.0.0/16
      additionalNTPSources:
        - 192.168.24.80
      nodes:
        - hostName: "intel-1-sno-1.hubcluster-1.lab.eng.cert.redhat.com"
          role: master
          bmcAddress: redfish-virtualmedia://<BMC_IP>/redfish/v1/Systems/1
          bmcCredentialsName:
            name: "intel-1-sno-1-bmc-secret"
          bootMACAddress: "94:40:c9:c1:eb:48"
          bootMode: "UEFI"
          rootDeviceHints:
            deviceName: /dev/sda
          nodeNetwork:
            config:
              interfaces:
                - name: eno5
                  type: ethernet
                  state: up
                  ipv4:
                    enabled: true
                    dhcp: false
                    address:
                      - ip: <SERVER_IP>
                        prefix-length: 25
                  ipv6:
                    enabled: false
              dns-resolver:
                config:
                  server:
                    - <DNS>
              routes:
                config:
                  - destination: 0.0.0.0/0
                    next-hop-address: <GATEWAY>
                    next-hop-interface: eno5
            interfaces:
              - name: "eno5"
                macAddress: "94:40:c9:c1:eb:48"
Configure the SNO using ZTP PolicyGenTemplates⌗
After deployment, ZTP uses PolicyGenTemplates (PGT) to apply the different configurations, such as installing the different Operators and their configuration.
Here is a typical example:
apiVersion: ran.openshift.io/v1
kind: PolicyGenTemplate
metadata:
  name: "common-rangen-4.10"
  namespace: "ztp-common"
spec:
  bindingRules:
    # These policies will correspond to all clusters with this label:
    common: "true"
    du-profile-4.10: ""
  sourceFiles:
    # Create operator policies that will be installed in all clusters
    - fileName: SriovSubscription.yaml
      policyName: "subscriptions-policy"
    - fileName: SriovSubscriptionNS.yaml
      policyName: "subscriptions-policy"
    - fileName: SriovSubscriptionOperGroup.yaml
      policyName: "subscriptions-policy"
    - fileName: PtpSubscription.yaml
      policyName: "subscriptions-policy"
    - fileName: PtpSubscriptionNS.yaml
      policyName: "subscriptions-policy"
    - fileName: PtpSubscriptionOperGroup.yaml
      policyName: "subscriptions-policy"
    - fileName: PaoSubscription.yaml
      policyName: "subscriptions-policy"
    - fileName: PaoSubscriptionNS.yaml
      policyName: "subscriptions-policy"
    - fileName: PaoSubscriptionOperGroup.yaml
      policyName: "subscriptions-policy"
    - fileName: ClusterLogNS.yaml
      policyName: "subscriptions-policy"
    - fileName: ClusterLogOperGroup.yaml
      policyName: "subscriptions-policy"
    - fileName: ClusterLogSubscription.yaml
      policyName: "subscriptions-policy"
    - fileName: StorageNS.yaml
      policyName: "subscriptions-policy"
    - fileName: StorageOperGroup.yaml
      policyName: "subscriptions-policy"
    - fileName: StorageSubscription.yaml
      policyName: "subscriptions-policy"
    - fileName: ReduceMonitoringFootprint.yaml
      policyName: "config-policy"
In the following sections, we will show how to configure the PerformanceProfile and the SRIOV operator. These are the keys to properly configuring the server and the Intel E810 card to run the DPDK packet simulations.
Configuring PerformanceProfile⌗
Using this operator, we will configure a custom Performance Profile on the server. This profile covers some requirements that we will need later for our tests:
- CPU pinning: split the CPUs between the platform and the workloads (our DPDK SRIOV tests).
- Hugepages configuration.
To configure the CPU pinning, let's look at the available CPUs on the server:
$> lscpu
...
...
NUMA node0 CPU(s): 0-23,48-71
NUMA node1 CPU(s): 24-47,72-95
...
...
The ‘,’ separates the CPUs from their siblings, so hyper-threading is enabled. The sibling of each CPU can be obtained:
$> cat /sys/devices/system/cpu/cpu0/topology/core_cpus_list
0,48
$> cat /sys/devices/system/cpu/cpu4/topology/core_cpus_list
4,52
$> cat /sys/devices/system/cpu/cpu12/topology/core_cpus_list
12,60
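The same check can be scripted for several CPUs at once; for example, for the three CPUs above (just a convenience one-liner):
$> for c in 0 4 12; do cat /sys/devices/system/cpu/cpu${c}/topology/core_cpus_list; done
0,48
4,52
12,60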
We use 8 reserved CPUs for the platform processes, keeping in mind that we also have to pair the siblings:
- Take 2 from each NUMA node:
  - From NUMA 0: CPUs 0 and 1, and their siblings 48 and 49
  - From NUMA 1: CPUs 24 and 25, and their siblings 72 and 73
- All the other CPUs go to the isolated CPU pool. These have the lowest latency: processes in this group suffer no interruptions and can, for example, reach a much higher DPDK zero packet loss bandwidth.
We also need 32 Hugepages of 1G.
The PGT to create the ‘sno-perfprofile’:
- fileName: PerformanceProfile.yaml
  metadata:
    annotations:
      kubeletconfig.experimental: |
        {"systemReserved": {"memory": "4Gi"}}
    name: sno-perfprofile
  policyName: perfprofile-policy
  spec:
    additionalKernelArgs:
      - rcupdate.rcu_normal_after_boot=0
      - idle=poll
    cpu:
      isolated: 2-23,26-47,50-71,74-95
      reserved: 0,1,24,25,48,49,72,73
      # NUMA node0 CPU(s): 0-23,48-71
      # NUMA node1 CPU(s): 24-47,72-95
    globallyDisableIrqLoadBalancing: true
    hugepages:
      defaultHugepagesSize: 1G
      pages:
        - count: 32
          size: 1G
    machineConfigPoolSelector:
      pools.operator.machineconfiguration.openshift.io/master: ""
    nodeSelector:
      node-role.kubernetes.io/master: ""
    numa:
      topologyPolicy: single-numa-node
    realTimeKernel:
      enabled: true
Here the ‘,’ in isolated and reserved is just enumerating CPUs.
When the PGT is applied, the server will be rebooted. Then, you can ssh into the host to verify that the Performance Profile has been configured correctly:
# cat /proc/cmdline
BOOT_IMAGE=(hd0,gpt3)/ostree/rhcos-56fabc639a679b757ebae30e5f01b2ebd38e9fde9ecae91c41be41d3e89b37f8/vmlinuz-4.18.0-305.34.2.rt7.107.el8_4.x86_64 random.trust_cpu=on console=tty0 console=ttyS0,115200n8 ignition.platform.id=metal ostree=/ostree/boot.1/rhcos/56fabc639a679b757ebae30e5f01b2ebd38e9fde9ecae91c41be41d3e89b37f8/0 ip=eno5:dhcp root=UUID=1080762e-38cc-49d9-be5c-3b5e70a157ac rw rootflags=prjquota skew_tick=1 nohz=on
rcu_nocbs=2-23,26-47,50-71,74-95
tuned.non_isolcpus=00000300,00030000,03000003 intel_pstate=disable nosoftlockup tsc=nowatchdog intel_iommu=on iommu=pt
isolcpus=managed_irq,2-23,26-47,50-71,74-95
systemd.cpu_affinity=0,1,72,73,48,49,24,25
default_hugepagesz=1G hugepagesz=1G hugepages=32 idle=poll rcupdate.rcu_normal_after_boot=0 nohz_full=4-31,36-95 crashkernel=512M
# grep HugePages_ /proc/meminfo
HugePages_Total: 32
HugePages_Free: 32
HugePages_Rsvd: 0
HugePages_Surp: 0
#
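You can also verify, against the SNO API, that the PerformanceProfile generated from the policy exists and matches what was defined in the PGT (the name is the one used in this example):
$> oc get performanceprofile sno-perfprofile -o yaml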
Configuring SRIOV⌗
With the Operator installed, we will use another PGT to configure our SRIOV Intel E810 cards.
Checking available SRIOV capable NICs⌗
When the SRIOV operator is installed, we can check the available NIC devices:
$> oc get sriovnetworknodestates.sriovnetwork.openshift.io -n openshift-sriov-network-operator -o yaml
apiVersion: v1
items:
- apiVersion: sriovnetwork.openshift.io/v1
kind: SriovNetworkNodeState
metadata:
creationTimestamp: "2022-08-04T13:25:08Z"
generation: 2
name: intel-1-sno-1.hubcluster-1.lab.eng.cert.redhat.com
namespace: openshift-sriov-network-operator
ownerReferences:
- apiVersion: sriovnetwork.openshift.io/v1
blockOwnerDeletion: true
controller: true
kind: SriovNetworkNodePolicy
name: default
uid: 92af4999-d2ba-4348-b306-4ed164851cfd
resourceVersion: "43108570"
uid: 0c36b30e-0fbd-4602-bda8-9cf04aee97f0
status:
interfaces:
- deviceID: "1657"
driver: tg3
linkSpeed: -1 Mb/s
linkType: ETH
mac: b4:7a:f1:dc:08:54
mtu: 1500
name: eno1
pciAddress: "0000:02:00.0"
vendor: "14e4"
- deviceID: "1657"
driver: tg3
linkSpeed: -1 Mb/s
linkType: ETH
mac: b4:7a:f1:dc:08:55
mtu: 1500
name: eno2
pciAddress: "0000:02:00.1"
vendor: "14e4"
...
...
...
- deviceID: "1593"
driver: ice
linkSpeed: -1 Mb/s
linkType: ETH
mac: b4:96:91:a3:f2:c3
mtu: 1500
name: ens1f3
pciAddress: "0000:37:00.3"
totalvfs: 64
vendor: "8086"
- deviceID: "1015"
driver: mlx5_core
linkSpeed: 25000 Mb/s
linkType: ETH
mac: 94:40:c9:c1:eb:48
mtu: 1500
name: eno5
pciAddress: 0000:5d:00.0
vendor: 15b3
- deviceID: "1015"
driver: mlx5_core
linkSpeed: 25000 Mb/s
linkType: ETH
mac: 94:40:c9:c1:eb:49
mtu: 1500
name: eno6
pciAddress: 0000:5d:00.1
vendor: 15b3
- deviceID: "1593"
driver: ice
linkSpeed: 25000 Mb/s
linkType: ETH
mac: b4:96:91:a3:ef:e9
mtu: 1500
name: ens4f1
pciAddress: 0000:86:00.1
totalvfs: 64
vendor: "8086"
...
...
...
- deviceID: "1592"
driver: ice
linkSpeed: -1 Mb/s
linkType: ETH
mac: b4:96:91:ad:83:c1
mtu: 1500
name: ens5f1
pciAddress: 0000:af:00.1
totalvfs: 128
vendor: "8086"
syncStatus: Succeeded
kind: List
metadata:
resourceVersion: ""
selfLink: ""
We are only interested in the Intel E810 NIC cards for this tutorial.
Some IDs to locate the NICs in the server:
- Vendor 8086: Intel
- deviceID 1889: the E810 Virtual Functions (VFs)
- deviceID 1593: Ethernet Controller E810-C for SFP (XXVDA4)
- deviceID 1592: Ethernet Controller E810-C for QSFP (CQDA4)
- deviceID 1657: the Broadcom NICs (vendor 14e4) seen above with the tg3 driver, not E810 cards
You can find more details about these IDs on devicehunt.
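If you also have SSH access to the node, the same vendor/device IDs can be matched directly with lspci (IDs taken from the list above):
$> lspci -nn -d 8086:1593
$> lspci -nn -d 8086:1592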
We see there are multiple interfaces. For example, in this server there are 8 interfaces for the Identifier 1593 (E810-C for SFP):
$> oc get sriovnetworknodestates.sriovnetwork.openshift.io -n openshift-sriov-network-operator -o yaml \
| grep 1593 -A 9 \
| grep name
name: ens1f0
name: ens1f1
name: ens1f2
name: ens1f3
name: ens4f0
name: ens4f1
name: ens4f2
name: ens4f3
So there really are 2 cards (ens1 and ens4) with 4 physical ports each. In practice this is like 8 different NICs, each with its own MAC and configuration, and each supporting a number of virtual functions (VFs) that can be attached to pods:
$> oc get sriovnetworknodestates.sriovnetwork.openshift.io -n openshift-sriov-network-operator -o yaml \
| grep 1593 -A 9 \
| grep 'name\|totalvfs'
name: ens1f0
totalvfs: 64
name: ens1f1
totalvfs: 64
name: ens1f2
totalvfs: 64
name: ens1f3
totalvfs: 64
name: ens4f0
totalvfs: 64
name: ens4f1
totalvfs: 64
name: ens4f2
totalvfs: 64
name: ens4f3
totalvfs: 64
There are another 2 cards (ens2 and ens5) with Intel deviceID 1592, with 2 ports each.
$> oc get sriovnetworknodestates.sriovnetwork.openshift.io -n openshift-sriov-network-operator -o yaml \
| grep 1592 -A 9 \
| grep name
name: ens2f0
name: ens2f1
name: ens5f0
name: ens5f1
However, not all the ports of the 1593 cards are wired/connected (linkSpeed: -1):
- deviceID: "1593"
driver: ice
linkSpeed: -1 Mb/s
linkType: ETH
mac: b4:96:91:a3:ef:ea
mtu: 1500
name: ens4f2
pciAddress: 0000:86:00.2
totalvfs: 64
vendor: "8086"
- deviceID: "1593"
driver: ice
linkSpeed: -1 Mb/s
linkType: ETH
mac: b4:96:91:a3:f2:c3
mtu: 1500
name: ens1f3
pciAddress: "0000:37:00.3"
totalvfs: 64
vendor: "8086"
But the first and second ports are connected at 25000 Mb/s:
- deviceID: "1593"
driver: ice
linkSpeed: 25000 Mb/s
linkType: ETH
mac: b4:96:91:a3:ef:e8
mtu: 1500
name: ens4f0
numVfs: 8
pciAddress: 0000:86:00.0
totalvfs: 64
vendor: "8086"
- deviceID: "1593"
driver: ice
linkSpeed: 25000 Mb/s
linkType: ETH
mac: b4:96:91:a3:ef:e9
mtu: 1500
name: ens4f1
pciAddress: 0000:86:00.1
totalvfs: 64
vendor: "8086"
We can do the same for the other Intel cards with ID 1592:
> oc get sriovnetworknodestates.sriovnetwork.openshift.io -n openshift-sriov-network-operator -o yaml | grep 1592 -A 9 | grep 'name\|totalvfs'
name: ens2f0
totalvfs: 128
name: ens2f1
totalvfs: 128
name: ens5f0
totalvfs: 128
name: ens5f1
totalvfs: 128
In this case, we have two cards (ens2 and ens5) with two ports each.
- deviceID: "1592"
driver: ice
linkSpeed: -1 Mb/s
linkType: ETH
mac: b4:96:91:a4:0a:78
mtu: 1500
name: ens2f0
pciAddress: "0000:12:00.0"
totalvfs: 128
vendor: "8086"
- deviceID: "1592"
driver: ice
linkSpeed: -1 Mb/s
linkType: ETH
mac: b4:96:91:a4:0a:79
mtu: 1500
name: ens2f1
pciAddress: "0000:12:00.1"
totalvfs: 128
vendor: "8086"
- deviceID: "1592"
driver: ice
linkSpeed: -1 Mb/s
linkType: ETH
mac: b4:96:91:ad:83:c0
mtu: 1500
name: ens5f0
pciAddress: 0000:af:00.0
totalvfs: 128
vendor: "8086"
- deviceID: "1592"
driver: ice
linkSpeed: -1 Mb/s
linkType: ETH
mac: b4:96:91:ad:83:c1
mtu: 1500
name: ens5f1
pciAddress: 0000:af:00.1
totalvfs: 128
vendor: "8086"
In this case, all 4 ports are disconnected, so at this moment we cannot use these cards.
Creating PGT for virtual devices and networks⌗
Now we have a clear picture of the available NICs. Summary:
- (1592 CQDA4) Ethernet Controller E810-C for QSFP:
  - ens2 and ens5, with all their ports disconnected
- (1593 XXVDA4) Intel Ethernet Controller E810-C for SFP:
  - ens1f0 connected
  - ens1f1 connected
  - ens1f2 disconnected
  - ens1f3 disconnected
  - ens4f0 connected
  - ens4f1 connected
  - ens4f2 disconnected
  - ens4f3 disconnected

We only have 4 connected ports, but we can use SRIOV VFs to split these cards into multiple network interfaces. These VFs are exposed so that they can be attached to pods as additional networks.
Next, we show an example that creates some VFs from ens4f0/ens4f1 and two networks. These will be the interfaces and networks used in our testing example.
---
apiVersion: ran.openshift.io/v1
kind: PolicyGenTemplate
metadata:
  name: intel-1-sno-1-test-dpdk
  namespace: ztp-site
spec:
  bindingRules:
    name: intel-1-sno-1
  mcp: master
  sourceFiles:
    - fileName: SriovNetworkNodePolicy.yaml
      metadata:
        name: e810-ens4f0
      policyName: sriov-netdevice-policy
      spec:
        deviceType: vfio-pci
        nicSelector:
          deviceID: "1593"
          pfNames:
            - ens4f0
          vendor: "8086"
        numVfs: 8
        resourceName: e810_ens4f0
    - fileName: SriovNetworkNodePolicy.yaml
      metadata:
        name: e810-ens4f1
      policyName: sriov-netdevice-policy
      spec:
        deviceType: vfio-pci
        nicSelector:
          deviceID: "1593"
          pfNames:
            - ens4f1
          vendor: "8086"
        numVfs: 8
        resourceName: e810_ens4f1
    - fileName: SriovNetwork.yaml
      metadata:
        name: sriov-nw-du-test-pmd-e810-ens4f0
      policyName: sriov-nw-test-pmd-policy
      spec:
        ipam: |-
          {
            "type": "host-local",
            "ranges": [[{"subnet": "10.0.30.0/24"}]],
            "dataDir": "/run/my-orchestrator/container-ipam-state-1"
          }
        networkNamespace: testpmd
        resourceName: e810_ens4f0
        spoofChk: "off"
    - fileName: SriovNetwork.yaml
      metadata:
        name: sriov-nw-du-test-pmd-e810-ens4f1
      policyName: sriov-nw-test-pmd-policy
      spec:
        ipam: |-
          {
            "type": "host-local",
            "ranges": [[{"subnet": "10.0.40.0/24"}]],
            "dataDir": "/run/my-orchestrator/container-ipam-state-1"
          }
        networkNamespace: testpmd
        resourceName: e810_ens4f1
        spoofChk: "off"
After ZTP does all the work, you will see the devices and networks created.
Network devices:
> oc -n openshift-sriov-network-operator get sriovnetworknodepolicies.sriovnetwork.openshift.io
NAME AGE
default 7d1h
e810-ens4f0 20m
e810-ens4f1 20m
The created networks:
> oc -n openshift-sriov-network-operator get sriovnetwork
NAME AGE
sriov-nw-du-test-pmd-e810-ens4f0 38m
sriov-nw-du-test-pmd-e810-ens4f1 38m
and, more importantly, you can see the resources available on the node:
> oc get node master-0.intel-1-sno-1.hubcluster-1.lab.eng.cert.redhat.com \
-o jsonpath={.status.allocatable} | jq
{
"cpu": "88",
"ephemeral-storage": "468283204Ki",
"hugepages-1Gi": "32Gi",
"hugepages-2Mi": "0",
"memory": "157779140Ki",
"openshift.io/e810_ens4f0": "8",
"openshift.io/e810_ens4f1": "8",
"pods": "250"
}
These ‘openshift.io/e810_*’ resources will be requested later in the deployment of TestPMD and TRex (our tests).
Testing configured E810 cards⌗
In this section, we will exercise the configuration created in the previous sections with a DPDK test application. We will use TRex to generate traffic towards a DPDK test application (testpmd).
*We basically re-use the work done by @alosadagrande from here and this Red Hat article.*
It is important to note that this time I am using an Intel E810 card (which is supported by the TRex 3.0 used later). Previous articles used other SRIOV cards.

According to the picture, we need 4 different ports. These are not real physical ports; they are 4 VFs from the SRIOV NICs. We already configured this environment, with the needed VFs and networks, in a section above.
The traffic generator sends packets from VF0, which are received by testpmd on VF1. Testpmd then forwards the packets to VF2, and from there they reach back to TRex on VF3.
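A rough sketch of that flow (the MACs are the ones configured later for the TRex ports):
TRex port 0 (VF0, 50:00:00:00:00:01)  --->  testpmd port 0 (VF1)   [ens4f0 network]
TRex port 1 (VF3, 50:00:00:00:00:02)  <---  testpmd port 1 (VF2)   [ens4f1 network]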
First of all, let's download the deployments for TestPMD and the traffic generator by cloning the repository:
$> git clone https://github.com/jgato/openshift-telco.git
$> cd openshift-telco/
I am using a forked repository, where I have done some modifications adapted to this tutorial.
Running TestPMD to forward packets⌗
testPMD is an application used to test DPDK in a packet forwarding mode and also to access NIC hardware features such as Flow Director. In our case, TestPMD will basically forward packets between two ports.
Some pre-requisites:
- PAO configuration: testPMD requires hugepages and a guaranteed pod.
- SRIOV configuration:
  - Two SRIOV interfaces
  - Two networks

All these requirements have been fulfilled with the PGTs created in the previous sections.
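Note that the manifests reference a ‘testpmd’ namespace and a ‘deployer’ service account. The forked repository may already include them; if not, this is a sketch of what would be needed (the SCC choice is an assumption, since the pod runs as root and adds extra capabilities):
$> oc new-project testpmd
$> oc -n testpmd create serviceaccount deployer
$> oc adm policy add-scc-to-user privileged -n testpmd -z deployer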
Let's edit the TestPMD deployment according to our previous configuration:
$> cd test-pmd/
$> vim manifests/deployment-testpmd-rhel.yaml
Configure the YAML to use one VF from ens4f0 and another from ens4f1. We also attach the pod to the previously created networks for these devices (‘k8s.v1.cni.cncf.io/networks’).
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: testpmd
    app.kubernetes.io/component: testpmd
    app.kubernetes.io/instance: testpmd
  name: testpmd-rhel
  namespace: testpmd
spec:
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: testpmd
  template:
    metadata:
      labels:
        app: testpmd
      annotations:
        k8s.v1.cni.cncf.io/networks: openshift-sriov-network-operator/sriov-nw-du-test-pmd-e810-ens4f0, openshift-sriov-network-operator/sriov-nw-du-test-pmd-e810-ens4f1
        irq-load-balancing.crio.io: "disable"
        cpu-load-balancing.crio.io: "disable"
    spec:
      runtimeClassName: "performance-performance-sno"
      serviceAccount: deployer
      serviceAccountName: deployer
      securityContext:
        runAsUser: 0
      containers:
      - image: registry.redhat.io/openshift4/dpdk-base-rhel8
        command:
        - /bin/bash
        - -c
        - sleep INF
        securityContext:
          runAsUser: 0
          capabilities:
            add: ["IPC_LOCK","SYS_RESOURCE","NET_RAW","NET_ADMIN"]
        imagePullPolicy: Always
        env:
        - name: RUN_TYPE
          value: "testpmd"
        name: testpmd
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        resources:
          limits:
            cpu: "16"
            hugepages-1Gi: 8Gi
            memory: 2Gi
            openshift.io/e810_ens4f0: "1"
            openshift.io/e810_ens4f1: "1"
          requests:
            cpu: "16"
            hugepages-1Gi: 8Gi
            memory: 2Gi
            openshift.io/e810_ens4f0: "1"
            openshift.io/e810_ens4f1: "1"
        volumeMounts:
        - mountPath: /mnt/huge
          name: hugepage
      dnsPolicy: ClusterFirst
      volumes:
      - name: hugepage
        emptyDir:
          medium: HugePages
      restartPolicy: Always
With this configuration, the pod will be connected to three networks through three network devices. Two of these devices are the SRIOV VFs on different ports of the card, “pci-address”: “0000:86:01.5” and “pci-address”: “0000:86:09.2”, as you can see in the pod metadata:
metadata:
annotations:
irq-load-balancing.crio.io: disable
k8s.ovn.org/pod-networks: '{"default":{"ip_addresses":["10.136.0.252/23"],"mac_address":"0a:58:0a:88:00:fc","gateway_ips":["10.136.0.1"],"ip_address":"10.136.0.252/23","gateway_ip":"10.136.0.1"}}'
k8s.v1.cni.cncf.io/network-status: |-
[{
"name": "ovn-kubernetes",
"interface": "eth0",
"ips": [
"10.136.0.252"
],
"mac": "0a:58:0a:88:00:fc",
"default": true,
"dns": {}
},{
"name": "testpmd/sriov-nw-du-test-pmd-e810-ens4f0",
"interface": "net1",
"ips": [
"10.0.30.8"
],
"dns": {},
"device-info": {
"type": "pci",
"version": "1.0.0",
"pci": {
"pci-address": "0000:86:01.5"
}
}
},{
"name": "testpmd/sriov-nw-du-test-pmd-e810-ens4f1",
"interface": "net2",
"ips": [
"10.0.40.8"
],
"dns": {},
"device-info": {
"type": "pci",
"version": "1.0.0",
"pci": {
"pci-address": "0000:86:09.2"
}
}
}]
The Deployment will create a pod with the tools to run a DPDK test application. This test application will just forward whatever it receives on its port 0 out of its port 1. Ports 0 and 1 are VF1 and VF2 in the figure above.
The test application will also collect metrics about the packets received/transmitted.
> oc apply -f manifests/deployment-testpmd-rhel.yaml
deployment.apps/testpmd-rhel created
The Deployment can take some time while the ‘network-attachment-definitions’ become available. These are created from the ‘SriovNetwork’ CRs defined previously, which were configured to point to the ‘testpmd’ namespace.
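You can check that they have been created in the namespace:
$> oc -n testpmd get network-attachment-definitions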
Once the pod is running, we can rsh into it.
To run testpmd, it is important to define the two peers. It will only forward traffic coming from peer 50:00:00:00:00:01 towards peer 50:00:00:00:00:02. Later, we will configure the TRex ports with these MAC addresses:
$> oc rsh testpmd-rhel-6789767d85-tx4z8
sh-4.4# export CPU=$(cat /sys/fs/cgroup/cpuset/cpuset.cpus)
sh-4.4# testpmd -l ${CPU} -a ${PCIDEVICE_OPENSHIFT_IO_E810_ENS4F0} -a ${PCIDEVICE_OPENSHIFT_IO_E810_ENS4F1} \
-n 4 -- -i --nb-cores=15 --rxd=4096 --txd=4096 --rxq=7 --txq=7 --forward-mode=mac \
--eth-peer=0,50:00:00:00:00:01 --eth-peer=1,50:00:00:00:00:02
EAL: Detected 96 lcore(s)
EAL: Detected 2 NUMA nodes
EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
EAL: Selected IOVA mode 'VA'
EAL: No available hugepages reported in hugepages-2048kB
EAL: Probing VFIO support...
EAL: VFIO support initialized
EAL: using IOMMU type 1 (Type 1)
EAL: Probe PCI driver: net_iavf (8086:1889) device: 0000:37:01.2 (socket 0)
EAL: Probe PCI driver: net_iavf (8086:1889) device: 0000:37:09.2 (socket 0)
EAL: No legacy callbacks, legacy socket not created
Interactive-mode selected
Set mac packet forwarding mode
testpmd: create a new mbuf pool <mb_pool_0>: n=267456, size=2176, socket=0
testpmd: preferred mempool ops selected: ring_mp_mc
Configuring Port 0 (socket 0)
iavf_configure_queues(): RXDID[1] is not supported, request default RXDID[1] in Queue[0]
iavf_configure_queues(): RXDID[1] is not supported, request default RXDID[1] in Queue[1]
iavf_configure_queues(): RXDID[1] is not supported, request default RXDID[1] in Queue[2]
iavf_configure_queues(): RXDID[1] is not supported, request default RXDID[1] in Queue[3]
iavf_configure_queues(): RXDID[1] is not supported, request default RXDID[1] in Queue[4]
iavf_configure_queues(): RXDID[1] is not supported, request default RXDID[1] in Queue[5]
iavf_configure_queues(): RXDID[1] is not supported, request default RXDID[1] in Queue[6]
Port 0: link state change event
Port 0: link state change event
Port 0: link state change event
Port 0: link state change event
Port 0: link state change event
Port 0: link state change event
Port 0: link state change event
Port 0: link state change event
Port 0: link state change event
Port 0: link state change event
Port 0: link state change event
Port 0: link state change event
Port 0: link state change event
Port 0: link state change event
Port 0: 62:B9:05:82:48:DB
Configuring Port 1 (socket 0)
iavf_configure_queues(): RXDID[1] is not supported, request default RXDID[1] in Queue[0]
iavf_configure_queues(): RXDID[1] is not supported, request default RXDID[1] in Queue[1]
iavf_configure_queues(): RXDID[1] is not supported, request default RXDID[1] in Queue[2]
iavf_configure_queues(): RXDID[1] is not supported, request default RXDID[1] in Queue[3]
iavf_configure_queues(): RXDID[1] is not supported, request default RXDID[1] in Queue[4]
iavf_configure_queues(): RXDID[1] is not supported, request default RXDID[1] in Queue[5]
iavf_configure_queues(): RXDID[1] is not supported, request default RXDID[1] in Queue[6]
Port 1: link state change event
Port 1: link state change event
Port 1: link state change event
Port 1: link state change event
Port 1: link state change event
Port 1: link state change event
Port 1: link state change event
Port 1: link state change event
Port 1: link state change event
Port 1: link state change event
Port 1: link state change event
Port 1: link state change event
Port 1: link state change event
Port 1: link state change event
Port 1: 4E:88:D9:B9:80:31
Checking link statuses...
Done
testpmd>
Set promiscuous mode off:
testpmd> set promisc all off
Everything seems OK, but there is not much traffic yet:
testpmd> start
mac packet forwarding - ports=2 - cores=14 - streams=14 - NUMA support enabled, MP allocation mode: native
Logical Core 5 (socket 0) forwards packets on 1 streams:
RX P=0/Q=0 (socket 0) -> TX P=1/Q=0 (socket 0) peer=50:00:00:00:00:02
Logical Core 6 (socket 0) forwards packets on 1 streams:
RX P=1/Q=0 (socket 0) -> TX P=0/Q=0 (socket 0) peer=50:00:00:00:00:01
Logical Core 7 (socket 0) forwards packets on 1 streams:
RX P=0/Q=1 (socket 0) -> TX P=1/Q=1 (socket 0) peer=50:00:00:00:00:02
Logical Core 8 (socket 0) forwards packets on 1 streams:
RX P=1/Q=1 (socket 0) -> TX P=0/Q=1 (socket 0) peer=50:00:00:00:00:01
Logical Core 9 (socket 0) forwards packets on 1 streams:
RX P=0/Q=2 (socket 0) -> TX P=1/Q=2 (socket 0) peer=50:00:00:00:00:02
Logical Core 10 (socket 0) forwards packets on 1 streams:
RX P=1/Q=2 (socket 0) -> TX P=0/Q=2 (socket 0) peer=50:00:00:00:00:01
Logical Core 11 (socket 0) forwards packets on 1 streams:
RX P=0/Q=3 (socket 0) -> TX P=1/Q=3 (socket 0) peer=50:00:00:00:00:02
Logical Core 52 (socket 0) forwards packets on 1 streams:
RX P=1/Q=3 (socket 0) -> TX P=0/Q=3 (socket 0) peer=50:00:00:00:00:01
Logical Core 53 (socket 0) forwards packets on 1 streams:
RX P=0/Q=4 (socket 0) -> TX P=1/Q=4 (socket 0) peer=50:00:00:00:00:02
Logical Core 54 (socket 0) forwards packets on 1 streams:
RX P=1/Q=4 (socket 0) -> TX P=0/Q=4 (socket 0) peer=50:00:00:00:00:01
Logical Core 55 (socket 0) forwards packets on 1 streams:
RX P=0/Q=5 (socket 0) -> TX P=1/Q=5 (socket 0) peer=50:00:00:00:00:02
Logical Core 56 (socket 0) forwards packets on 1 streams:
RX P=1/Q=5 (socket 0) -> TX P=0/Q=5 (socket 0) peer=50:00:00:00:00:01
Logical Core 57 (socket 0) forwards packets on 1 streams:
RX P=0/Q=6 (socket 0) -> TX P=1/Q=6 (socket 0) peer=50:00:00:00:00:02
Logical Core 58 (socket 0) forwards packets on 1 streams:
RX P=1/Q=6 (socket 0) -> TX P=0/Q=6 (socket 0) peer=50:00:00:00:00:01
mac packet forwarding packets/burst=32
nb forwarding cores=15 - nb forwarding ports=2
port 0: RX queue number: 7 Tx queue number: 7
Rx offloads=0x0 Tx offloads=0x10000
RX queue: 0
RX desc=4096 - RX free threshold=32
RX threshold registers: pthresh=0 hthresh=0 wthresh=0
RX Offloads=0x0
TX queue: 0
TX desc=4096 - TX free threshold=32
TX threshold registers: pthresh=0 hthresh=0 wthresh=0
TX offloads=0x10000 - TX RS bit threshold=32
port 1: RX queue number: 7 Tx queue number: 7
Rx offloads=0x0 Tx offloads=0x10000
RX queue: 0
RX desc=4096 - RX free threshold=32
RX threshold registers: pthresh=0 hthresh=0 wthresh=0
RX Offloads=0x0
TX queue: 0
TX desc=4096 - TX free threshold=32
TX threshold registers: pthresh=0 hthresh=0 wthresh=0
TX offloads=0x10000 - TX RS bit threshold=32
testpmd> show port stats all
######################## NIC statistics for port 0 ########################
RX-packets: 258 RX-missed: 0 RX-bytes: 24666
RX-errors: 0
RX-nombuf: 0
TX-packets: 7 TX-errors: 0 TX-bytes: 2292
Throughput (since last show)
Rx-pps: 0 Rx-bps: 0
Tx-pps: 0 Tx-bps: 0
############################################################################
######################## NIC statistics for port 1 ########################
RX-packets: 258 RX-missed: 0 RX-bytes: 24666
RX-errors: 0
RX-nombuf: 0
TX-packets: 7 TX-errors: 0 TX-bytes: 2292
Throughput (since last show)
Rx-pps: 0 Rx-bps: 0
Tx-pps: 0 Tx-bps: 0
############################################################################
We need to start the traffic generator to see how packets are received and forwarded.
Running Trex to create some traffic⌗
We use TRex as a traffic generator.
From the previously cloned git repository:
$> cd trex
$> vim pods/trex.yaml
and edit ‘trex.yaml’ to configure the traffic generator.
It is important to configure the networks with the peer MAC addresses configured on the testpmd side (‘trex.yaml’):
apiVersion: v1
kind: Pod
metadata:
  annotations:
    k8s.v1.cni.cncf.io/networks: '[
      {
        "name": "sriov-nw-du-test-pmd-e810-ens4f0",
        "mac": "50:00:00:00:00:01",
        "namespace": "testpmd"
      },
      {
        "name": "sriov-nw-du-test-pmd-e810-ens4f1",
        "mac": "50:00:00:00:00:02",
        "namespace": "testpmd"
      }
    ]'
Set the ‘mac_telco0’ and ‘mac_telco1’ variables in the ‘testpmd_addr.py’ part of the manifest, again, with the proper MAC addresses.
IPs are not important here; we are only using MACs:
testpmd_addr.py: |
  # wild second XL710 mac
  mac_telco0 = '50:00:00:00:00:01'
  # we don’t care of the IP in this phase
  ip_telco0 = '10.0.0.1'
  # wild first XL710 mac
  mac_telco1 = '50:00:00:00:00:02'
  ip_telco1 = '10.1.1.1'
With everything configured, create the trex deployment:
$> oc -n testpmd apply -f pods/trex.yaml
configmap/trex-info-for-config created
configmap/trex-config-template created
configmap/trex-tests created
pod/trex created
Now, we will configure ‘/etc/trex_cfg.yaml’.
Together with the deployment there is a script that tries to generate this configuration automatically, but it seems to fail when detecting the ‘latency_thread’, the socket and the interfaces. So, double-check everything manually.
Mainly, we need information about the CPUs and the network interfaces:
$> oc -n testpmd rsh trex
sh-5.1# echo $PCIDEVICE_OPENSHIFT_IO_E810_ENS4F0
0000:86:01.5
sh-5.1# echo $PCIDEVICE_OPENSHIFT_IO_E810_ENS4F1
0000:86:09.6
sh-5.1# cat /sys/fs/cgroup/cpuset/cpuset.cpus
26-33,74-81
Use this information to double-check ‘/etc/trex_cfg.yaml’:
sh-4.4# cat /etc/trex_cfg.yaml
- port_limit: 2
  version: 2
  interfaces: ["0000:86:09.6","0000:86:01.5"]
  port_bandwidth_gb: 25
  port_info:
    - ip: 10.10.10.2
      default_gw: 10.10.10.1
    - ip: 10.10.20.2
      default_gw: 10.10.20.1
  platform:
    master_thread_id: 26
    latency_thread_id: 74
    dual_if:
      - socket: 1
        threads: [27,28,29,30,31,32,33,75,76,77,78,79,80,81]
Check that the interfaces correspond to the PCI addresses of our VFs.
From the list of CPUs assigned to the pod, we take the first CPU (26) and its sibling (74) as the master/latency threads, and we leave these two CPUs out of the threads list.
Siblings can be obtained from inside the pod:
sh-5.1# cat /sys/devices/system/cpu/cpu26/topology/core_cpus_list
26,74
Important: check the socket used by the CPUs and the network interfaces. In this example, everything belongs to socket 1.
sh-5.1# lscpu -e
CPU NODE SOCKET CORE L1d:L1i:L2:L3 ONLINE
0 0 0 0 0:0:0:0 yes
...
...
24 1 1 24 32:32:32:1 yes
25 1 1 25 33:33:33:1 yes
26 1 1 26 34:34:34:1 yes
27 1 1 27 35:35:35:1 yes
28 1 1 28 36:36:36:1 yes
29 1 1 29 37:37:37:1 yes
30 1 1 30 38:38:38:1 yes
31 1 1 31 41:41:41:1 yes
32 1 1 32 42:42:42:1 yes
33 1 1 33 43:43:43:1 yes
34 1 1 34 44:44:44:1 yes
35 1 1 35 45:45:45:1 yes
36 1 1 36 48:48:48:1 yes
37 1 1 37 49:49:49:1 yes
38 1 1 38 50:50:50:1 yes
39 1 1 39 51:51:51:1 yes
40 1 1 40 52:52:52:1 yes
...
...
94 1 1 46 60:60:60:1 yes
95 1 1 47 61:61:61:1 yes
sh-5.1#
sh-5.1# lspci -v -nn -mm -k -s ${PCIDEVICE_OPENSHIFT_IO_E810_ENS4F0}
Slot: 86:01.5
Class: Ethernet controller [0200]
Vendor: Intel Corporation [8086]
Device: Ethernet Adaptive Virtual Function [1889]
SVendor: Intel Corporation [8086]
SDevice: Device [0000]
Rev: 02
Driver: vfio-pci
Module: iavf
NUMANode: 1
IOMMUGroup: 192
sh-5.1# lspci -v -nn -mm -k -s ${PCIDEVICE_OPENSHIFT_IO_E810_ENS4F1}
Slot: 86:09.6
Class: Ethernet controller [0200]
Vendor: Intel Corporation [8086]
Device: Ethernet Adaptive Virtual Function [1889]
SVendor: Intel Corporation [8086]
SDevice: Device [0000]
Rev: 02
Driver: vfio-pci
Module: iavf
NUMANode: 1
IOMMUGroup: 201
With everything configured, let's run the TRex server:
sh-5.1# ./t-rex-64 --no-ofed-check --no-hw-flow-stat -i -c 14
Starting Scapy server..... Scapy server is started
The ports are bound/configured.
Starting TRex v3.00 please wait ...
iavf_execute_vf_cmd(): Cmd 26 not supported
iavf_set_hena(): Failed to execute command of OP_SET_RSS_HENA
iavf_execute_vf_cmd(): Cmd 26 not supported
iavf_set_hena(): Failed to execute command of OP_SET_RSS_HENA
set driver name net_iavf
driver capability : TCP_UDP_OFFLOAD TSO SLRO
set dpdk queues mode to MULTI_QUE
Number of ports found: 2
zmq publisher at: tcp://*:4500
wait 1 sec .
port : 0
------------
link : link : Link Up - speed 25000 Mbps - full-duplex
promiscuous : 0
port : 1
------------
link : link : Link Up - speed 25000 Mbps - full-duplex
promiscuous : 0
number of ports : 2
max cores for 2 ports : 14
tx queues per port : 16
-------------------------------
RX core uses TX queue number 65535 on all ports
core, c-port, c-queue, s-port, s-queue, lat-queue
------------------------------------------
1 0 0 1 0 0
2 0 1 1 1 255
3 0 2 1 2 255
4 0 3 1 3 255
5 0 4 1 4 255
6 0 5 1 5 255
7 0 6 1 6 255
8 0 7 1 7 255
9 0 8 1 8 255
10 0 9 1 9 255
11 0 10 1 10 255
12 0 11 1 11 255
13 0 12 1 12 255
14 0 13 1 13 255
-------------------------------
After a few seconds you will see something like:

Now it is time to open a new console to run a packet simulation:
> oc rsh trex
sh-5.1# ./trex-console
Using 'python3' as Python interpeter
Connecting to RPC server on localhost:4501 [SUCCESS]
Connecting to publisher server on localhost:4500 [SUCCESS]
Acquiring ports [0, 1]: [SUCCESS]
*** Warning - Port 0 destination is unresolved ***
*** Warning - Port 1 destination is unresolved ***
Server Info:
Server version: v3.00 @ STL
Server mode: Stateless
Server CPU: 14 x Intel(R) Xeon(R) Gold 5220R CPU @ 2.20GHz
Ports count: 2 x 25.0Gbps @ Ethernet Adaptive Virtual Function
-=TRex Console v3.0=-
Type 'help' or '?' for supported actions
Execute the ‘tui’ command to enter a visual dashboard with statistics about what is happening on the ports. Not much should be happening yet.
The deployment also includes a Python application that generates a packet simulation. Let's run the packet simulation on port 0, sending 1 million packets per second:
tui> start -f /opt/tests/testpmd.py -m 1mpps -p0
What is happening now?
The test application is sending packets through TRex port 0, and these packets are received by testpmd on its port 0. Testpmd is configured to forward everything to its port 1, and from there the packets are received back by TRex on port 1.

You can see how the port 0 output packets and the port 1 input packets are mostly the same. Testpmd is forwarding between ports correctly, and the card seems to be working at the expected speed of 25 Gb/s.
Summary⌗
For this tutorial, we have been working with the Intel E810 card, a fairly new card that may not yet be compatible with every environment. With SNO 4.10 and TRex 3.0 it seems to work properly. It is important to note, though, that we only ran a packet simulation (testing); not all of the card's real capabilities have been exercised.
Deploying and configuring an SNO server to run 5G RAN workloads can be complex, especially the first time, until you reach the proper configuration. Without a tool and methodology like GitOps ZTP this is not only complex, it would also be difficult to do at scale.
In this tutorial, we have shown an example of configuring a server so that it is ready to run DPDK applications (the base for 5G RAN vDUs). After testing, the configuration can be replicated and scaled to any number of servers.
Once you have a SiteConfig defining your first SNO, it can be replicated in your GitOps repo, only adapting the different IPs, MACs and node names. After that, the tuned PolicyGenTemplates will be applied to configure all the SNOs.
In day-2 operations, if the configuration needs to be updated, this can also be done with the GitOps ZTP approach. A change to a PolicyGenTemplate definition is automatically rolled out to all the selected servers; modifying hundreds of servers is just a matter of making a new commit.
ToDo and future work⌗
In a second tutorial, we will go deeper into the different SCCs needed to run DPDK, and we will play with the different speeds of the card.