Hello,
I wonder what small-packet throughput is possible with this setup if it is properly configured:
OS: SLES12-SP2
Kernel: 4.4.74-92.32-default
NIC: Intel X540-T2 Dual Port 10GBaseT
Server: Lenovo System x3550 M5, 12 CPU cores "Intel(R) Xeon(R) CPU E5-2643 v3 @ 3.40GHz"
smp_affinity: optimized, all 12 cores are in use by the NIC's IRQs
This is a load balancer for DNS traffic, which is mainly small (~80 bytes) UDP packets.
The load balancer is "ipvs" in DR (direct routing) mode. This means the kernel only re-routes the packet (MAC rewrite) and pushes it back onto the network through the same adapter.
With this setup the server is able to process ~700kpps. At this load, all CPU cores are 100% busy (softirq).
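For comparison, here is a rough sketch of what the 10 Gbit/s wire itself would allow for packets of this size. It assumes that the ~80 bytes refers to the IP packet (not the whole Ethernet frame) and that no VLAN tag is present; the function name wire_limit_pps is only illustrative.

```python
def wire_limit_pps(l3_bytes, link_bps=10_000_000_000):
    """Upper bound on packets per second in one direction of an Ethernet link.

    Assumption: l3_bytes is the size of the IP packet, untagged (no VLAN).
    """
    # Ethernet header (14 B) + FCS (4 B); frames are padded to at least 64 B.
    frame_bytes = max(l3_bytes + 14 + 4, 64)
    # Preamble/SFD (8 B) and inter-frame gap (12 B) also occupy wire time.
    wire_bytes = frame_bytes + 8 + 12
    return link_bps / (wire_bytes * 8)


if __name__ == "__main__":
    print(f"80-byte IP packets: {wire_limit_pps(80) / 1e6:.2f} Mpps")
    print(f"minimum-size frames: {wire_limit_pps(46) / 1e6:.2f} Mpps")
```

Under these assumptions the link could carry roughly 10.6 Mpps in each direction, so at ~700kpps the wire is far from being the limit.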
My questions are: do you think this can be improved? Have we missed something?
If we direct all traffic to a single CPU core, the server can already process ~350kpps! Why does the system scale this badly when 12 cores are in use?
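To put the scaling in numbers, here is a small sketch that uses only the two measurements above (~350kpps on one core, ~700kpps on 12 cores). The function name scaling_summary is only illustrative.

```python
def scaling_summary(single_core_kpps, all_cores_kpps, cores):
    """Compare measured multi-core throughput with ideal linear scaling."""
    ideal_kpps = single_core_kpps * cores         # perfect linear scaling
    speedup = all_cores_kpps / single_core_kpps   # measured gain over one core
    efficiency = all_cores_kpps / ideal_kpps      # fraction of the ideal reached
    per_core_kpps = all_cores_kpps / cores        # what each core contributes
    return ideal_kpps, speedup, efficiency, per_core_kpps


if __name__ == "__main__":
    ideal, speedup, eff, per_core = scaling_summary(350, 700, 12)
    print(f"ideal: {ideal} kpps, speedup: {speedup:.1f}x, "
          f"efficiency: {eff:.1%}, per core: {per_core:.1f} kpps")
```

So 12 cores give only a 2x speedup, about 17% of linear scaling, and each core handles roughly 58kpps instead of 350kpps.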
I'm grateful for any tips.
Thanks
Winfried