We noticed that the TCP payload of some packets was empty (all zeros) when transmitted by a virtual machine using SR-IOV and the ixgbevf driver.
This happens rarely, but when it happens, all retransmitted packets suffer from the same problem.
When we disable tx-scatter-gather on the sender, the problem never occurs:
    ethtool -K eth0 sg off
In the following captures, the transmitter is on the right and the receiver on the left. The receiving side was actually captured through a port mirror on the switch, with an intermediate host recording the traffic.
The TCP checksum at the transmitter is incorrect, but that is expected with TX checksum offloading: the capture on the sender sees the packet before the NIC fills in the final checksum. The TCP checksum on the left (receiver) is what you would expect if the TCP payload had been correct.
The TCP payload on the left contains only zeros.
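For anyone who wants to repeat the checksum comparison on their own captures, here is a minimal Python sketch of the standard Internet checksum (RFC 1071) applied to an IPv4 TCP segment. It is not the tool we used for the captures above. The addresses, ports, header fields and payload in the example are made up for illustration, and the helper names are hypothetical.

```python
import struct


def internet_checksum(data: bytes) -> int:
    """RFC 1071: one's-complement sum of 16-bit words, then complemented."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with one zero byte
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def tcp_checksum(src_ip: bytes, dst_ip: bytes, tcp_header: bytes, payload: bytes) -> int:
    """Checksum over the IPv4 pseudo-header, the TCP header and the payload.

    The checksum field (header bytes 16-17) is zeroed before summing.
    """
    header = tcp_header[:16] + b"\x00\x00" + tcp_header[18:]
    segment = header + payload
    # Pseudo-header: source, destination, zero byte, protocol 6 (TCP), TCP length
    pseudo = src_ip + dst_ip + struct.pack("!BBH", 0, 6, len(segment))
    return internet_checksum(pseudo + segment)


# Made-up example values (not taken from the captures in this report)
src = bytes([192, 0, 2, 1])
dst = bytes([192, 0, 2, 2])
# Source port, destination port, seq, ack, data offset, flags, window, checksum, urgent
header = struct.pack("!HHIIBBHHH", 40000, 80, 1, 1, 5 << 4, 0x18, 1024, 0, 0)
good_payload = b"hello world!"
zero_payload = bytes(len(good_payload))  # what the receiver saw: only zeros

csum_good = tcp_checksum(src, dst, header, good_payload)
csum_zero = tcp_checksum(src, dst, header, zero_payload)
print(f"checksum over intended payload: {csum_good:#06x}")
print(f"checksum over zeroed payload:   {csum_zero:#06x}")
```

If the checksum field in the received packet equals the value computed over the intended payload rather than the one computed over the zeros, the packet shows the same pattern as described above.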
In another capture, we noticed that the corrupt TCP payload appeared to contain references to kernel objects. This led us to believe that the issue is caused by the payload being read from the wrong location in memory.
System information:
Server: HP ProLiant BL460c Gen8 with 32 GB RAM
CPU: Dual Intel(R) Xeon(R) CPU E5-2658 0 @ 2.10GHz
NIC: HP 560FLB based on Intel 82599
OS: CentOS release 6.8
Drivers: ixgbe 4.2.1-k and ixgbevf 2.12.1-k
The same issue was seen on several other servers, including a ProLiant DL360p Gen8 with an HP 2-port 561FLR-T NIC based on Intel X540.