Setup:
Debian 8 (Jessie)
Kernel: Linux 3.16.0-4-686-pae
IGB Driver: 5.3.5.12 (the same problem occurs with the original driver from Debian, 5.0.5-k)
After several days (anywhere from 2 to 20), our systems stop receiving multicast packets for certain groups they belong to. Each of these systems has two interfaces, one IGB and one e1000e, which are bonded together (bonding mode: fault-tolerance (active-backup)). The problem only ever occurs when the IGB is the active interface.
The switch to which the IGB interface is connected is sending the relevant multicast packets to the interface; I have verified this using port mirroring on the switch. However, if I run tcpdump in non-promiscuous mode, I do not see any incoming packets for the affected groups, although I do see the outgoing IGMP group reports.
Running tcpdump in promiscuous mode resets the interface, which immediately fixes the problem, and the traffic becomes visible again.
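To make the non-promiscuous check easy to repeat, here is a minimal sketch of the kind of capture described above. The function name capture_group is a hypothetical helper, and the interface and group passed to it are only examples. The -p option asks tcpdump not to put the interface into promiscuous mode, so the capture should not disturb the failed state.

```shell
# Hypothetical helper (name chosen for illustration): capture traffic
# destined for one multicast group WITHOUT enabling promiscuous mode.
capture_group() {
    iface="$1"   # interface to capture on, e.g. bond0
    group="$2"   # multicast group address, e.g. 239.255.10.10
    # -p: do not put the interface into promiscuous mode
    # -n: do not resolve addresses to names
    tcpdump -p -n -i "$iface" dst host "$group"
}

# Usage (as root):
#   capture_group bond0 239.255.10.10
```

Running the same capture without -p is what clears the fault, so it is best left until all other information has been collected.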
Netstat always reports active membership of the intended groups:
root@vlab-210-03:~# netstat -gn
IPv6/IPv4 Group Memberships
Interface       RefCnt Group
--------------- ------ ---------------------
lo              1      224.0.0.1
eth1            1      224.0.0.1
bond0           1      224.0.0.251
bond0           1      224.0.0.1
bond0           1      239.255.10.10
bond0           1      239.255.10.11
bond0           4      239.110.92.1
bond0           2      224.0.0.107
bond0           3      224.0.1.129
eth0            1      224.0.0.1
Only the groups 239.255.10.10 and 239.255.10.11 ever fail, and only when the IGB interface is the active one. Other multicast traffic functions normally.
The two affected groups are carrying a large volume of video traffic.
The system is still running in a failed state, so I can query it for more information.
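One check that could be run in this state is whether the kernel still lists the link-layer multicast addresses for the two failing groups on the slave interfaces. The sketch below assumes the standard IPv4-to-Ethernet multicast mapping (01:00:5e followed by the low 23 bits of the group address). The name mcast_mac is a hypothetical helper, and eth0 in the usage comment is only a placeholder for whichever slave is the IGB port.

```shell
# Hypothetical helper: map an IPv4 multicast group address to its
# Ethernet multicast MAC (01:00:5e + low 23 bits of the IP address).
# The body runs in a subshell so the IFS change does not leak out.
mcast_mac() (
    IFS=.
    set -- $1    # split the dotted quad into $1 $2 $3 $4
    # Only the low 7 bits of the second octet are carried into the MAC.
    printf '01:00:5e:%02x:%02x:%02x\n' "$(( $2 & 127 ))" "$3" "$4"
)

# Usage (interface name is a placeholder):
#   ip maddr show dev eth0 | grep -i "$(mcast_mac 239.255.10.10)"
#   ip maddr show dev eth0 | grep -i "$(mcast_mac 239.255.10.11)"
```

With this mapping the two failing groups correspond to 01:00:5e:7f:0a:0a and 01:00:5e:7f:0a:0b. Note that this only shows the kernel's view of the address list; it does not prove what is actually programmed into the NIC's hardware filter.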
Thanks,
Tom.