Hi,
I am using the network performance measurement tool nuttcp with IGB and E1000 cards on one of my development boards.
Issue: I observe a crash/watchdog bite when testing with both the IGB and the E1000 cards.
Kernel: kernel_msm-3.18
Steps to reproduce:
1) On the device: ./nuttcp-8.1.4.arm -S
2) On the PC, run the command below, where xxx.xx.xxx.xxx is the IP address of the device:
./nuttcp-8.1.4.x86 -w2m -u -R 160M -i 1 -T 1m xxx.xx.xxx.xxx
3) After a couple of iterations, the crash is reported in the log.
IGB Log
------------------
Parsing debug information for MSM_DUMP_DATA_CPU_CTX. Version: 20 Magic: 42445953 Source:
Parsing CPU1 context start 171c8a800 end 171c8b000
Core 1 PC: arch_counter_get_cntvct+1c <ffffffc000a25164>
Core 1 LR: arch_counter_get_cntvct+1c <ffffffc000a25164>
[<ffffffc000a25164>] arch_counter_get_cntvct+0x1c
[<ffffffc000352ca0>] __delay+0x24
[<ffffffc000352c74>] __const_udelay+0x24
[<ffffffc00049235c>] msm_trigger_wdog_bite+0xd0
[<ffffffc0000f1d0c>] spin_bug+0x94
[<ffffffc0000f1e7c>] do_raw_spin_lock+0x104
[<ffffffc000eb38a0>] _raw_spin_lock+0x28
[<ffffffc000710768>] igb_get_stats64+0x30
[<ffffffc000cb8244>] dev_get_stats+0x4c
[<ffffffc000d249f8>] iface_stat_fmt_proc_show+0x98
[<ffffffc0001e77e0>] seq_read+0x18c
[<ffffffc000222874>] proc_reg_read+0x8c
[<ffffffc0001c5fcc>] vfs_read+0xa0
[<ffffffc0001c6758>] SyS_read+0x58
[<ffffffc0000864b0>] el0_svc_naked+0x24
From the code, it looks like whoever holds adapter->stats64_lock could be stuck in igb_update_stats, so the unlock might not be happening in time.
Code file: /kernel_msm-3.18/kernel/drivers/net/ethernet/intel/igb/igb_main.c
static struct rtnl_link_stats64 *igb_get_stats64(struct net_device *netdev,
                                                 struct rtnl_link_stats64 *stats)
{
        struct igb_adapter *adapter = netdev_priv(netdev);

        spin_lock(&adapter->stats64_lock);
        igb_update_stats(adapter, &adapter->stats64);
        memcpy(stats, &adapter->stats64, sizeof(*stats));
        spin_unlock(&adapter->stats64_lock);

        return stats;
}
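To make the suspicion above concrete, here is a minimal user-space sketch of a bounded spin that gives up and reports a suspected lockup, similar in spirit to the "spinlock lockup suspected" message in the E1000 log. It is not kernel code: struct dbg_lock, dbg_trylock, dbg_unlock, dbg_lock_bounded and the max_spins limit are all hypothetical names made up for illustration, and it only models the case where a holder never releases the lock in time.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical user-space stand-in for a debug spinlock; not kernel code. */
struct dbg_lock {
    atomic_bool held;     /* true while some "CPU" owns the lock */
    atomic_int owner_cpu; /* id of the owning CPU, or -1 when free */
};

static void dbg_lock_init(struct dbg_lock *l)
{
    atomic_init(&l->held, false);
    atomic_init(&l->owner_cpu, -1);
}

/* Try once; returns true if the lock was free and is now owned by cpu. */
static bool dbg_trylock(struct dbg_lock *l, int cpu)
{
    if (atomic_exchange(&l->held, true))
        return false; /* someone else already holds it */
    atomic_store(&l->owner_cpu, cpu);
    return true;
}

static void dbg_unlock(struct dbg_lock *l)
{
    atomic_store(&l->owner_cpu, -1);
    atomic_store(&l->held, false);
}

/*
 * Spin at most max_spins times. Returns true once the lock is acquired.
 * If the owner never lets go within the budget, report who holds the
 * lock and return false instead of spinning forever.
 */
static bool dbg_lock_bounded(struct dbg_lock *l, int cpu,
                             unsigned long max_spins)
{
    for (unsigned long i = 0; i < max_spins; i++) {
        if (dbg_trylock(l, cpu))
            return true;
    }
    fprintf(stderr, "lockup suspected on CPU#%d, owner_cpu: %d\n",
            cpu, atomic_load(&l->owner_cpu));
    return false;
}
```

For example, if "CPU 1" takes the lock with dbg_lock_bounded(&l, 1, 1000) and does not release it, a later dbg_lock_bounded(&l, 0, 1000) from "CPU 0" runs out of spins and returns false, naming CPU 1 as the owner. Once dbg_unlock(&l) is called, the same call succeeds. In the traces, the question is which path is playing the role of the owner that does not unlock.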
E1000 Log
-------------
[ 100.703446] init: Service 'atfwd' (pid 765) exited with status 255
[ 100.708673] init: Service 'atfwd' (pid 765) killing any children in process group
[ 104.343224] init: Untracked pid 2636 exited with status 0
[ 133.202677] BUG: spinlock lockup suspected on CPU#0, kworker/0:3/986
[ 133.208030] lock: iface_stat_list_lock+0x0/0x18, .magic: dead4ead, .owner: NetworkStats/1294, .owner_cpu: 1
[ 133.217936] Causing a watchdog bite!
[ 133.345512] Backtrace for cpu 1 (current):
[ 133.348758] CPU: 1 PID: 1294 Comm: NetworkStats Tainted: G W 3.18.31-g12d3836-dirty #2
[ 133.357697] Hardware name: Qualcomm Technologies, Inc. APQ8096v3 + PMI8994 DragonBoard (DT)
[ 133.366028] Call trace:
[ 133.368473] [<ffffffc000089d9c>] dump_backtrace+0x0/0x278
[ 133.373843] [<ffffffc00008a034>] show_stack+0x20/0x28
[ 133.378884] [<ffffffc000e70b08>] dump_stack+0x9c/0xd4
[ 133.383914] [<ffffffc000093700>] arch_trigger_all_cpu_backtrace+0x6c/0xdc
[ 133.390687] [<ffffffc0000f1e80>] do_raw_spin_lock+0x108/0x160
[ 133.396415] [<ffffffc000e7f428>] _raw_spin_lock+0x28/0x34
[ 133.401800] [<ffffffc0006d2de0>] e1000e_get_stats64+0x44/0x118
[ 133.407614] [<ffffffc000c83140>] dev_get_stats+0x4c/0xac
[ 133.412907] [<ffffffc000cef8f4>] iface_stat_fmt_proc_show+0x98/0x198
[ 133.419243] [<ffffffc0001e77e0>] seq_read+0x18c/0x3b4
[ 133.424277] [<ffffffc000222874>] proc_reg_read+0x8c/0xb4
[ 133.429571] [<ffffffc0001c5fcc>] vfs_read+0xa0/0x14c
[ 133.434520] [<ffffffc0001c6758>] SyS_read+0x58/0x94
cheers,
mohit