In the current 5.8 series, a WiFi firmware crash is game over: you have to reload the module, which can be hard to do without WiFi!
Broadcom engineers introduced a commit for 5.9 that automatically resets the firmware when it crashes. However, I found that this commit caused a kernel BUG that left the WiFi just as inaccessible, if not more so:
[Mon Sep 7 13:20:57 2020] ieee80211 phy0: brcmf_fw_crashed: Firmware has halted or crashed
[Mon Sep 7 13:20:57 2020] ieee80211 phy0: brcmf_fil_cmd_data: bus is down. we have nothing to do.
[Mon Sep 7 13:20:57 2020] ieee80211 phy0: brcmf_cfg80211_get_tx_power: error (-5)
[Mon Sep 7 13:20:58 2020] brcmfmac: brcmf_sdiod_probe: Failed to set F1 blocksize
[Mon Sep 7 13:20:58 2020] ------------[ cut here ]------------
[Mon Sep 7 13:20:58 2020] Kernel BUG at __slab_free.constprop.0+0x142/0x280 [verbose debug info unavailable]
[Mon Sep 7 13:20:58 2020] invalid opcode: 0000 [#1] SMP PTI
[Mon Sep 7 13:20:58 2020] CPU: 3 PID: 32 Comm: kworker/3:1 Tainted: G E 5.8.7 #1
[Mon Sep 7 13:20:58 2020] Workqueue: events brcmf_core_bus_reset [brcmfmac]
[Mon Sep 7 13:20:58 2020] RIP: 0010:__slab_free.constprop.0+0x142/0x280
[Mon Sep 7 13:20:58 2020] Code: 18 e9 04 ff ff ff 9c 5f fa f0 49 0f ba 2c 24 00 72 71 4d 3b 6c 24 20 74 13 49 0f ba 34 24 00 57 9d eb aa 80 4c 24 4b 80 eb 8b <0f> 0b 49 3b 54 24 28 75 e6 49 89 5c 24 20 49 89 4c 24 28 49 0f ba
[Mon Sep 7 13:20:58 2020] RSP: 0000:ffffb92f8015bd80 EFLAGS: 00010246
[Mon Sep 7 13:20:58 2020] RAX: ffff8f55f7f5b930 RBX: ffff8f55f7f5b900 RCX: ffff8f55f7f5b900
[Mon Sep 7 13:20:58 2020] RDX: 00000000802a001e RSI: fffffcf781dfd6c0 RDI: ffff8f558008d500
[Mon Sep 7 13:20:58 2020] RBP: ffffb92f8015be10 R08: 0000000000000001 R09: ffff8f55f7f5b900
[Mon Sep 7 13:20:58 2020] R10: 0000000000005d34 R11: 0000000000000008 R12: fffffcf781dfd6c0
[Mon Sep 7 13:20:58 2020] R13: ffff8f55f7f5b900 R14: ffff8f558008d500 R15: 0000000000000000
[Mon Sep 7 13:20:58 2020] FS: 0000000000000000(0000) GS:ffff8f55fad80000(0000) knlGS:0000000000000000
[Mon Sep 7 13:20:58 2020] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[Mon Sep 7 13:20:58 2020] CR2: 00007fd5680dec00 CR3: 0000000073e82000 CR4: 00000000001006e0
[Mon Sep 7 13:20:58 2020] Call Trace:
[Mon Sep 7 13:20:58 2020] ? recalibrate_cpu_khz+0x10/0x10
[Mon Sep 7 13:20:58 2020] ? ktime_get_mono_fast_ns+0x49/0x90
[Mon Sep 7 13:20:58 2020] ? rpm_suspend+0x8e/0x540
[Mon Sep 7 13:20:58 2020] ? __wake_up_common_lock+0x86/0xc0
[Mon Sep 7 13:20:58 2020] kfree+0x1a8/0x1c0
[Mon Sep 7 13:20:58 2020] brcmf_sdiod_remove+0x3b/0xa0 [brcmfmac]
[Mon Sep 7 13:20:58 2020] brcmf_sdiod_probe+0x157/0x1e0 [brcmfmac]
[Mon Sep 7 13:20:58 2020] brcmf_sdio_bus_reset+0x4e/0x90 [brcmfmac]
[Mon Sep 7 13:20:58 2020] process_one_work+0x185/0x2e0
[Mon Sep 7 13:20:58 2020] worker_thread+0x4c/0x3a0
[Mon Sep 7 13:20:58 2020] ? rescuer_thread+0x370/0x370
[Mon Sep 7 13:20:58 2020] kthread+0x113/0x130
[Mon Sep 7 13:20:58 2020] ? __kthread_bind_mask+0x60/0x60
[Mon Sep 7 13:20:58 2020] ret_from_fork+0x22/0x30
I narrowed the kernel BUG during the reset after a firmware crash down to a double free that can occur under certain conditions, and was able to fix it! Now when the WiFi firmware crashes, it successfully resets itself. I hope to get this commit upstream. For now, it's included in the patch set for 5.8.
[Mon Sep 7 15:43:37 2020] ieee80211 phy0: brcmf_fw_crashed: Firmware has halted or crashed
[Mon Sep 7 15:43:37 2020] ieee80211 phy0: brcmf_fil_cmd_data: bus is down. we have nothing to do.
[Mon Sep 7 15:43:37 2020] ieee80211 phy0: brcmf_cfg80211_get_tx_power: error (-5)
[Mon Sep 7 15:43:37 2020] ieee80211 phy0: brcmf_netdev_start_xmit: xmit rejected state=0
[Mon Sep 7 15:43:38 2020] brcmfmac: brcmf_sdiod_probe: Failed to set F1 blocksize
[Mon Sep 7 15:43:38 2020] brcmfmac: brcmf_sdio_bus_reset: Failed to probe after sdio device reset: ret -123
[Mon Sep 7 15:43:38 2020] mmc1: card 0001 removed
[Mon Sep 7 15:43:38 2020] mmc1: new ultra high speed SDR104 SDIO card at address 0001
[Mon Sep 7 15:43:38 2020] brcmfmac: brcmf_fw_alloc_request: using brcm/brcmfmac43455-sdio for chip BCM4345/6
[Mon Sep 7 15:43:38 2020] brcmfmac: brcmf_fw_alloc_request: using brcm/brcmfmac43455-sdio for chip BCM4345/6
[Mon Sep 7 15:43:38 2020] brcmfmac: brcmf_c_process_clm_blob: no clm_blob available (err=-2), device may have limited channels available
[Mon Sep 7 15:43:38 2020] brcmfmac: brcmf_c_preinit_dcmds: Firmware: BCM4345/6 wl0: Jun 12 2020 12:11:45 version 7.45.96.66 (c7a32cb@shgit) (r745790) FWID 01-cffa7eb1
[Mon Sep 7 15:43:38 2020] ieee80211 phy1: brcmf_netdev_set_mac_address: Setting cur_etheraddr failed, -52
[Mon Sep 7 15:43:45 2020] IPv6: ADDRCONF(NETDEV_CHANGE): wlan0: link becomes ready
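To illustrate the kind of double free described above: in the crash trace, kfree is reached from brcmf_sdiod_remove, which is called on the failure path of brcmf_sdiod_probe during brcmf_sdio_bus_reset. The sketch below is a userspace toy, not the actual brcmfmac code or the actual patch. The names toy_dev, toy_remove, toy_probe, and toy_reset are hypothetical, and the guard shown (clearing the pointer after freeing it so a second teardown is harmless) is just one common way to make such a path safe.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for per-device driver state; not brcmfmac code. */
struct toy_dev {
    int *buf;        /* buffer owned by the device */
    int free_calls;  /* how many times buf was actually freed */
};

/* Teardown that is safe to call more than once. */
void toy_remove(struct toy_dev *dev)
{
    if (dev->buf) {
        free(dev->buf);
        dev->free_calls++;
    }
    /* Without this line, a second toy_remove() would free buf again. */
    dev->buf = NULL;
}

/* Probe that cleans up after itself when it fails. */
int toy_probe(struct toy_dev *dev, int simulate_failure)
{
    /* Probing over a live buffer would leak it. */
    assert(dev->buf == NULL);

    dev->buf = calloc(16, sizeof(*dev->buf));
    if (!dev->buf)
        return -1;

    if (simulate_failure) {
        toy_remove(dev);  /* first teardown, on the probe failure path */
        return -1;
    }
    return 0;
}

/* Reset: tear down, re-probe, and tear down again if the probe failed. */
int toy_reset(struct toy_dev *dev, int simulate_failure)
{
    int ret;

    toy_remove(dev);
    ret = toy_probe(dev, simulate_failure);
    if (ret) {
        /* Second teardown: a no-op here because buf was set to NULL. */
        toy_remove(dev);
    }
    return ret;
}
```

With a zero-initialized struct toy_dev, a successful toy_probe followed by a toy_reset whose probe fails runs toy_remove three times, but only the calls that find a live pointer free anything. Dropping the `dev->buf = NULL` assignment turns the last call into a second free of the same allocation, which is the class of bug the allocator trips over in the trace.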
I will note that there is now quite a bit of noise in the logs after one of the firmware crashes. The kernel/SBC had not reset and I came back to working WiFi, so it did recover, but in one case it might have taken ~15 minutes to do so. The remaining cases appear to have recovered instantly.
[Mon Sep 7 15:53:53 2020] ieee80211 phy1: brcmf_cfg80211_get_station: GET STA INFO failed, -52
[Mon Sep 7 15:53:53 2020] ieee80211 phy1: brcmf_cfg80211_get_station: GET STA INFO failed, -52
[Mon Sep 7 15:53:59 2020] ieee80211 phy1: brcmf_cfg80211_get_station: GET STA INFO failed, -52
[Mon Sep 7 15:53:59 2020] ieee80211 phy1: brcmf_cfg80211_get_station: GET STA INFO failed, -52
[Mon Sep 7 15:54:05 2020] ieee80211 phy1: brcmf_cfg80211_get_station: GET STA INFO failed, -52
[Mon Sep 7 15:54:05 2020] ieee80211 phy1: brcmf_cfg80211_get_station: GET STA INFO failed, -52
[Mon Sep 7 15:54:11 2020] ieee80211 phy1: brcmf_cfg80211_get_station: GET STA INFO failed, -52