Ubuntu server becomes unresponsive every few days (RockPi4B 4GB) - bcmsdh_sdmmc?

I have my Rock Pi set up with an MMC card for boot, and Ubuntu running on an NVMe disk. It’s running headless, with a wired Ethernet connection to my home LAN.

Every few days the server becomes non-responsive. A hard reboot (unplugging the power) fixes the issue, but in trying to get to the bottom of it I found that /var/log/syslog had thousands of these entries when the problem last occurred:

May 2 16:56:25 server kernel: [772129.400159] bcmsdh_sdmmc: Failed to Write byte F1:@0x1001f=00, Err: -5
May 2 16:56:25 server kernel: [772129.402729] bcmsdh_sdmmc: Failed to Write byte F1:@0x1001f=00, Err: -5
May 2 16:56:25 server kernel: [772129.405394] bcmsdh_sdmmc: Failed to Write byte F1:@0x1001f=00, Err: -5
May 2 16:56:25 server kernel: [772129.407602] bcmsdh_sdmmc: Failed to Read byte F1:@0x1001f=ff, Err: -5
May 2 16:56:25 server kernel: [772129.410265] bcmsdh_sdmmc: Failed to Read byte F1:@0x1001f=ff, Err: -5
May 2 16:56:25 server kernel: [772129.412784] bcmsdh_sdmmc: Failed to Read byte F1:@0x1001f=ff, Err: -5

Based on the continuing log entries (interspersed occasionally with more normal messages), the server is still running, but it is not responding to network requests (or they time out).
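To see when the errors start and how dense they are, the matching lines can be tallied per minute. This is only a rough sketch: the function name count_sdmmc_errors is made up for illustration, and it assumes the classic syslog timestamp layout shown in the entries above (month, day, time as the first three fields).

```shell
# Count "bcmsdh_sdmmc: Failed to ..." lines per minute.
# Defaults to /var/log/syslog (the file named in the post); pass another path to override.
count_sdmmc_errors() {
    logfile="${1:-/var/log/syslog}"
    # Keep only the driver errors, reduce each line to "Mon D HH:MM",
    # then count consecutive identical minutes (the log is already in time order).
    grep 'bcmsdh_sdmmc: Failed to' "$logfile" \
        | awk '{ print $1, $2, substr($3, 1, 5) }' \
        | uniq -c
}
```

Running `count_sdmmc_errors` (or `count_sdmmc_errors /var/log/syslog.1` for a rotated log) prints one line per minute with the number of errors in front, which should make it easier to line the flood up with the time the server stopped answering.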

The Radxa repository has been added and the various rockchip addons installed, and apt-get update && apt-get upgrade are run regularly.

Any ideas?

I blacklisted that module and it seems to have solved my problems. I also use a wired connection, so this works for me.

I have the same issue. For now I’ve added a cron job to reboot it daily, which isn’t really a good fix. It has already happened once that I still needed to reboot it manually.

@smlikens How did you blacklist that module?

I traced bcmsdh_sdmmc back to bcmdhd using the lsmod command, which some googling appeared to confirm to be correct.

I then simply added bcmdhd to the bottom of the /etc/modprobe.d/blacklist.conf file as:
blacklist bcmdhd

Rebooted and so far so good (although obviously no wifi)…
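For anyone who wants to script the same steps, here is a minimal sketch. The helper names blacklist_module and module_loaded are made up for illustration; the file path and module name are the ones given above. It assumes bcmdhd is built as a loadable module (which the lsmod output suggests) and that the function is run as root, since the file lives under /etc.

```shell
# Append "blacklist <module>" to a modprobe config file unless the line is already there.
# The default path is the one from the post; writing to it needs root.
blacklist_module() {
    mod="$1"
    conf="${2:-/etc/modprobe.d/blacklist.conf}"
    if grep -qx "blacklist $mod" "$conf" 2>/dev/null; then
        echo "$mod is already blacklisted in $conf"
    else
        echo "blacklist $mod" >> "$conf"
        echo "added 'blacklist $mod' to $conf"
    fi
}

# Succeed (exit status 0) if lsmod lists the module as currently loaded.
module_loaded() {
    lsmod | awk -v m="$1" 'NR > 1 && $1 == m { found = 1 } END { exit !found }'
}
```

Usage would be `blacklist_module bcmdhd` in a root shell, then a reboot, then `module_loaded bcmdhd || echo "bcmdhd is not loaded"` to confirm. Note that a blacklist entry only stops the module from being loaded automatically; it can still be loaded by hand with modprobe if wifi is needed later.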

Thanks to @smlikens for the tip :smiley:

Had the same issue two days ago. It was accompanied by a chip hot enough to hurt when touching the passive cooler. Pretty sure that isn’t good. I only got the board a few days ago and haven’t even done much with the machine yet. :confused:

I’ve since deactivated the wifi through NetworkManager though, as I’m not using it at home and it was spamming the logs every five minutes or so with scans and attempted changes to the MAC address or some such, for whatever reason.
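The post doesn’t say how the wifi was turned off, so the following is only one possible way, assuming NetworkManager’s nmcli tool is installed. The wrapper name disable_wifi_radio is made up for illustration.

```shell
# Turn the wifi radio off via NetworkManager and print the resulting state.
disable_wifi_radio() {
    if ! command -v nmcli >/dev/null 2>&1; then
        echo "nmcli not found; is NetworkManager installed?" >&2
        return 127
    fi
    # "radio wifi off" disables the radio; "radio wifi" alone reports enabled/disabled.
    nmcli radio wifi off && nmcli radio wifi
}
```

This only switches the radio off; the driver module stays loaded, so it may or may not stop the bcmsdh_sdmmc errors discussed earlier in the thread.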

I saved the logs from that day in case they could be helpful, though.

Do you have the small heatsink or the big black one? The big one is much better. The peak power consumption I’ve logged when running hot is about 8.5 W, so it’s not too bad.

Only got the small heatsink that came with the case.
I’m not expecting to tax the machine too much though, mostly basic server administration and a few docker containers, running as various ondemand services only, I expect.
But if that particular crash or freeze or whatever happens again I might not realise it in time… I suppose I’ll monitor the situation closely.
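One way to keep an eye on it without touching the heatsink is to read the kernel’s thermal sensor. A rough sketch follows, with some assumptions: the names read_temp_c and warn_if_hot are made up, and thermal_zone0 being the SoC sensor on this board is a guess that should be checked against the `type` file in the same directory. The sysfs value itself is in millidegrees Celsius.

```shell
# Print the temperature in whole degrees Celsius.
# The default sensor path is an assumption; pass another path to override it.
read_temp_c() {
    zone="${1:-/sys/class/thermal/thermal_zone0/temp}"
    millideg=$(cat "$zone") || return 1
    echo $(( millideg / 1000 ))
}

# Print a warning line if the temperature is at or above the given limit (in C).
warn_if_hot() {
    limit="$1"
    temp=$(read_temp_c "$2") || return 1
    if [ "$temp" -ge "$limit" ]; then
        echo "WARNING: temperature ${temp}C is at or above ${limit}C"
    fi
}
```

Something like `warn_if_hot 75` could be run from a cron job every few minutes, with the output mailed or logged; the 75 is just an example threshold, not a recommended limit for this board.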

Same here: RockPi 4B 4GB with the big heatsink and a 30x30 fan.

After blacklisting bcmdhd the issue is gone, but there is no wifi.