Help! kern.log is filling with wifi errors!

Hi- (this is a 4B model)

I have been building my system from rockpi4_debian_stretch_lxde_armhf_20181105_2120-gpt.img and applied all upgrades per the Radxa wiki guide. tail -f shows this little guy has been getting waves of this pair of wifi driver errors, about 20/sec, then nothing for a few minutes or so. kern.log has grown to 3MB in just the last 24 hours.

Apr 13 17:27:25 wx kernel: [18023.905428] _tdata_psh_info_pool_deq 200: Out of tdata_disc_grp
Apr 13 17:27:25 wx kernel: [18023.905442] dhd_tcpdata_info_get 1092: No more free tdata_psh_info!!
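
To put a number on those waves, here is a minimal Python sketch that counts how many of the two messages above land in each one-second syslog timestamp. The regular expression is an assumption based only on the two lines quoted here, and the function names are made up for illustration.

```python
import re
from collections import Counter

# Assumed line layout, taken from the two kern.log lines quoted above:
# "<Mon> <day> <HH:MM:SS> <host> kernel: [<uptime>] <function> <line>: <text>"
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+ \d{2}:\d{2}:\d{2}) \S+ kernel: \[\s*\d+\.\d+\] "
    r"(?P<func>_tdata_psh_info_pool_deq|dhd_tcpdata_info_get) \d+: "
)


def count_per_second(lines):
    """Return a Counter mapping syslog timestamp -> number of matching lines."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match:
            counts[match.group("ts")] += 1
    return counts


def top_seconds(path="/var/log/kern.log", n=10):
    """Return the n busiest seconds in the given log file (hypothetical helper)."""
    with open(path, errors="replace") as fh:
        return count_per_second(fh).most_common(n)
```

Calling top_seconds() on the board would list the busiest seconds first, which makes it easier to line the bursts up with whatever was happening on the network at the time.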

I see this error mentioned in other threads along with an SD driver issue, however the wifi issue does not seem to have a resolution that I could find.

Did I miss a driver update?
Thanks,
-K

Hi,

Which kernel version (uname -r)?

4.4.154-82-rockchip-00022-gb99b90e

In the meantime, I’ve turned on logrotate to gzip it on a daily schedule.
I’ve noticed most log entries happen when running an editor over putty (such as moving the cursor around) or doing a file transfer.

This message comes from the wifi driver source:
drivers/net/wireless/rockchip_wlan/rkwifi/bcmdhd/dhd_ip.c
Not sure why it is being printed.

Is the wifi working fine? A lot of traffic over wifi?

Yes, the wifi driver. I’ve seen a similar message reported in other forum threads, so this issue is not unique to my configuration.

As mentioned previously, this seems to happen more under traffic load, not a lot of traffic, but lots of little packets. For example, I’m running headless, using SSH with putty from a pc, and I will get waves of these errors logged when editing and moving the cursor around (using nano, for example). These are all packets with tiny payloads.

I’ve further modified the system’s logging config to isolate kernel messages to their own log. I don’t care so much anymore about the log filling up, but having megabytes of useless log messages will conceal real issues, and the wifi driver should be fixed so it does not throw exceptions in the first place.

btw, this morning the kernel threw the following to every interface:

Message from syslogd@wx at Apr 21 02:05:20 …
kernel:[220660.347181] BUG: spinlock wrong CPU on CPU#0, dhd_dpc/446

Message from syslogd@wx at Apr 21 02:05:20 …
kernel:[220660.347219] lock: 0xffffffc0de4980d8, .magic: dead4ead, .owner: dhd_dpc/446, .owner_cpu: 1

so there is definitely something weird with the wireless driver!

Configured additional logging and got the following trace:

Apr 27 00:18:09 wx kernel: [39759.443091] BUG: spinlock wrong CPU on CPU#0, dhd_dpc/383
Apr 27 00:18:09 wx kernel: [39759.443128] lock: 0xffffffc0e07700d8, .magic: dead4ead, .owner: dhd_dpc/383, .owner_cpu: 1
Apr 27 00:18:09 wx kernel: [39759.443155] CPU: 0 PID: 383 Comm: dhd_dpc Not tainted 4.4.154-82-rockchip-00022-gb99b90e #2
Apr 27 00:18:09 wx kernel: [39759.443169] Hardware name: ROCK PI 4B (DT)
Apr 27 00:18:09 wx kernel: [39759.443183] Call trace:
Apr 27 00:18:09 wx kernel: [39759.443214] [] dump_backtrace+0x0/0x220
Apr 27 00:18:09 wx kernel: [39759.443239] [] show_stack+0x24/0x30
Apr 27 00:18:09 wx kernel: [39759.443264] [] dump_stack+0x98/0xc0
Apr 27 00:18:09 wx kernel: [39759.443288] [] spin_dump+0x84/0xa4
Apr 27 00:18:09 wx kernel: [39759.443309] [] spin_bug+0x30/0x3c
Apr 27 00:18:09 wx kernel: [39759.443332] [] do_raw_spin_unlock+0xac/0xd0
Apr 27 00:18:09 wx kernel: [39759.443355] [] _raw_spin_unlock_irqrestore+0x24/0x34
Apr 27 00:18:09 wx kernel: [39759.444690] [] dhd_rx_frame+0x41c/0x5bc [bcmdhd]
Apr 27 00:18:09 wx kernel: [39759.446001] [] dhdsdio_readframes+0x12a8/0x1450 [bcmdhd]
Apr 27 00:18:09 wx kernel: [39759.447286] [] dhd_bus_dpc+0x77c/0xb40 [bcmdhd]
Apr 27 00:18:09 wx kernel: [39759.448565] [] dhd_dpc_thread+0x110/0x188 [bcmdhd]
Apr 27 00:18:09 wx kernel: [39759.448591] [] kthread+0xe0/0xf0
Apr 27 00:18:09 wx kernel: [39759.448613] [] ret_from_fork+0x10/0x20
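
For anyone reading the trace: as far as I understand it, with spinlock debugging enabled the kernel records which CPU took a lock and reports "wrong CPU" if the unlock happens on a different one, which matches the do_raw_spin_unlock / spin_bug frames above. The Python sketch below just pulls the relevant fields out of the two report lines so the mismatch is easy to see. The regular expressions and the parse_spin_bug name are assumptions based only on the lines in this thread.

```python
import re

# Patterns assumed from the two report lines quoted in this thread.
BUG_RE = re.compile(
    r"BUG: spinlock (?P<reason>.+?) on CPU#(?P<cpu>\d+), (?P<task>\S+)/(?P<pid>\d+)"
)
LOCK_RE = re.compile(
    r"\.owner: (?P<owner>\S+)/(?P<owner_pid>-?\d+), \.owner_cpu: (?P<owner_cpu>-?\d+)"
)


def parse_spin_bug(bug_line, lock_line):
    """Extract reporting CPU, owner CPU and task info; None if a line does not match."""
    bug = BUG_RE.search(bug_line)
    lock = LOCK_RE.search(lock_line)
    if not bug or not lock:
        return None
    return {
        "reason": bug.group("reason"),
        "reporting_cpu": int(bug.group("cpu")),
        "owner_cpu": int(lock.group("owner_cpu")),
        "task": bug.group("task"),
        # True when the task reporting the bug is also the recorded lock owner.
        "same_task": (bug.group("task"), bug.group("pid"))
        == (lock.group("owner"), lock.group("owner_pid")),
    }
```

Fed the first two lines of the Apr 27 trace, this shows the same dhd_dpc task as both reporter and owner, with the report on CPU 0 and the recorded owner on CPU 1.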

More data: I have been transferring 300GB of data via wifi over the last 12hrs, no errors. These are almost all big packets to the RockPi. The issue seems to happen only with small ‘keystroke’ packets when using nano over SSH/putty.

If I fire up an editor over SSH and just scroll around using the arrow keys, the errors will flood consistently.
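
A rough way to imitate that keystroke pattern without a person holding down the arrow keys is to send many tiny TCP writes with Nagle's algorithm disabled and wait for each echo. The Python sketch below is only an approximation: it is not SSH, it needs an echo service on the other end (serve_echo here is a stand-in), and I have not verified that it triggers the driver messages. The function names, the default interval and the 3-byte "down arrow" payload are all assumptions for illustration.

```python
import socket
import time


def serve_echo(server_sock):
    """Accept one connection on an already-listening socket and echo until it closes."""
    conn, _ = server_sock.accept()
    with conn:
        while True:
            data = conn.recv(4096)
            if not data:
                break
            conn.sendall(data)


def send_keystrokes(host, port, count=200, interval=0.02, payload=b"\x1b[B"):
    """Send `count` tiny TCP writes, waiting for each echo; return total bytes echoed."""
    echoed = 0
    with socket.create_connection((host, port), timeout=5) as sock:
        # Disable Nagle so each small write goes out as its own segment.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
        for _ in range(count):
            sock.sendall(payload)
            need = len(payload)
            while need:
                chunk = sock.recv(need)
                if not chunk:  # peer closed early
                    return echoed
                need -= len(chunk)
                echoed += len(chunk)
            time.sleep(interval)
    return echoed
```

One possible setup is to run serve_echo on the board behind a listening socket on the wifi interface and call send_keystrokes from the PC while watching kern.log with tail -f.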

Strange. In the meantime, I’ve switched to cabled Ethernet and keep the wireless interface enabled only as a backup means to log in should the cabled interface fail for whatever reason. In the last couple of months, I’ve rebooted and re-homed the wireless a few times, but the small-packet issues (particularly with nano editing sessions) persist. This is perhaps one of the most bizarre driver issues I’ve seen… why would small packets cause it, when streaming hundreds of gigabytes of data in bulk is no problem at all?

Well, with the combination of log rotation and moving off the wifi, the issue is isolated and not impactful.

as from my apr-15 post