Do I have a bad board?

I just powered up one of the 5Bs that I purchased during the launch. Only this board has issues. They all have the same USB brick and heatsink/fan from Allnet.

What happens is that I run a few commands, get a bunch of kernel messages, and then it just freezes. Temps seem stable at around 32 °C.

Is it a hardware issue? If so, how have warranty issues been handled for anyone else? I haven’t heard back from Radxa or Allnet.

root@rock-5b:~# dmesg
[ 1.714110] Booting Linux on physical CPU 0x0000000000 [0x412fd050]
[ 1.714132] Linux version 5.10.66-27-rockchip-gea60d388902d (stephen@lara) (gcc (Ubuntu/Linaro 7.5.0-6ubuntu2) 7.5.0, GNU ld (GNU Binutils for Ubuntu) 2.34) #rockchip SMP Mon Oct 24 08:25:47 UTC 2022
[ 1.723299] Machine model: Radxa ROCK 5B
[ 1.723468] efi: UEFI not found.
[ 1.726493] OF: fdt: Reserved memory: failed to reserve memory for node ‘drm-logo@00000000’: base 0x0000000000000000, size 0 MiB
[ 1.726509] OF: fdt: Reserved memory: failed to reserve memory for node ‘drm-cubic-lut@00000000’: base 0x0000000000000000, size 0 MiB
[ 1.726556] Reserved memory: created CMA memory pool at 0x0000000010000000, size 256 MiB
[ 1.726561] OF: reserved mem: initialized node cma, compatible id shared-dma-pool
[ 1.989614] Zone ranges:
[ 1.989624] DMA [mem 0x0000000000200000-0x00000000ffffffff]
[ 1.989635] DMA32 empty
[ 1.989641] Normal [mem 0x0000000100000000-0x00000003ffefffff]
[ 1.989649] Movable zone start for each node
[ 1.989652] Early memory node ranges
[ 1.989657] node 0: [mem 0x0000000000200000-0x00000000efffffff]
[ 1.989664] node 0: [mem 0x0000000100000000-0x00000003fbffffff]
[ 1.989672] node 0: [mem 0x00000003fc500000-0x00000003ffefffff]
[ 1.989678] Initmem setup node 0 [mem 0x0000000000200000-0x00000003ffefffff]
[ 1.989685] On node 0 totalpages: 4126720
[ 1.989691] DMA zone: 15352 pages used for memmap
[ 1.989695] DMA zone: 0 pages reserved
[ 1.989700] DMA zone: 982528 pages, LIFO batch:63
[ 1.989705] Normal zone: 49148 pages used for memmap
[ 1.989710] Normal zone: 3144192 pages, LIFO batch:63
[ 2.076608] On node 0, zone Normal: 256 pages in unavailable ranges
[ 2.076769] psci: probing for conduit method from DT.
[ 2.076780] psci: PSCIv1.1 detected in firmware.
[ 2.076784] psci: Using standard PSCI v0.2 function IDs
[ 2.076790] psci: MIGRATE_INFO_TYPE not supported.
[ 2.076796] psci: SMC Calling Convention v1.2
[ 2.077177] percpu: Embedded 31 pages/cpu s88680 r8192 d30104 u126976
[ 2.077232] pcpu-alloc: s88680 r8192 d30104 u126976 alloc=31*4096
[ 2.077239] pcpu-alloc: [0] 0 [0] 1 [0] 2 [0] 3 [0] 4 [0] 5 [0] 6 [0] 7
[ 2.077404] Detected VIPT I-cache on CPU0
[ 2.077454] CPU features: detected: GIC system register CPU interface
[ 2.077459] CPU features: detected: Virtualization Host Extensions
[ 2.077469] CPU features: detected: ARM errata 1165522, 1319367, or 1530923
[ 2.077476] alternatives: patching kernel code
[ 2.077810] Built 1 zonelists, mobility grouping on. Total pages: 4062220
[ 2.077819] Kernel command line: root=UUID=67ad0e7b-3914-48d6-97c2-c48e5e0e405b console=ttyFIQ0 console=tty1 consoleblank=0 loglevel=0 panic=10 rootwait rw init=/sbin/init rootfstype=ext4 cgroup_enable=cpuset cgroup_memory=1 cgroup_enable=memory swapaccount=1 irqchip.gicv3_pseudo_nmi=0 switolb=1 coherent_pool=2M
[ 2.079749] Dentry cache hash table entries: 2097152 (order: 12, 16777216 bytes, linear)
[ 2.080464] Inode-cache hash table entries: 1048576 (order: 11, 8388608 bytes, linear)
[ 2.080471] mem auto-init: stack:off, heap alloc:off, heap free:off
[ 2.086318] software IO TLB: mapped [mem 0x00000000ec000000-0x00000000f0000000] (64MB)
[ 2.100498] BUG: Bad page state in process swapper pfn:020dc
[ 2.100511] page:(ptrval) refcount:0 mapcount:-52224 mapping:0000000000000000 index:0x0 pfn:0x20dc
[ 2.100517] flags: 0x0()
[ 2.100526] raw: 0000000000000000 fffffffeffe83708 fffffffeffe83708 0000000000000000
[ 2.100533] raw: 0000000000000000 0000000000000000 00000000ffff33ff 0000000000000000
[ 2.100537] page dumped because: nonzero mapcount
[ 2.100541] Modules linked in:
[ 2.100554] CPU: 0 PID: 0 Comm: swapper Not tainted 5.10.66-27-rockchip-gea60d388902d #rockchip
[ 2.100558] Hardware name: Radxa ROCK 5B (DT)
[ 2.100563] Call trace:
[ 2.100575] dump_backtrace+0x0/0x1a8
[ 2.100581] show_stack+0x2c/0x38
[ 2.100589] dump_stack_lvl+0xd0/0xfc
[ 2.100594] dump_stack+0x14/0x2c
[ 2.100601] bad_page+0xfc/0x100
[ 2.100606] check_free_page+0x90/0xa8
[ 2.100612] __free_pages_ok+0x1c4/0x1e8
[ 2.100617] __free_pages_core+0xa0/0xbc
[ 2.100623] memblock_free_pages+0x28/0x34
[ 2.100629] memblock_free_all+0x180/0x1e8
[ 2.100636] mem_init+0x64/0x84
[ 2.100642] start_kernel+0x2f4/0x574
[ 2.100646] Disabling lock debugging due to kernel taint
[ 2.100819] BUG: Bad page state in process swapper pfn:030dc
[ 2.100826] page:(ptrval) refcount:0 mapcount:-49152 mapping:0000000000000000 index:0x0 pfn:0x30dc
[ 2.100831] flags: 0x0()
[ 2.100838] raw: 0000000000000000 fffffffeffec3708 fffffffeffec3708 0000000000000000
[ 2.100845] raw: 0000000000000000 0000000000000000 00000000ffff3fff 0000000000000000
[ 2.100848] page dumped because: nonzero mapcount
[ 2.100851] Modules linked in:
[ 2.100861] CPU: 0 PID: 0 Comm: swapper Tainted: G B 5.10.66-27-rockchip-gea60d388902d #rockchip
[ 2.100864] Hardware name: Radxa ROCK 5B (DT)
[ 2.100868] Call trace:
[ 2.100874] dump_backtrace+0x0/0x1a8
[ 2.100880] show_stack+0x2c/0x38
[ 2.100885] dump_stack_lvl+0xd0/0xfc
[ 2.100891] dump_stack+0x14/0x2c
[ 2.100896] bad_page+0xfc/0x100
[ 2.100901] check_free_page+0x90/0xa8
[ 2.100906] __free_pages_ok+0x1c4/0x1e8
[ 2.100912] __free_pages_core+0xa0/0xbc
[ 2.100917] memblock_free_pages+0x28/0x34
[ 2.100922] memblock_free_all+0x180/0x1e8
[ 2.100928] mem_init+0x64/0x84
[ 2.100933] start_kernel+0x2f4/0x574
[ 2.101093] BUG: Bad page state in process swapper pfn:040dc
[ 2.101100] page:(ptrval) refcount:0 mapcount:-49152 mapping:0000000000000000 index:0x0 pfn:0x40dc
[ 2.101105] flags: 0x0()
[ 2.101112] raw: 0000000000000000 fffffffefff03708 fffffffefff03708 0000000000000000
[ 2.101118] raw: 0000000000000000 0000000000000000 00000000ffff3fff 0000000000000000
[ 2.101122] page dumped because: nonzero mapcount
[ 2.101125] Modules linked in:
[ 2.101134] CPU: 0 PID: 0 Comm: swapper Tainted: G B 5.10.66-27-rockchip-gea60d388902d #rockchip
[ 2.101138] Hardware name: Radxa ROCK 5B (DT)
[ 2.101141] Call trace:
[ 2.101147] dump_backtrace+0x0/0x1a8
[ 2.101153] show_stack+0x2c/0x38
[ 2.101158] dump_stack_lvl+0xd0/0xfc
[ 2.101164] dump_stack+0x14/0x2c
[ 2.101169] bad_page+0xfc/0x100
[ 2.101174] check_free_page+0x90/0xa8
[ 2.101179] __free_pages_ok+0x1c4/0x1e8
[ 2.101184] __free_pages_core+0xa0/0xbc
[ 2.101190] memblock_free_pages+0x28/0x34
[ 2.101195] memblock_free_all+0x180/0x1e8
[ 2.101200] mem_init+0x64/0x84
[ 2.101206] start_kernel+0x2f4/0x574
[ 2.101366] BUG: Bad page state in process swapper pfn:050dc
[ 2.101373] page:(ptrval) refcount:0 mapcount:-49152 mapping:0000000000000000 index:0x0 pfn:0x50dc
[ 2.101377] flags: 0x0()
[ 2.101384] raw: 0000000000000000 fffffffefff43708 fffffffefff43708 0000000000000000
[ 2.101391] raw: 0000000000000000 0000000000000000 00000000ffff3fff 0000000000000000
[ 2.101394] page dumped because: nonzero mapcount
[ 2.101397] Modules linked in:
[ 2.101407] CPU: 0 PID: 0 Comm: swapper Tainted: G B 5.10.66-27-rockchip-gea60d388902d #rockchip
[ 2.101411] Hardware name: Radxa ROCK 5B (DT)
[ 2.101414] Call trace:
[ 2.101420] dump_backtrace+0x0/0x1a8
[ 2.101425] show_stack+0x2c/0x38
[ 2.101431] dump_stack_lvl+0xd0/0xfc
[ 2.101436] dump_stack+0x14/0x2c
[ 2.101442] bad_page+0xfc/0x100
[ 2.101447] check_free_page+0x90/0xa8
[ 2.101452] __free_pages_ok+0x1c4/0x1e8
[ 2.101458] __free_pages_core+0xa0/0xbc
[ 2.101463] memblock_free_pages+0x28/0x34
[ 2.101468] memblock_free_all+0x180/0x1e8
[ 2.101474] mem_init+0x64/0x84
[ 2.101479] start_kernel+0x2f4/0x574
[ 2.101639] BUG: Bad page state in process swapper pfn:060dc
[ 2.101646] page:(ptrval) refcount:0 mapcount:-49152 mapping:0000000000000000 index:0x0 pfn:0x60dc
[ 2.101650] flags: 0x0()
[ 2.101657] raw: 0000000000000000 fffffffefff83708 fffffffefff83708 0000000000000000
[ 2.101664] raw: 0000000000000000 0000000000000000 00000000ffff3fff 0000000000000000
[ 2.101667] page dumped because: nonzero mapcount
[ 2.101671] Modules linked in:
[ 2.101680] CPU: 0 PID: 0 Comm: swapper Tainted: G B 5.10.66-27-rockchip-gea60d388902d #rockchip
[ 2.101683] Hardware name: Radxa ROCK 5B (DT)
[ 2.101687] Call trace:
[ 2.101693] dump_backtrace+0x0/0x1a8
[ 2.101698] show_stack+0x2c/0x38
[ 2.101704] dump_stack_lvl+0xd0/0xfc
[ 2.101709] dump_stack+0x14/0x2c
[ 2.101715] bad_page+0xfc/0x100
[ 2.101720] check_free_page+0x90/0xa8
[ 2.101725] __free_pages_ok+0x1c4/0x1e8
[ 2.101730] __free_pages_core+0xa0/0xbc
[ 2.101735] memblock_free_pages+0x28/0x34
[ 2.101741] memblock_free_all+0x180/0x1e8
[ 2.101747] mem_init+0x64/0x84
[ 2.101752] start_kernel+0x2f4/0x574
[ 2.101914] BUG: Bad page state in process swapper pfn:070dc
[ 2.101920] page:(ptrval) refcount:0 mapcount:-49152 mapping:0000000000000000 index:0x0 pfn:0x70dc
[ 2.101925] flags: 0x0()
[ 2.101932] raw: 0000000000000000 fffffffefffc3708 fffffffefffc3708 0000000000000000
[ 2.101938] raw: 0000000000000000 0000000000000000 00000000ffff3fff 0000000000000000
[ 2.101942] page dumped because: nonzero mapcount
[ 2.101945] Modules linked in:
[ 2.101954] CPU: 0 PID: 0 Comm: swapper Tainted: G B 5.10.66-27-rockchip-gea60d388902d #rockchip
[ 2.101958] Hardware name: Radxa ROCK 5B (DT)
[ 2.101961] Call trace:
[ 2.101967] dump_backtrace+0x0/0x1a8
[ 2.101972] show_stack+0x2c/0x38
[ 2.101978] dump_stack_lvl+0xd0/0xfc
[ 2.101983] dump_stack+0x14/0x2c
[ 2.101988] bad_page+0xfc/0x100
[ 2.101993] check_free_page+0x90/0xa8
[ 2.101999] __free_pages_ok+0x1c4/0x1e8
[ 2.102004] __free_pages_core+0xa0/0xbc
[ 2.102009] memblock_free_pages+0x28/0x34
[ 2.102015] memblock_free_all+0x180/0x1e8
[ 2.102021] mem_init+0x64/0x84
[ 2.102026] start_kernel+0x2f4/0x574
[ 2.102187] BUG: Bad page state in process swapper pfn:080dc
[ 2.102194] page:(ptrval) refcount:0 mapcount:-49152 mapping:0000000000000000 index:0x0 pfn:0x80dc
[ 2.102198] flags: 0x0()
[ 2.102205] raw: 0000000000000000 ffffffff00003708 ffffffff00003708 0000000000000000
[ 2.102212] raw: 0000000000000000 0000000000000000 00000000ffff3fff 0000000000000000
[ 2.102215] page dumped because: nonzero mapcount
[ 2.102219] Modules linked in:
[ 2.102228] CPU: 0 PID: 0 Comm: swapper Tainted: G B 5.10.66-27-rockchip-gea60d388902d #rockchip
[ 2.102231] Hardware name: Radxa ROCK 5B (DT)
[ 2.102235] Call trace:
[ 2.102241] dump_backtrace+0x0/0x1a8
[ 2.102246] show_stack+0x2c/0x38
[ 2.102252] dump_stack_lvl+0xd0/0xfc
[ 2.102257] dump_stack+0x14/0x2c
[ 2.102262] bad_page+0xfc/0x100
[ 2.102268] check_free_page+0x90/0xa8
[ 2.102273] __free_pages_ok+0x1c4/0x1e8
[ 2.102278] __free_pages_core+0xa0/0xbc
[ 2.102283] memblock_free_pages+0x28/0x34
[ 2.102289] memblock_free_all+0x180/0x1e8
[ 2.102294] mem_init+0x64/0x84
[ 2.102300] start_kernel+0x2f4/0x574
[ 2.102461] BUG: Bad page state in process swapper pfn:090dc
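For anyone reading a log like the one above, a small script can make the pattern in the "Bad page state" reports easier to see. The sketch below only extracts the reported PFNs and the spacing between them; it assumes 4 KiB pages (taken from the `alloc=31*4096` line in the log) and does not diagnose anything by itself. The function names are made up for illustration.

```python
import re

# Page size assumed from the "pcpu-alloc: ... alloc=31*4096" line in the log.
PAGE_SIZE = 4096

# Matches lines such as:
# [ 2.100498] BUG: Bad page state in process swapper pfn:020dc
BAD_PAGE_RE = re.compile(
    r"BUG: Bad page state in process (\S+)\s+pfn:([0-9a-fA-F]+)"
)


def bad_page_pfns(dmesg_text):
    """Return the PFNs of all 'Bad page state' reports, in log order."""
    return [int(m.group(2), 16) for m in BAD_PAGE_RE.finditer(dmesg_text)]


def summarize(pfns, page_size=PAGE_SIZE):
    """Return (physical addresses, sorted distinct strides between reports)."""
    addrs = [pfn * page_size for pfn in pfns]
    strides = sorted({b - a for a, b in zip(addrs, addrs[1:])})
    return addrs, strides


# First three reports copied from the log above.
sample = """\
[ 2.100498] BUG: Bad page state in process swapper pfn:020dc
[ 2.100819] BUG: Bad page state in process swapper pfn:030dc
[ 2.101093] BUG: Bad page state in process swapper pfn:040dc
"""

pfns = bad_page_pfns(sample)
addrs, strides = summarize(pfns)
print([hex(a) for a in addrs])    # physical addresses of the bad pages
print([hex(s) for s in strides])  # distinct gaps between consecutive reports
```

In the log above the reported PFNs go 0x20dc, 0x30dc, 0x40dc and so on, so with 4 KiB pages the reports are 0x1000000 bytes (16 MiB) apart. A regular spacing like that may be worth including when describing the fault to support.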

@RadxaYuntian @jack

I’m not sure whom to contact. I’ve exhausted the published means of contacting support.

@darkmode

What power adapter are you using for this ROCK 5B? What’s the voltage negotiated? Output of the sensor command?

@jack :wave:t4: I’m using the Radxa branded one from Allnet.

The other 3 5Bs I have boot fine and are stable using the same eMMC and power adapter. The symptom also persists when using PoE: it boots, runs for a while, then the terminal freezes.

Having more than one board certainly helps with debugging.
Are you sure they all have the same bootloader and SPI contents? Do you have a serial console to watch the boot process, as well as a power meter to see the power negotiation? A serial console may also be able to dump more information when it’s not possible to write it to the logs.

Early bootloaders had some issues with PD that caused boot loops, but by playing with the RK dev tools it’s somehow possible to revert the bootloader to an older one. I’m not sure how that happens, but I’ve seen it a few times. The bootloader version is in one of the first lines of output, starting with DDR…

Yes. I managed to run md5sum against the SPI partition and the normal bootloader image. It boots fine. It’s just that as soon as any command is issued, these recurring CPU messages show up, then it just freezes.

Sounds ugly. Do you have access to a UART console?
Does anything heat up?

Yep it gets warm. I was able to make contact and RMA the board. The other 3 are all running strong without issues.

It should be quite easy to pick out a faulty board among others that use the same cables, power, etc., and to compare what may be different about it. Did you have any chance to check where that heat comes from? The SoC? Was it stable under heavy cooling?
I have two more new ROCK 5 boards not yet unpacked; hopefully they are OK :slight_smile: I will have a chance to test them around June.

The SoC with the tall Radxa heatsink was barely warm to the touch. Temps were about 32 degrees under load while running ‘apt upgrade’ before it started locking up. It would also lock up just sitting idle.

I tried multiple approaches. Swapping out the large heatsink for the heatsink/fan combo that finally arrived made no difference. My last, janky test was placing the board directly behind a PowerEdge server exhaust fan. Same results.

Conversely, my other 3 boards are running container loads with just a thin copper shim acting as a heatsink and a PoE HAT fan combo. Absolutely no cooling issues in a 3D-printed case. Haha

https://www.amazon.com/dp/B08LPT9PYX?ref=ppx_pop_mob_ap_share


After returning the board it appears the “pmu was burned”. Probably got shorted somehow. :man_shrugging:t5:

I will have a replacement back shortly!

Thanks for this insight.
I’m having problems with one of my R4B+ boards that I need to RMA, and I always want to know how much I can do to diagnose the problem myself. I could not get NVMe to work on that board, and it eventually failed.

BTW, how does the ROCK 5 work with PoE?

PoE has been working flawlessly on my 2.5Gb Mokerlink switch. I only wish it was a managed switch, but this will do fine.

Do you get full 2.5G speed? What PoE budget is needed with NVMe?
Is PoE+ enough?

I do get full 2.5G speed. I have the 8 port and it should work out to 15W per port (maybe less?).

Well this is a bummer.

I got the “fixed” board back but it has the same failure. I made sure to be super careful too. My other 3 boards are still doing fine.