Rock5B RMA procedure?

Hi,

I got two Rock5B boards from AllNet China, but it looks like one of the two has a hardware problem.
It randomly crashes with kernel stack traces after a few minutes of running any stress test suite such as stress-ng (on both the Radxa-produced images and Armbian).

The other works fine.

Has anyone tried to RMA their device?

Paul

The most common cause of an unstable system like this has been memory reads occasionally returning corrupted data.

It is apparently fixed by newer DDR initialization code, but lowering the maximum frequency might work around the problem, or at least can be used to test whether you are affected by it:

$ echo 1560000000 | sudo tee /sys/class/devfreq/dmc/max_freq
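If you want to try several caps, a small helper can refuse values the driver does not advertise. This is only a sketch: the function names and the `DMC_DIR` variable are made up for illustration, and it assumes the board's dmc driver exposes the standard devfreq `available_frequencies` file next to `max_freq`.

```shell
#!/bin/sh
# Sketch: cap the DDR (dmc) devfreq clock, but only to an advertised value.
# DMC_DIR is a hypothetical override so the logic can be tried on a scratch directory.
DMC_DIR="${DMC_DIR:-/sys/class/devfreq/dmc}"

# Print the advertised frequencies (in Hz), one per line.
dmc_list_freqs() {
    tr ' ' '\n' < "$DMC_DIR/available_frequencies" | grep -v '^$'
}

# Write $1 to max_freq if the driver advertises it; return 1 otherwise.
dmc_set_max() {
    if dmc_list_freqs | grep -qxF -- "$1"; then
        echo "$1" > "$DMC_DIR/max_freq"
    else
        echo "frequency $1 is not advertised in $DMC_DIR" >&2
        return 1
    fi
}
```

Writing to sysfs needs root, so source the file from a root shell and call, for example, `dmc_set_max 1560000000`. The setting does not persist across a reboot.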

Please update to the latest image with the newer rkbin blob. Also, check the power supply voltage with the sensors command.

Where can I find the latest image? Is it this one:
https://github.com/radxa/debos-radxa/releases/download/20221031-1045/rock-5b-debian-bullseye-xfce4-arm64-20221031-1558-gpt.img.xz

I’ll report the sensors output if I manage to boot it up without crashing.
This is the sensors output on the other board, same power supply:

admin@rock-5b:~$ sensors
gpu_thermal-virtual-0
Adapter: Virtual device
temp1: +30.5°C

littlecore_thermal-virtual-0
Adapter: Virtual device
temp1: +31.5°C

bigcore0_thermal-virtual-0
Adapter: Virtual device
temp1: +30.5°C

tcpm_source_psy_4_0022-i2c-4-22
Adapter: rk3x-i2c
in0: 20.00 V (min = +20.00 V, max = +20.00 V)
curr1: 3.00 A (max = +3.00 A)

npu_thermal-virtual-0
Adapter: Virtual device
temp1: +30.5°C

center_thermal-virtual-0
Adapter: Virtual device
temp1: +30.5°C

bigcore1_thermal-virtual-0
Adapter: Virtual device
temp1: +30.5°C

soc_thermal-virtual-0
Adapter: Virtual device
temp1: +31.5°C (crit = +115.0°C)

admin@rock-5b:~$

I loaded the latest Debian image onto the eMMC and booted it.
This is the output of sensors:

root@rock-5b:~# sensors
gpu_thermal-virtual-0
Adapter: Virtual device
temp1: +33.3°C

littlecore_thermal-virtual-0
Adapter: Virtual device
temp1: +34.2°C

bigcore0_thermal-virtual-0
Adapter: Virtual device
temp1: +34.2°C

tcpm_source_psy_4_0022-i2c-4-22
Adapter: rk3x-i2c
in0: 20.00 V (min = +20.00 V, max = +20.00 V)
curr1: 2.25 A (max = +2.25 A)

npu_thermal-virtual-0
Adapter: Virtual device
temp1: +33.3°C

center_thermal-virtual-0
Adapter: Virtual device
temp1: +33.3°C

bigcore1_thermal-virtual-0
Adapter: Virtual device
temp1: +34.2°C

soc_thermal-virtual-0
Adapter: Virtual device
temp1: +34.2°C (crit = +115.0°C)

root@rock-5b:~#
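The two sensors dumps in this thread differ in the curr1 line of the tcpm_source_psy block (3.00 A in one, 2.25 A in the other). To compare such readings quickly, here is a rough sketch that turns that block into a wattage figure. The function name `pd_budget` is made up, and treating in0 and curr1 as the negotiated USB-PD voltage and current limit is an assumption about what those hwmon values mean.

```shell
# Sketch: read `sensors` output on stdin and print volts, amps and watts
# taken from the tcpm_source_psy block (assumed to describe the PD contract).
pd_budget() {
    awk '
        /^tcpm_source_psy/      { in_pd = 1; next }   # start of the PD source block
        /^$/                    { in_pd = 0 }         # a blank line ends a block
        in_pd && $1 == "in0:"   { volts = $2 + 0 }
        in_pd && $1 == "curr1:" { amps  = $2 + 0 }
        END { printf "%.2f V x %.2f A = %.1f W\n", volts, amps, volts * amps }
    '
}
```

Run it as `sensors | pd_budget`. With the values posted here that gives 60.0 W for the 3.00 A reading and 45.0 W for the 2.25 A reading.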

It crashed when running:

stress-ng --class=memory --all=0 --timeout=120s --metrics

No interesting output on serial port.

I’ve lowered the maximum frequency to 528 MHz and it’s still crashing.
No kernel stack trace with the latest image.
It just freezes or reboots.
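When the board freezes or reboots without leaving a trace, it can help to record which DDR cap was active before each run. A minimal sketch, with a made-up function name `run_logged` and hypothetical `LOG` and `DMC_DIR` variables: it appends a line to a log file and calls sync before starting the command, so the entry should already be on storage if the system dies.

```shell
# Sketch: run a command while logging the active dmc max_freq around it.
LOG="${LOG:-$HOME/dmc-stress.log}"
DMC_DIR="${DMC_DIR:-/sys/class/devfreq/dmc}"

run_logged() {
    cap=$(cat "$DMC_DIR/max_freq")
    echo "$(date '+%F %T') start max_freq=$cap cmd=$*" >> "$LOG"
    sync                          # flush the log before the risky part
    "$@"
    status=$?
    echo "$(date '+%F %T') end status=$status" >> "$LOG"
    sync
    return "$status"
}
```

For example: `run_logged stress-ng --class=memory --all=0 --timeout=120s --metrics`. A "start" line with no matching "end" line marks a run that ended in a freeze or reboot.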

With the latest Ubuntu server image (rock-5b-ubuntu-focal-server-arm64-20221105-1012-gpt.img), it exhibits the same type of crashes.
See the attached minicom capture file for the output.
minicom-20221106133900.zip (10.4 KB)

@jack Any idea there? I’m willing to try to debug, but the system is completely unusable at the moment.
Most of the time, it does not even get to the Linux login prompt.

Got a new board thanks to @NBA and this one seems to be working fine after a quick stress-ng test…

It seems some boards have issues… Mine worked for three weeks, with some random reboots. After that it stopped booting at all…
Got a response from Radxa support with a suggestion to try maskrom boot, but the board is not detected by my Windows machine… Now I am looking for a way to get a replacement… Where did you file for the RMA? The AllNet China site has no support section…


I was contacted by someone from allnet.cn after complaining loudly on this site.
I first contacted them at their email address without success.
I also emailed allnet.de, but that was no more successful.

I also bought 3 of them, and one of them is very problematic: it crashes randomly. I thought it was an issue with the power supply, as I am using PD power supplies. I got some stability with the Raspberry Pi power supply, but even with that it crashes after an hour or two. I am hoping to get hold of a 12 V 3 A power supply with a Type-C connector, but it’s proving difficult to get locally.

Of the 12 boards I ordered, 11 work and one I can’t get to boot. Two months so far with no success. They came from https://shop.allnetchina.cn/.

Thanks, @icecream95, for pointing in this direction. By setting echo 1068000000 | sudo tee /sys/class/devfreq/dmc/max_freq on one of the boards I have managed to get it running without crashing for 5 hours, which I had not managed on any of the boards before. I have done this on the other 2 as well and will report on how it goes. :crossed_fingers:

Btw I had to give up a bit of performance, but if it works I can live with that: https://browser.geekbench.com/v5/cpu/compare/20044916?baseline=20043344. @jack hopefully this is something that can be fixed in the future.