Bootloader breaking intermittently

I’m booting Armbian from an NVMe drive, using the related boot image from here:

https://wiki.radxa.com/Rock5/install/spi#2.29_Get_RK3588_loader_and_U-Boot_images

This works fine for a while, usually at least a week, but then for no apparent reason the Rock 5B won’t boot anymore. Removing the NVMe drive and checking it on another machine shows everything to be in place, so it’s not an issue with the drive. Reinstalling the bootloader then lets the 5B boot normally with no other changes, so it looks like the bootloader is breaking or becoming corrupt somehow.

Has anyone else experienced this and have any suggestions to stop it happening, or know of an alternative bootloader for NVMe I can try instead?

If you are booting Armbian, I suggest you use the Armbian way to install the bootloader to SPI flash. Use the

armbian-install

command to install the SPI flash bootloader.

Thanks @jack, I’ll give it a go. Does that method boot any differently, or does it give the same end result as flashing in maskrom mode?

I have exactly the same issue. Every week or so I need to reflash the SPI. At first I thought it was the NVMe’s fault, then I realised it’s the SPI flash. Last time it happened after I left my board disconnected from the power supply. Maybe the SPI flash needs a Li-ion battery or the power supply connected all the time. Either way it’s not normal: SPI flash should be non-volatile memory. Will continue observations.

Interesting! Just flashed the SPI, ran sync and checked the md5sum.

Rebooted and checked the md5sum again.

The SPI was corrupted after the first reboot.
Any ideas how to deal with it?
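
A minimal sketch of the check described above, assuming Python 3 is available on the board and that the flash shows up as a readable block device such as /dev/mtdblock0. It hashes only as many bytes of the device as the image contains, on the assumption that the flash may be larger than the image. The function names are illustrative and not part of any Armbian or Radxa tool.

```python
import hashlib
import os


def md5_of_prefix(path, length, chunk_size=1 << 20):
    """Return the MD5 hex digest of the first `length` bytes of `path`."""
    digest = hashlib.md5()
    remaining = length
    with open(path, "rb") as f:
        while remaining > 0:
            chunk = f.read(min(chunk_size, remaining))
            if not chunk:  # file or device is shorter than `length`
                break
            digest.update(chunk)
            remaining -= len(chunk)
    return digest.hexdigest()


def flash_matches_image(device_path, image_path):
    """Compare the image against the same number of bytes read from the device."""
    size = os.path.getsize(image_path)
    return md5_of_prefix(device_path, size) == md5_of_prefix(image_path, size)


# Example (hypothetical invocation, needs read access to the device):
#   flash_matches_image("/dev/mtdblock0", "path/to/spi-image.img")
```

Running the comparison once right after flashing and again after each reboot would show at which point the contents change.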

You can disable the SPI flash node in the device tree to prevent the Linux kernel from writing to /dev/mtdblock0.
Create a file named disable-spi-flash.dts with these contents:

/dts-v1/;
/plugin/;
/ {
        fragment@0 {
                target = <&sfc>;
                __overlay__ {
                        status = "disabled";
                };
        };
};

Run the command sudo armbian-add-overlay disable-spi-flash.dts and reboot.

This will only prevent the Linux kernel from writing to the SPI flash. I think U-Boot itself may also write to it, for example when saving U-Boot environment variables or writing vendor IDs to vendor storage.
Document: https://gitlab.com/rk3588_linux/linux/bsp/docs/-/blob/linux-5.10-gen-rkr3.5/Common/UBOOT/Rockchip_Developer_Guide_UBoot_Nextdev_CN.pdf

Thanks, I will try disabling the SPI node and see what happens in a week.

I’m getting the same issue with the Debian CLI image on NVMe, but reflashing the SPI isn’t helping. I can now only boot from eMMC :(

When I check with md5sum after rebooting:

c72e807fcfd4d1998a91dac803224880 /dev/mtdblock0
46de85de37b8e670883e6f6a8bb95776 rock-5b-spi-image-g49da44e116d.img

I don’t recognise the md5sum either. 2c7ab… is for the zero.img, and 46de8… is for the rock-5b-spi-image-g49da44e116d.img. Where is c72e8… from!?
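
One way to narrow down where an unrecognised checksum comes from is to look at which parts of the flash differ from the image that was written. The sketch below assumes both the flash contents and the image have already been read into memory as bytes (for example from a dump of /dev/mtdblock0). The 4096-byte block size and the function name are arbitrary choices for illustration.

```python
def differing_ranges(a, b, block=4096):
    """Return merged (start, end) byte ranges where `a` and `b` differ.

    Only the overlapping length of the two inputs is compared.
    """
    n = min(len(a), len(b))
    ranges = []
    for start in range(0, n, block):
        end = min(start + block, n)
        if a[start:end] != b[start:end]:
            if ranges and ranges[-1][1] == start:
                # Extend the previous range when blocks are adjacent.
                ranges[-1] = (ranges[-1][0], end)
            else:
                ranges.append((start, end))
    return ranges


# Example (hypothetical file names):
#   with open("mtdblock0.dump", "rb") as f: flash = f.read()
#   with open("spi-image.img", "rb") as f: image = f.read()
#   for start, end in differing_ranges(flash, image):
#       print(f"differs at 0x{start:06x}-0x{end:06x}")
```

If the differences turn out to be confined to a small region rather than spread across the whole image, that would point towards something deliberately writing to the flash rather than random corruption.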

@dnhkng it may be worth mentioning that flashing the SPI often doesn’t work first time for me. I regularly have to do it several times, and then for seemingly no reason one random attempt lets me boot off the NVMe again. Until it breaks again, anyway.

A quick fix is to erase the SPI flash, install a server image with GRUB on microSD and direct the boot to the NVMe drive.
The main question: is the SPI flash corruption a hardware or a software problem? If something is wrong with the U-Boot we are flashing, then it’s probably a matter of time before the developers fix it. If it’s a hardware problem, we are in trouble :)

Which power supply do you use, with or without PD?

With PD it’s a regular Radxa power supply. Without PD it’s 12V 3A via the USB Type-C connector.

The bootloader is trash. I spent 2 weeks looking at it. Set a cron job to reboot every 3-4 minutes and watch it die. It’s not a power issue. It’s only an issue when booting from NVMe; microSD works. Booting from NVMe is a no-go.

I’ve ended up erasing the SPI and installing Armbian on an SD card purely to have an environment to boot into. From that SD card install I updated armbianEnv.txt in the boot partition (/boot/armbianEnv.txt) to point at the /dev/… second partition of the NVMe drive as the root device. This boots to the NVMe fine now, and with the SPI empty I’m hoping it bypasses the original issue completely.

I’ll give it a few weeks and see if this appears to have resolved the problem.
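
For anyone wanting to script that armbianEnv.txt change, here is a rough sketch. It assumes the file is a plain list of key=value lines and that the root device is set through a key named rootdev; both are assumptions to check against your own /boot/armbianEnv.txt. The helper name is made up for this example, and the value to write (a /dev/… path or a UUID= entry) has to come from your own system.

```python
def set_env_key(text, key, value):
    """Return `text` with `key=value` replacing any existing line for `key`.

    If the key is not present, a new line is appended.
    """
    lines = text.splitlines()
    new_line = f"{key}={value}"
    replaced = False
    for i, line in enumerate(lines):
        # Split on the first "=" only, since values may contain "=" themselves.
        if line.split("=", 1)[0].strip() == key:
            lines[i] = new_line
            replaced = True
    if not replaced:
        lines.append(new_line)
    return "\n".join(lines) + "\n"


# Example (hypothetical value, replace with your NVMe root partition):
#   with open("/boot/armbianEnv.txt") as f: old = f.read()
#   new = set_env_key(old, "rootdev", "<your-nvme-root-partition>")
#   with open("/boot/armbianEnv.txt", "w") as f: f.write(new)
```

Keeping a backup copy of the original file before writing is a sensible precaution, since a wrong root device leaves the SD card install unbootable too.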

I admit that there may be problems with specific Rock 5B units (an SPI\MTD hardware problem), but among those using a power supply without PD, the correct U-Boot loader in SPI\MTD and the NVMe systems from this topic, no one has complained about such a problem. My unit has been running stably for several months (U-Boot in SPI\MTD and the system on NVMe).

Please leave the SPI flash unused: write some random data to it and save the md5. Check later whether the md5 has changed, to identify whether it’s a hardware issue.
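
A possible way to run that test, sketched in Python. It generates a random pattern file and records its size and md5, and later compares the same number of bytes read back from the flash. How the pattern gets written to the SPI flash is left to whatever flashing method you already use. The function names, the JSON record file and the 16 MiB size in the example are assumptions for illustration only.

```python
import hashlib
import json
import os


def make_test_pattern(image_path, size_bytes, record_path):
    """Write `size_bytes` of random data to `image_path` and record its md5."""
    data = os.urandom(size_bytes)
    with open(image_path, "wb") as f:
        f.write(data)
    md5 = hashlib.md5(data).hexdigest()
    with open(record_path, "w") as f:
        json.dump({"size": size_bytes, "md5": md5}, f)
    return md5


def pattern_still_intact(device_path, record_path):
    """Re-read the recorded number of bytes and compare against the saved md5."""
    with open(record_path) as f:
        record = json.load(f)
    with open(device_path, "rb") as f:
        data = f.read(record["size"])
    return hashlib.md5(data).hexdigest() == record["md5"]


# Example (hypothetical paths, 16 MiB is an assumed flash size):
#   make_test_pattern("random.img", 16 * 1024 * 1024, "random.json")
#   ... flash random.img to the SPI, reboot a few times ...
#   pattern_still_intact("/dev/mtdblock0", "random.json")
```

A random pattern has one advantage over an all-zero image: no bootloader or kernel code will mistake it for valid data, so any change is more clearly down to something writing to the flash or to the flash itself.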

I flashed zero.img. After many reboots the md5sum is still the same. It’s not a hardware problem.

The problem first occurred with the Balbes-150 bootloader and image, and then with the regular Armbian SPI bootloader and the balbes-150 jammy legacy image.

Currently I have an interesting situation: the SPI flash is erased, but the board ignores microSD and mmc. The contents of both microSD and mmc are visible, but boot starts from the NVMe drive only. I need to try disconnecting the NVMe drive; I will do it later, as the board is packed in a case.

Do you have eMMC connected?

Just so you know, I use the official power supply, and this behaviour shows up on all three 5Bs I bought to test with. I was looking to replace the Pi 4B deployments we have and was super excited about an NVMe boot option. All of our systems get deployed in remote locations, so they have to reboot without intervention. The 5B works flawlessly on a microSD card, just not with SPI boot to NVMe.