NVMe Boot Loop (Power)

Regardless of the u-boot build (I tried the release and debug images, and also built a fresh one),
the Rock5b fails to boot from SPI+NVMe SSD most of the time. The problem appears to be in the power circuit.

DDR Version V1.08 20220617
LPDDR4X, 2112MHz
...
INFO:    BL31: Initialising Exception Handling Framework
INFO:    BL31: Initializing runtime services
WARNING: No OPTEE provided by BL2 boot loader, Booting device without OPTEE initialization. SMC`s destined for OPTEE will return SMC_UNK
ERROR:   Error initializing runtime service opteed_fast
INFO:    BL31: Preparing for EL3 exit to normal world
INFO:    Entry point address = 0x200000
INFO:    SPSR = 0x3c9

DDR Version V1.08 20220617
LPDDR4X, 2112MHz
channel[0] BW=16 Col=10 Bk=8 CS0 Row=17 CS1 Row=17 CS=2 Die BW=8 Size=4096MB
channel[1] BW=16 Col=10 Bk=8 CS0 Row=17 CS1 Row=17 CS=2 Die BW=8 Size=4096MB
channel[2] BW=16 Col=10 Bk=8 CS0 Row=17 CS1 Row=17 CS=2 Die BW=8 Size=4096MB
channel[3] BW=16 Col=10 Bk=8 CS0 Row=17 CS1 Row=17 CS=2 Die BW=8 Size=4096MB
Manufacturer ID:0x6 

The SSD works if:

  • I power the board directly through GPIO pins 2 and 4, with the voltage boosted to 5.2V (anything less didn’t work).
  • The kernel boots from the SD card (probably due to some delayed initialization).
  • … and occasionally it boots as expected.

After booting on USB power, sensors shows a 20V input, so once it is up the kernel handles PD chargers properly.

My assumption is:
u-boot tries to initialize PCIe and power on the SSD before configuring the PD voltage.
Something happens when the kernel changes the voltage level, and it loses power.

How can I fix it?

I tried to cherry-pick this commit, with changed PDOs, like

sink-pdos =
				<PDO_FIXED(5000, 3000, PDO_FIXED_USB_COMM)
				 PDO_VAR(5000, 20000, 5000)>;
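
For reference, a minimal Python sketch of how these two entries pack into 32-bit PDO words. It assumes the bit layout of the PDO_* macros in the Linux header include/dt-bindings/usb/pd.h (written from memory, so check it against your own tree). The helper names pdo_fixed and pdo_var are made up for illustration.

```python
# Assumed layout, mirroring the PDO_* macros in include/dt-bindings/usb/pd.h.
PDO_TYPE_FIXED = 0
PDO_TYPE_VAR = 2
PDO_TYPE_SHIFT = 30
PDO_FIXED_USB_COMM = 1 << 26
FIELD_MASK = 0x3FF  # voltage and current fields are 10 bits wide


def pdo_fixed(mv, ma, flags=0):
    """Fixed supply: voltage in 50 mV units (bits 10-19), current in 10 mA units (bits 0-9)."""
    return ((PDO_TYPE_FIXED << PDO_TYPE_SHIFT) | flags
            | (((mv // 50) & FIELD_MASK) << 10)
            | ((ma // 10) & FIELD_MASK))


def pdo_var(min_mv, max_mv, max_ma):
    """Variable supply: min voltage in bits 10-19, max voltage in bits 20-29, max current in bits 0-9."""
    return ((PDO_TYPE_VAR << PDO_TYPE_SHIFT)
            | (((min_mv // 50) & FIELD_MASK) << 10)
            | (((max_mv // 50) & FIELD_MASK) << 20)
            | ((max_ma // 10) & FIELD_MASK))


print(hex(pdo_fixed(5000, 3000, PDO_FIXED_USB_COMM)))  # 0x401912c
print(hex(pdo_var(5000, 20000, 5000)))                 # 0x990191f4
```

Note that each field is only 10 bits wide, so a current above 10230 mA or a voltage above 51150 mV would be silently truncated by the mask.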

I rebuilt u-boot and flashed it to SPI.

It didn’t help.

What PD power adapter are you using?

Update:

  1. Back to NVMe boot. I enabled u-boot debug. I can enumerate PCIe, check the NVMe device info, etc.
=> pci enumerate
invalid flags type!
=> pci enumerate
=> pci scan     
Scanning PCI devices on bus 0
BusDevFun  VendorId   DeviceId   Device Class       Sub-Class
_____________________________________________________________
00.00.00   0x1d87     0x3588     Bridge device           0x04

I can also retrieve the NVMe details

=> nvme scan
=> nvme info
Device 0: Vendor: 0x15b7 Rev: 711240WD Prod: 21440T800664
            Type: Hard Disk
            Capacity: 953869.7 MB = 931.5 GB (1953525168 x 512)

I can check partitions

=> part list nvme 0

Partition Map for NVMe device 0  --   Partition Type: EFI

Part    Start LBA       End LBA         Name
        Attributes
        Type GUID
        Partition GUID
  1     0x00008000      0x00107fff      "boot"
        attrs:  0x0000000000000000
        type:   c12a7328-f81f-11d2-ba4b-00a0c93ec93b
        guid:   b602c2bd-f201-4323-bdc0-c8633808d063
  2     0x00108000      0x7470660f      "rootfs"
        attrs:  0x0000000000000000
        type:   0fc63daf-8483-4772-8e79-3d69d8477de4
        guid:   c171e7de-8299-4a64-9250-3206b6768a2e

but if I try to boot, it shuts down and reboots

=> setenv devtype nvme
=> setenv devnum 0    
=> run nvme_boot 0 
Device 0: Vendor: 0x15b7 Rev: 711240WD Prod: 21440T800664
            Type: Hard Disk
            Capacity: 953869.7 MB = 931.5 GB (1953525168 x 512)
... is now current device
Scanning nvme 0:1...
Found /extlinux/extlinux.conf
Retrieving file: /extlinux/extlinux.conf
reading /extlinux/extlinux.conf
1968 bytes read in 0 ms
1:      kernel-5.10.66-27-rockchip-gea60d388902d
Retrieving file: /initrd.img-5.10.66-27-rockchip-gea60d388902d
reading /initrd.img-5.10.66-27-rockchip-gea60d388902d
9329636 bytes read in 9 ms (988.6 MiB/s)
Retrieving file: /vmlinuz-5.10.66-27-rockchip-gea60d388902d
reading /vmlinuz-5.10.66-27-rockchip-gea60d388902d
29635072 bytes read in 26 ms (1.1 GiB/s)
append: root=UUID=a4b2a864-b934-40b1-bceb-cd39f0997628 earlycon=uart8250,mmio32,0xfeb50000 console=ttyFIQ0 console=tty1 consoleblank=0 loglevel=0 panic=10 rootwait rw init=/sbin/init rootfstype=ext4 cgroup_enable=cpuset cgroup_memory=1 cgroup_enable=memory swapaccount=1 irqchip.gicv3_pseudo_nmi=0 switolb=1 coherent_pool=2M
Retrieving file: /dtbs/5.10.66-27-rockchip-gea60d388902d/rockchip/rk3588-rock-5b.dtb
reading /dtbs/5.10.66-27-rockchip-gea60d388902d/rockchip/rk3588-rock-5b.dtb
251760 bytes read in 2 ms (120 MiB/s)
Retrieving file: /dtbs/5.10.66-27-rockchip-gea60d388902d/rockchip/overlay/rk3588-uart7-m2.dtbo
reading /dtbs/5.10.66-27-rockchip-gea60d388902d/rockchip/overlay/rk3588-uart7-m2.dtbo
311 bytes read in 1 ms (303.7 KiB/s)
Fdt Ramdisk skip relocation
No misc partition
## Flattened Device Tree blob at 0x0a100000
   Booting using the fdt blob at 0x0a100000
   reserving fdt memory region: addr=a100000 size=40000
  'reserved-memory' cma: addr=10000000 size=10000000
   Using Device Tree in place at 000000000a100000, end 000000000a142fff
Adding bank: 0x00200000 - 0xf0000000 (size: 0xefe00000)
Adding bank: 0x100000000 - 0x3fc000000 (size: 0x2fc000000)
Adding bank: 0x3fc500000 - 0x3fff00000 (size: 0x03a00000)
Total: 117407.197 ms

Starting kernel ...

[  117.425687] Booting Linux on physical CPU 0x0000000000 [0x412fd050]
[  117.425709] Linux version 5.10.66-27-rockchip-gea60d388902d (stephen@lara) (gcc (Ubuntu/Linaro 7.5.0-6ubuntu2) 7.5.0, GNU ld (GNU Binutils for Ubuntu) 2.34) #rockchip SMP Mon Oct 24 08:25:47 UTC 2022
[  117.435116] Machine model: Radxa ROCK 5B
[  117.435155] earlycon: uart8250 at MMIO32 0x00000000feb50000 (options '')
[  117.438292] printk: bootconsole [uart8250] enabled

and then it reboots

Several different Xiaomi chargers (an old 65W laptop charger, a 2-port GaN 65W, a 1-port GaN 65W).
All of them provide 5V/3A and up to 20V/3.25A.
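
To put those two profiles side by side, here is a small Python sketch of the power available before and after PD negotiation. It is arithmetic only. Whether the 5V budget is actually too small for the board plus SSD is the hypothesis of this thread, not something measured here.

```python
def watts(volts, amps):
    """Power available from a supply profile, P = V * I."""
    return volts * amps


# The two profiles the chargers advertise, as quoted above.
profiles = {"5V/3A (before PD negotiation)": (5.0, 3.0),
            "20V/3.25A (after PD negotiation)": (20.0, 3.25)}

for name, (v, a) in profiles.items():
    print(f"{name}: {watts(v, a):.1f} W")
```

That is 15 W at the default 5V profile versus 65 W once 20V has been negotiated.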

But the same stupid thing happens if I power it through the GPIO pins with voltages lower than 5.2-5.3V.

Seems like a problem in the kernel. Enabling debug :confused:

These are the last logs before the reboot. Full log: pastebin

[    6.180269] rk-pcie fe190000.pcie: PCIe Link up, LTSSM is 0x130011
[    6.180647] rk-pcie fe190000.pcie: PCI host bridge to bus 0004:40
[    6.180696] pci_bus 0004:40: root bus resource [bus 40-4f]
[    6.180734] pci_bus 0004:40: root bus resource [??? 0xf4000000-0xf40fffff flags 0x0]
[    6.180783] pci_bus 0004:40: root bus resource [io  0x200000-0x2fffff] (bus address [0xf4100000-0xf41fffff])
[    6.180831] pci_bus 0004:40: root bus resource [mem 0xf4200000-0xf4ffffff]
[    6.180869] pci_bus 0004:40: root bus resource [mem 0xa00000000-0xa3fffffff pref]
[    6.180975] pci 0004:40:00.0: [1d87:3588] type 01 class 0x060400
[    6.181044] pci 0004:40:00.0: reg 0x38: [mem 0x00000000-0x0000ffff pref]
[    6.181177] pci 0004:40:00.0: supports D1 D2
[    6.181209] pci 0004:40:00.0: PME# supported from D0 D1 D3hot
[    6.195068] pci 0004:40:00.0: Primary bus is hard wired to 0
[    6.195109] pci 0004:40:00.0: bridge configuration invalid ([bus 01-ff]), reconfiguring
[    6.195331] pci 0004:41:00.0: [10ec:8125] type 00 class 0x020000
[    6.195398] pci 0004:41:00.0: reg 0x10: [io  0x0000-0x00ff]
[    6.195467] pci 0004:41:00.0: reg 0x18: [mem 0x00000000-0x0000ffff 64bit]
[    6.195520] pci 0004:41:00.0: reg 0x20: [mem 0x00000000-0x00003fff 64bit]
[    6.195891] pci 0004:41:00.0: supports D1 D2
[    6.195904] pci 0004:41:00.0: PME# supported from D0 D1 D2 D3hot D3cold
[    6.213551] pci_bus 0004:41: busn_res: [bus 41-4f] end is updated to 41
[    6.213826] pci 0004:40:00.0: BAR 8: assigned [mem 0xf4200000-0xf42fffff]
[    6.213853] pci 0004:40:00.0: BAR 6: assigned [mem 0xf4300000-0xf430ffff pref]
[    6.213880] pci 0004:40:00.0: BAR 7: assigned [io  0x200000-0x200fff]
[    6.213964] pci 0004:41:00.0: BAR 2: assigned [mem 0xf4200000-0xf420ffff 64bit]
[    6.214051] pci 0004:41:00.0: BAR 4: assigned [mem 0xf4210000-0xf4213fff 64bit]
[    6.214103] pci 0004:

I have observed a similar thing when booting from an Intel Optane. u-boot loads the kernel from NVMe, but during Linux kernel init the board reboots. I had a situation where this went on for quite some time, but after a few power-down and power-up cycles I usually get only one reboot and then the kernel loads fine. It looks almost like consumption peaks during the initial boot that are not there, or not as bad, during the second boot attempt ¯\_(ツ)_/¯. I haven’t found a specific point in the kernel boot logs where the system resets. By now it ‘kinda works’, as in: I power the board on and it boots quite consistently, with maybe one quick reboot on the way.
I’m using a patched u-boot (for 2x2 PCIe3) from here: How to build u-boot spi images

If you want to enable PD negotiation, you also have to add the charge_pd command to the u-boot script. If PD negotiation starts more than 5 seconds after power is plugged in, the PD supply does a hard reset, which causes the boot loop.

Do you have a voltage monitor/meter? If so, you can check whether a higher voltage gets negotiated during boot.

I assume you are referring to this command, but tbh I don’t know how to use/enable it.

Command is not available. :confused:

UPD:
Looking at Makefile

cmd/Makefile:obj-$(CONFIG_CMD_CHARGE_DISPLAY) += charge.o

I need to define CONFIG_CMD_CHARGE_DISPLAY, and since it depends on DM_CHARGE_DISPLAY, I’m adding the following to ./u-boot/configs/rock-5b-rk3588_defconfig

CONFIG_CMD_CHARGE_DISPLAY=y
CONFIG_DM_CHARGE_DISPLAY=y

Now, I need to figure out how to use it:

=> charge_pd
ret = -96 (dev => 0000000000000000)
=> charge_pd 20000 5000        
charge_pd - Charge select

Usage:
charge_pd -charge_pd
-charge_pd <voltage wanted> <current wanted> 

ret -96 is the error code returned by uclass_get_device.

According to the errno table, 96 is the “Protocol family not supported” error (EPFNOSUPPORT). It is the same regardless of the charger.
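
A quick way to look up such a return code on a Linux host is Python’s errno module. This is only a sketch, and it assumes u-boot’s errno numbering matches the host’s Linux numbering; on other operating systems 96 can map to a different name. The helper name describe is made up for illustration.

```python
import errno
import os


def describe(ret):
    """Map a (possibly negative) return code to its errno name and message."""
    code = -ret if ret < 0 else ret
    name = errno.errorcode.get(code, "unknown")
    return f"{name}: {os.strerror(code)}"


# u-boot returns negative errno values, so -96 means errno 96.
print(describe(-96))
```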

Just run it like any other u-boot command. On Armbian I add it to boot.cmd and then compile that file to boot.scr.

It seems like the Power Delivery work wasn’t finished for u-boot. Even if uclass_get_device returned a device handle, there is no code behind it, just a printf

ret = uclass_get_device(UCLASS_PD, 0, &dev);
	printf("ret = %d (dev => %p)\n", ret, dev);

	return 0;

@Stephen is it planned to fix this anytime soon, or should I rather buy a few dumb power supplies?