Experience report with BTRFS and boot from HAT

Hi there, I have a problem setting up Ubuntu Server 22.04 on my Raspberry. The boot partition is on a USB stick. A SATA SSD is connected to the first SATA connector of the HAT and holds the root file system inside a LUKS container. When the Raspberry boots, it asks for the password. After I enter it, everything seems fine at first. After a while, however, failures appear and it ends in log spam like this:
BTRFS error (device dm-0): bdev /dev/mapper/crypt_raspi errs: wr 6, rd 170, flush 0, corrupt 0, gen 0
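
A note on reading that line: as far as I understand it, the five counters are cumulative per-device error counts (write I/O errors, read I/O errors, flush errors, checksum corruption, generation mismatches). Non-zero wr/rd with corrupt at 0 would point more towards the I/O path than towards damaged data. The following is only a minimal sketch for pulling the counters out of such a line; the function name parse_btrfs_errs is made up for illustration.

```python
import re

# Matches the counter part of a kernel line such as:
# "BTRFS error (device dm-0): bdev /dev/mapper/x errs: wr 6, rd 170, flush 0, corrupt 0, gen 0"
_ERRS = re.compile(
    r"bdev (?P<device>\S+) errs: "
    r"wr (?P<write>\d+), rd (?P<read>\d+), flush (?P<flush>\d+), "
    r"corrupt (?P<corrupt>\d+), gen (?P<gen>\d+)"
)


def parse_btrfs_errs(line):
    """Return the device name and its error counters, or None if the line does not match."""
    match = _ERRS.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    device = fields.pop("device")
    counters = {name: int(value) for name, value in fields.items()}
    return {"device": device, **counters}


if __name__ == "__main__":
    sample = ("BTRFS error (device dm-0): bdev /dev/mapper/crypt_raspi "
              "errs: wr 6, rd 170, flush 0, corrupt 0, gen 0")
    print(parse_btrfs_errs(sample))
```

Feeding it the output of dmesg line by line would show whether the read and write counters keep climbing during boot.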

Any ideas how to solve it? Is the HAT capable of handling a setup like this? Tomorrow I would like to try this setup with EXT4, but I believe it will result in a similar experience.

Thank you!

I can't believe I did it. I spent a whole week on this problem. I'm writing down my experience so that others can learn from it, and as a reminder for my future self.

Initial goal:
Connect SATA HAT to Raspberry Pi. Connect SSD to SATA HAT and boot from it.

Challenge 1:
I came to the conclusion that it is not possible to boot directly from the SSD, because the controller of the HAT must be activated first, and that is only possible if, for example, the two GPIOs are set in config.txt.

Approach 1:
I went a more complicated way. A USB stick connected to the USB 2.0 port of the Raspberry would contain the boot partition. As soon as the HAT was activated at startup, the root partition on the SSD would be mounted.
I followed this tutorial here¹ and prepared the USB stick via Raspberry Pi Imager (Step 1). I skipped Step 2 and partitioned² my SSD in Step 3. I then connected the SSD to the HAT. I booted from the USB stick (Step 4) and updated the OS [Ubuntu Server] (Step 5). At this point I already added the activation of the GPIOs to config.txt³ and additionally installed the SATA-Lite⁴ service of the HAT. I also installed and set up Dropbear*⁵ ahead of time (Step 9). Then, back on the computer, I created the subvolumes on the SSD and copied partition 2 of the USB stick to the SSD with rsync (Step 6). I created a chroot environment and changed fstab etc. in order to use the SSD as my root partition (Step 7-8). Then I left the chroot environment and rebooted. As planned, Dropbear appeared, and I was able to unlock the LUKS partition of the SSD.
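
On the alignment issue from footnote ²: a quick sanity check is whether the first byte of a partition falls on a 1 MiB boundary. The 1 MiB boundary and the 512-byte sector size below are common defaults that I am assuming here, not values taken from my setup, and the function name is_aligned is only illustrative.

```python
MIB = 1024 * 1024


def is_aligned(start_sector, sector_size=512, boundary=MIB):
    """Return True if the partition's first byte is a multiple of `boundary`.

    start_sector: first sector of the partition as shown by the partitioning tool
    sector_size:  bytes per sector (assumed 512 unless the tool reports otherwise)
    boundary:     alignment boundary in bytes (assumed 1 MiB)
    """
    return (start_sector * sector_size) % boundary == 0


if __name__ == "__main__":
    # Sector 2048 with 512-byte sectors starts exactly at 1 MiB.
    print(is_aligned(2048))
    # Sector 63 is the classic misaligned start of old partition tables.
    print(is_aligned(63))
```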

Failure 1:
The problem started after unlocking the partition. First the boot messages were green, then they turned red, and in the end the terminal showed nothing but BTRFS errors. I could not get into the system.

Failure 2-n
I tried it with an EXT4 partition, with SATA-Lite and without. Again and again I restored drive images so that I could repeat the painful procedure at least a little faster. In the end, it always crashed during the boot process. Interestingly, there was no problem at all when the SSD was connected to the Raspberry via USB-to-SATA and both boot and root were on the drive. I therefore ruled out a defect of the SSD. btrfs check etc. did not bring any findings either. Unfortunately, troubleshooting was impossible because the errors were not written to the log files.

New goal
Activate the HAT when starting the Raspberry and boot from the SSD.
Certainty of death. Small chance of success. What are we waiting for?

Approach 2
Using the imager, I loaded Raspberry Pi OS onto another USB stick and booted it. I updated*⁶ the bootloader to the April release of this year with sudo rpi-eeprom-update -d -a. After a reboot, I edited the EEPROM configuration to set the two GPIOs that start the HAT: sudo -E rpi-eeprom-config --edit

[config.txt]
gpio=25,25=op,dh
gpio=26,26=op,dh
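
To show how I read these lines: the part between the two equals signs is a list of pins, and op,dh sets them as outputs driving high. Here is a minimal sketch of that reading in Python. The helper parse_gpio_directive is hypothetical, and the range handling is my understanding of the config.txt syntax, not something I tested on the bootloader.

```python
def parse_gpio_directive(line):
    """Split a 'gpio=<pins>=<options>' line into (sorted unique pins, options)."""
    key, _, rest = line.strip().partition("=")
    if key != "gpio" or "=" not in rest:
        raise ValueError(f"not a gpio directive: {line!r}")
    pins_part, _, options_part = rest.partition("=")

    pins = set()
    for item in pins_part.split(","):
        if "-" in item:
            # A range such as 17-21 covers both ends.
            low, high = item.split("-")
            pins.update(range(int(low), int(high) + 1))
        else:
            pins.add(int(item))
    return sorted(pins), options_part.split(",")


if __name__ == "__main__":
    for directive in ("gpio=25,25=op,dh", "gpio=26,26=op,dh", "gpio=25,26=op,dh"):
        print(directive, "->", parse_gpio_directive(directive))
```

Read this way, 25,25 names the same pin twice, and the two lines together cover the same pins as the single config.txt line from footnote ³.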

I am convinced that the number-comma-number form is completely unnecessary and that the number alone would do. But never touch a running system. Afterwards, I switched the Raspberry off again. From then on, you could see that the HAT was active immediately after the Raspberry was switched on.
Now I connected the SSD to the computer via USB-to-SATA and selected the SSD in the imager. I did not want to touch the USB stick again, so this time the imager wrote directly to the SSD. Before ejecting the SSD, I followed Step 2 of the tutorial and then connected the SSD to the HAT. The Raspberry booted from the SSD, and I was in. Since I still had a slight suspicion at this point that Dropbear might be to blame for all the suffering so far, I skipped that part of the tutorial for now. Unlocking the LUKS partition at boot is still possible with a keyboard. Attention: if you are not a native English speaker, typing passwords with an English keyboard layout can be hell! Therefore, I set up my language and keyboard before Step 6*⁷. After I had finished the tutorial, still without Dropbear, came the reboot. The LUKS partition unlocked… and? It worked! It started! What a joy, what a pleasure! Hooray!
After this I installed Dropbear, and no, Dropbear was not to blame for any of this. It installed without any problems, unlocked my partition via SSH, and the system booted.

Conclusion
I read somewhere, in the context of Raspberry and SD cards, that mirroring (copying, e.g. via rsync) does not give the result you would expect. And yes, this person is right. In my opinion, what I set up in Approach 1 is more or less exactly what I set up in Approach 2. And yet it is still fundamentally different… By the way, in Approach 2 I did not use SATA-Lite, nor any customization of config or quirks.

Final words
I hope it will help someone out there, and I am eager to hear your opinion. For me, this is a great collection of knowledge for next time*.
*No, it's enough for now!

*¹ https://mutschler.dev/linux/raspi-btrfs/#step-9-optional-remote-unlocking-using-dropbear-ssh
*² the alignment issue drove me crazy, both with my USB-to-SATA controller and with the HAT itself, which has the same problem. Read more: https://linux-blog.anracom.com/2018/12/03/linux-ssd-partition-alignment-problems-with-external-usb-to-sata-controllers-i/
*³ gpio=25,26=op,dh
*⁴ https://forum.radxa.com/t/lightweight-installation-sata-hat-only-no-cpu-fan-no-lcd-no-buttons
*⁵ the path of Dropbear has changed, which was hard to find out in my research: https://blog.gradiian.io/migrating-to-cockpit-part-i/
*⁶ https://www.blattertech.ch/2022/04/eeprom-update-fuer-den-raspberry-pi-4/
*⁷ https://mutschler.dev/linux/raspi-post-install/
