Broken RAID10 on a Quad Sata Hat for Raspberry Pi

Hello,

I have been using the Quad SATA HAT with a RAID10 setup for quite some time. There are 4 identical hard drives in use, and so far everything has worked without problems. Now I have noticed that one hard drive is no longer part of the RAID10. I have not seen any other problems so far.

Since I am not a Linux guru, I would like to ask for some help in fixing the problem.

The hard drive sdb, which is no longer part of the RAID10, is not defective, however. I cannot explain why it suddenly stopped being included.

I am not sure anymore how I set up the RAID10 back then: with mdadm, with the jms56x RAID controller console app, or with both.

Here is some output from the jms56x RAID controller console app:

JMS56X>GC

--> Total available JMS56X controllers = 2 <--

JMS56X>DC C0

Controller[0]
-- ChipID = 10
-- SerialNumber = 427491329
-- SuperUserPwd = ▒▒▒▒▒▒▒▒
-- Sata[0]
------ ModelName = ST2000LM015-2E8174
------ SerialNumber = WDZR8PTA
------ FirmwareVer = 0001
------ Capacity = 1863 GB
------ PortType = Hard Disk
------ PortSpeed = Gen 3
------ Page0State = Hooked to PM
------ Page0RaidIdx = 0
------ Page0MbrIdx = 0
-- Sata[1]
------ ModelName = ST2000LM015-2E8174
------ SerialNumber = WDZR8QKT
------ FirmwareVer = 0001
------ Capacity = 1863 GB
------ PortType = Hard Disk
------ PortSpeed = Gen 3
------ Page0State = Hooked to PM
------ Page0RaidIdx = 0
------ Page0MbrIdx = 0

JMS56X>DC C1

Controller[1]
-- ChipID = 11
-- SerialNumber = 427491329
-- SuperUserPwd = ▒▒▒▒▒▒▒▒
-- Sata[0]
------ ModelName = ST2000LM015-2E8174
------ SerialNumber = WDZR5GQF
------ FirmwareVer = 0001
------ Capacity = 1863 GB
------ PortType = Hard Disk
------ PortSpeed = Gen 3
------ Page0State = Hooked to PM
------ Page0RaidIdx = 0
------ Page0MbrIdx = 0
-- Sata[1]
------ ModelName = ST2000LM015-2E8174
------ SerialNumber = WDZR5NAX
------ FirmwareVer = 0001
------ Capacity = 1863 GB
------ PortType = Hard Disk
------ PortSpeed = Gen 3
------ Page0State = Hooked to PM
------ Page0RaidIdx = 0
------ Page0MbrIdx = 0

root@rpi-nas:~# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 1.8T 0 disk
└─sda1 8:1 0 1.8T 0 part
└─md0 9:0 0 3.6T 0 raid10 /srv/dev-disk-by-uuid-ea0838f8-5efe-44a3-8cf7-40fd60a73de0
sdb 8:16 0 1.8T 0 disk
└─sdb1 8:17 0 1.8T 0 part
sdc 8:32 0 1.8T 0 disk
└─sdc1 8:33 0 1.8T 0 part
└─md0 9:0 0 3.6T 0 raid10 /srv/dev-disk-by-uuid-ea0838f8-5efe-44a3-8cf7-40fd60a73de0
sdd 8:48 0 1.8T 0 disk
└─sdd1 8:49 0 1.8T 0 part
└─md0 9:0 0 3.6T 0 raid10 /srv/dev-disk-by-uuid-ea0838f8-5efe-44a3-8cf7-40fd60a73de0
mmcblk0 179:0 0 29.7G 0 disk
├─mmcblk0p1 179:1 0 256M 0 part /boot
└─mmcblk0p2 179:2 0 29.5G 0 part /

Of course, it would be great if I could mount the hard drive again without losing any data.

Here are a few more screenshots from Openmediavault:

Best regards, Carsten

According to your info you have one software RAID managed by mdadm; a hardware RAID would instead be visible in the system as a virtual H/W device. I was not able to create a stable array with mdadm and these controllers, but two hardware RAIDs are working for me with UAS.

Get the status of the MD array:

cat /proc/mdstat
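In a healthy 4-disk RAID10, /proc/mdstat shows [4/4] [UUUU] for the array; a degraded one shows something like [4/3] with an underscore in the status brackets. As a sketch (the sample below is hypothetical output, not captured from this system), spotting a degraded array could look like:

```shell
# Illustrative /proc/mdstat excerpt for a degraded 4-disk RAID10
# (hypothetical sample, not captured from the system above):
sample='md0 : active raid10 sdd1[3] sdc1[2] sda1[0]
      3906764800 blocks super 1.2 512K chunks 2 near-copies [4/3] [U_UU]'

# An underscore inside the status brackets marks a missing member slot.
if printf '%s\n' "$sample" | grep -q '\[U*_[U_]*\]'; then
    echo "array is degraded"
else
    echo "array is healthy"
fi
```

On the real system, run `cat /proc/mdstat` and look for the same bracket pattern.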

Then just re-add this drive to the md0 array:

mdadm --stop /dev/md0
mdadm --assemble --force --verbose /dev/md0 /dev/sd[abcd]1
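Before a forced assembly it can be worth checking why the member dropped out: `mdadm --examine /dev/sdX1 | grep Events` prints a per-member event counter, and a stale drive shows a lower value than the others. A minimal sketch with hypothetical counter values (the real ones must be read from each member):

```shell
# Hypothetical event counters; on the real system collect them with
#   mdadm --examine /dev/sdX1 | grep Events
events_sda1=18421
events_sdb1=17995   # lower counter: this member fell out of the array earlier
events_sdc1=18421
events_sdd1=18421

# Find the newest counter among the members.
newest=$events_sda1
for e in "$events_sdb1" "$events_sdc1" "$events_sdd1"; do
    if [ "$e" -gt "$newest" ]; then
        newest=$e
    fi
done

# Any member behind the newest counter is the stale one.
for m in sda1 sdb1 sdc1 sdd1; do
    eval "e=\$events_$m"
    if [ "$e" -lt "$newest" ]; then
        echo "$m looks stale (Events=$e)"
    fi
done
```

A small gap in the counters is what --force is meant to bridge; a large gap means the stale member is far behind and will need a full resync anyway.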

If it says that the array is read-only, then do:

mdadm /dev/md0 -a /dev/sdb1
mdadm --readwrite /dev/md0

Then check whether the resync has finished:

watch -n 1 -d "cat /proc/mdstat && lsblk"


First of all, thank you for the answer. As I wrote in my first post, I am not sure anymore how I created the RAID10. Possibly I created two RAID0 arrays with the controller software and then combined the two RAID0s into a RAID1 with mdadm, to get a quasi-RAID10.

Anyway, your answer unfortunately came a little too late to test it now, because I got a little restless, deleted the RAID10, and set up the 4 drives again using Openmediavault and BTRFS.

It works, but presumably it is no longer a RAID array?

In OMV I selected BTRFS, was able to select the 4 drives (sda-sdd), and now a drive appears with twice the size of a single drive. At first glance it works.
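"Twice the size of a single drive" matches the usual RAID10 arithmetic: with 2-way mirroring, usable capacity is (number of drives / 2) x drive size, so 4 x 1.8 TiB yields about 3.6 TiB. As a quick sanity check:

```shell
# RAID10 with 2-way mirroring: usable = (drives / 2) * size per drive.
# Sizes in tenths of a TiB to stay in integer arithmetic.
drives=4
size_tenths=18          # 1.8 TiB per drive
usable=$(( drives / 2 * size_tenths ))
echo "usable: $(( usable / 10 )).$(( usable % 10 )) TiB"   # prints: usable: 3.6 TiB
```

That is also exactly the 3.6T that lsblk reported for the old md0 device.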

If I should still decide to set up a classic RAID10 at some point, could you describe again how I would best proceed?

Thanks!

According to the screenshots you had a software-only array with one missing drive. Take a look at the drives view - you would see the H/W RAID drives there.

I have not tested BTRFS yet, so I can't say anything about it. There are a few other ideas for how to connect everything with some redundancy and speed, as well as low CPU usage during high-speed transfers. Power saving and reporting are also important - you don't want to find out about a damaged drive only through low transfer rates.

I had a RAID6 volume across all drives, but that was not stable and caused high CPU usage. Transfers were also rather low. Eventually I decided to create two hardware RAID volumes; even if they get disconnected, they come back without any issues. So far nothing is disconnecting and CPU usage is low. The only thing I still need to work on is probably SMART reporting. Of course it's not the only solution, and maybe yours is even better.

Whether creating the drive in OMV with BTRFS alone produced this construct, or whether earlier configuration was also involved, I seem to have ended up with a RAID10 on BTRFS.
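Whether OMV really created a raid10 profile (rather than single or raid0) can be confirmed with `btrfs filesystem df <mountpoint>`, which prints the profile per block-group type. As an illustration against a hypothetical sample of that output (run the real command on your mounted filesystem to see the actual values):

```shell
# Hypothetical `btrfs filesystem df` output for a raid10 profile;
# the real values come from:  btrfs filesystem df <your-mountpoint>
sample='Data, RAID10: total=3.50TiB, used=1.20TiB
System, RAID10: total=64.00MiB, used=16.00KiB
Metadata, RAID10: total=2.00GiB, used=1.10GiB'

# Data, System and Metadata should all report RAID10 for a full raid10 setup.
if [ "$(printf '%s\n' "$sample" | grep -c 'RAID10')" -eq 3 ]; then
    echo "raid10 profile on data, system and metadata"
fi
```

If data shows RAID10 but metadata shows something else, `btrfs balance` with convert filters can change the profile after the fact.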

Sometimes luck is also on the side of the clueless. :wink:

I would say that is quite OK for a Raspberry Pi.

All I can say at the moment is, it works!

Regards