SATA HAT RAID 5 disk problem

I have a Raspberry Pi 4 with a SATA HAT with 4 disks mounted:
1x SSD (Docker)
3x Seagate drives in RAID 5
The problem I have is that after a few days the RAID reports an error and expels one of the disks. It is always the same one (/dev/sdd). I add it back with the following command, and it works again for some days until the problem repeats.
sudo mdadm --add /dev/md0 /dev/sdd
Could you help me with what might be happening?
Attached is some information I have collected that may be of interest to you.
Thank you very much.
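As a side note: when a single disk keeps dropping out of an otherwise healthy array, the cause is often cabling, power, or the HAT's SATA controller rather than the disk itself, and the kernel log usually shows ATA/I-O errors for that device just before md kicks it out. A quick way to look is to scan the dmesg text for error lines mentioning the device; here is a minimal sketch (the function name `io_errors_for` and the sample log lines are made up for illustration, not taken from this thread):

```python
import re

def io_errors_for(device, log_text):
    """Return kernel-log lines that mention common I/O or ATA errors
    for a given device.

    `device` is a short name like "sdd"; `log_text` is e.g. the output
    of `dmesg`. This is a heuristic text scan, not a full parser.
    """
    pattern = re.compile(r"I/O error|ata\d+.*error|UNC|link reset")
    return [
        line for line in log_text.splitlines()
        if device in line and pattern.search(line)
    ]

# Made-up example lines; only the first one is an I/O error for sdd:
sample = "\n".join([
    "[ 123.4] blk_update_request: I/O error, dev sdd, sector 2048",
    "[ 123.5] md/raid:md0: Disk failure on sdd, disabling device.",
    "[ 200.0] EXT4-fs (sda1): mounted filesystem",
])
print(io_errors_for("sdd", sample))
```

Running something like this against the dmesg output captured right after a failure should show whether the drops are preceded by link resets (cable/power) or by media errors (the disk).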

Current status of the raid after the error:
pi@RPi4-NAS:~ $ sudo mdadm --query --detail /dev/md0

/dev/md0:
           Version : 1.2
     Creation Time : Fri Jun 26 11:28:37 2020
        Raid Level : raid5
        Array Size : 1953260928 (1862.77 GiB 2000.14 GB)
     Used Dev Size : 976630464 (931.39 GiB 1000.07 GB)
      Raid Devices : 3
     Total Devices : 2
       Persistence : Superblock is persistent

     Intent Bitmap : Internal

       Update Time : Mon Sep  7 03:00:55 2020
             State : clean, degraded
    Active Devices : 2
   Working Devices : 2
    Failed Devices : 0
     Spare Devices : 0

            Layout : left-symmetric
        Chunk Size : 64K

Consistency Policy : bitmap

              Name : RPi4-NAS:0  (local to host RPi4-NAS)
              UUID : 356a6b5d:57d0cf44:ad0bece1:8a270779
            Events : 2736

    Number   Major   Minor   RaidDevice State
       0       8       32        0      active sync   /dev/sdc
       -       0        0        1      removed
       3       8       64        2      active sync   /dev/sde

This is the alert that OMV sends when the error occurs:
This is an automatically generated mail message from mdadm
running on RPi4-NAS

A Fail event had been detected on md device /dev/md0.

It could be related to component device /dev/sdd.

Faithfully yours, etc.

P.S. The /proc/mdstat file currently contains the following:

Personalities : [raid6] [raid5] [raid4]
md0 : active raid5 sdd1 sde[3] sdc[0]
      1953260928 blocks super 1.2 level 5, 64k chunk, algorithm 2 [3/2] [U_U]
      bitmap: 0/8 pages [0KB], 65536KB chunk

unused devices: <none>
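For anyone reading along: the `[3/2] [U_U]` fields in /proc/mdstat already identify which member dropped. `3/2` means 3 configured slots with 2 active, and in the `[U_U]` string each `U` is a healthy member while `_` is a missing one, so slot 1 (the slot sdd occupied) is out. A small sketch of parsing that status string (the helper `degraded_slots` is hypothetical, not part of mdadm):

```python
import re

def degraded_slots(mdstat_text):
    """Map array name -> list of missing member slots, from /proc/mdstat text.

    Looks for status strings like "[3/2] [U_U]"; '_' marks a failed or
    removed slot. Returns e.g. {"md0": [1]} for a degraded array.
    """
    result = {}
    current = None
    for line in mdstat_text.splitlines():
        m = re.match(r"(md\d+) :", line)
        if m:
            current = m.group(1)
        m = re.search(r"\[\d+/\d+\] \[([U_]+)\]", line)
        if m and current:
            missing = [i for i, c in enumerate(m.group(1)) if c == "_"]
            if missing:
                result[current] = missing
    return result

# Status modeled on the mdstat dump above: slot 1 is missing.
sample = (
    "md0 : active raid5 sde[3] sdc[0]\n"
    "      1953260928 blocks super 1.2 level 5, 64k chunk, "
    "algorithm 2 [3/2] [U_U]\n"
)
print(degraded_slots(sample))  # {'md0': [1]}
```

This matches the `--detail` output above, where RaidDevice 1 shows as "removed".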


Hello, could you post some dmesg output?

Hello setq. Thank you for your answer.
I'm sorry, I don't have the dmesg output from the moment of the error. Yesterday, after opening this topic, I forced a recovery of the RAID to avoid losing data if another disk fails.
I have attached the dmesg output; it only covers the time since the last reboot. You can see the rebuild process of the RAID in it.

If you need that information from the time of the error, I will try to capture it the next time it fails.
Thank you very much for your help.

Hello @setq
It failed again. This is the log where the I/O write failure appears: