Degraded software RAID1
One morning I find something like this in the system mail:
This is an automatically generated mail message from mdadm running on libeccio

A DegradedArray event had been detected on md device /dev/md/4.

Faithfully yours, etc.

P.S. The /proc/mdstat file currently contains the following:

Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10]
md0 : active raid1 sda1[0]
487104 blocks super 1.2 [2/1] [U_]
md5 : active raid1 sda7[0]
283077632 blocks super 1.2 [2/1] [U_]
bitmap: 2/3 pages [8KB], 65536KB chunk
md4 : active raid1 sda6[0]
4975616 blocks super 1.2 [2/1] [U_]
md2 : active raid1 sda3[0]
97590272 blocks super 1.2 [2/1] [U_]
md1 : active raid1 sda2[0] sdb2[1]
4390912 blocks super 1.2 [2/2] [UU]
md3 : active raid1 sda5[0]
97589248 blocks super 1.2 [2/1] [U_]
unused devices: <none>

Oh crap... a search through dmesg tells me:

md: bind<sda2>
[ 1.778287] hub 2-10:1.0: 2 ports detected
[ 1.779816] md: raid1 personality registered for level 1
[ 1.780237] md/raid1:md1: active with 2 out of 2 mirrors
[ 1.780265] md1: detected capacity change from 0 to 4496293888
[ 1.790760] md: bind<sda3>
[ 1.792595] md/raid1:md2: active with 1 out of 2 mirrors
[ 1.792630] md2: detected capacity change from 0 to 99932438528
[ 1.829295] md: bind<sda6>
[ 1.830683] md: bind<sdb7>
[ 1.831317] md/raid1:md4: active with 1 out of 2 mirrors
[ 1.831348] md4: detected capacity change from 0 to 5095030784
[ 1.832241] md: bind<sdb1>
[ 1.858850] md: bind<sda7>
[ 1.859793] md: kicking non-fresh sdb7 from array!
[ 1.859797] md: unbind<sdb7>
[ 1.879169] md: export_rdev(sdb7)
[ 1.880522] md/raid1:md5: active with 1 out of 2 mirrors
[ 1.982202] md: bind<sda1>
[ 2.065159] md: kicking non-fresh sdb1 from array!
[ 2.065165] md: unbind<sdb1>
[ 2.073287] md: bind<sda5>
[ 2.073889] md: kicking non-fresh sdb5 from array!
[ 2.073893] md: unbind<sdb5>
[ 2.079430] md: export_rdev(sdb1)
[ 2.080728] md/raid1:md0: active with 1 out of 2 mirrors
[ 2.080755] md0: detected capacity change from 0 to 498794496
[ 2.084402] created bitmap (3 pages) for device md5
[ 2.084601] md5: bitmap initialized from disk: read 1 pages, set 3 of 4320 bits
[ 2.091447] md: export_rdev(sdb5)

and so on. It's the first time this has happened to me and I'm not sure how to handle it; smartctl says the disks are fine, so at least on that front things are OK. A quick look around the Internet tells me this can happen, for example, when the machine goes down hard because of a power failure, and the fix is relatively simple: you kick the degraded partitions out of the array and add them back, and the rebuild happens on its own. So I proceed like this:

root@libeccio:/var/log# /sbin/mdadm /dev/md0 --fail /dev/sdb1 --remove /dev/sdb1
mdadm: set device faulty failed for /dev/sdb1: No such device
root@libeccio:/var/log# /sbin/mdadm /dev/md0 --add /dev/sdb1
mdadm: added /dev/sdb1
and the same for the other md devices with their respective partitions, roughly the sequence sketched below.
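Reconstructing from the partition-to-array pairing visible in /proc/mdstat (these exact command lines are my own sketch, not copied from the original session), the repeated step boils down to something like:

/sbin/mdadm /dev/md2 --add /dev/sdb3
/sbin/mdadm /dev/md3 --add /dev/sdb5
/sbin/mdadm /dev/md4 --add /dev/sdb6
/sbin/mdadm /dev/md5 --add /dev/sdb7

The --fail/--remove half may again fail with "No such device" when the kernel has already unbound the stale member at boot, as it did for sdb1; it's the --add that actually kicks off the rebuild.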
Doing a cat of /proc/mdstat, I find this situation:
root@libeccio:/var/log# cat /proc/mdstat
Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10]
md0 : active raid1 sdb1[2] sda1[0]
487104 blocks super 1.2 [2/2] [UU]
md5 : active raid1 sdb7[2] sda7[0]
283077632 blocks super 1.2 [2/1] [U_]
[==>.................] recovery = 14.1% (40111104/283077632) finish=34.4min speed=117694K/sec
bitmap: 1/3 pages [4KB], 65536KB chunk
md4 : active raid1 sdb6[2] sda6[0]
4975616 blocks super 1.2 [2/1] [U_]
resync=DELAYED
md2 : active raid1 sdb3[2] sda3[0]
97590272 blocks super 1.2 [2/1] [U_]
resync=DELAYED
md1 : active raid1 sda2[0] sdb2[1]
4390912 blocks super 1.2 [2/2] [UU]
md3 : active raid1 sdb5[2] sda5[0]
97589248 blocks super 1.2 [2/1] [U_]
resync=DELAYED
unused devices: <none>
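While the delayed resyncs wait their turn, the rebuild can be followed from time to time; these two commands are my own habit, not part of the original session:

# re-read /proc/mdstat periodically, e.g. every 30 seconds
watch -n 30 cat /proc/mdstat
# per-array detail, including the rebuild status
/sbin/mdadm --detail /dev/md5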
So everything is falling back into place, and I've learned something new again today.