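The RAID I built a few years ago (500 GB RAID1 plus a spare) was getting cramped, so I upgraded to the same configuration using 2 TB disks.
I got three WD 2 TB drives, model WD20EARX.*1
First, partition the disks (/dev/sdc and /dev/sdd).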
root@lachesis:~# fdisk -u /dev/sdc
Device contains neither a valid DOS partition table, nor Sun, SGI or OSF disklabel
Building a new DOS disklabel with disk identifier 0x592756ff.
Changes will remain in memory only, until you decide to write them.
After that, of course, the previous content won't be recoverable.
Warning: invalid flag 0x0000 of partition table 4 will be corrected by w(rite)
The device presents a logical sector size that is smaller than
the physical sector size. Aligning to a physical sector (or optimal
I/O) size boundary is recommended, or performance may be impacted.
WARNING: DOS-compatible mode is deprecated. It's strongly recommended to
switch off the mode (command 'c').
Command (m for help): p
Disk /dev/sdc: 2000.4 GB, 2000398934016 bytes
255 heads, 63 sectors/track, 243201 cylinders, total 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disk identifier: 0x592756ff
Device Boot Start End Blocks Id System
Command (m for help): n
Command action
e extended
p primary partition (1-4)
p
Partition number (1-4): 1
First sector (63-3907029167, default 64): 64
Last sector, +sectors or +size{K,M,G} (64-3907029167, default 3907029167):
Using default value 3907029167
Command (m for help): p
Disk /dev/sdc: 2000.4 GB, 2000398934016 bytes
255 heads, 63 sectors/track, 243201 cylinders, total 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disk identifier: 0x592756ff
Device Boot Start End Blocks Id System
/dev/sdc1 64 3907029167 1953514552 83 Linux
Command (m for help): w
The partition table has been altered!
Calling ioctl() to re-read partition table.
Syncing disks.
root@lachesis:~# fdisk -u /dev/sdd
Device contains neither a valid DOS partition table, nor Sun, SGI or OSF disklabel
Building a new DOS disklabel with disk identifier 0x594661ea.
Changes will remain in memory only, until you decide to write them.
After that, of course, the previous content won't be recoverable.
Warning: invalid flag 0x0000 of partition table 4 will be corrected by w(rite)
The device presents a logical sector size that is smaller than
the physical sector size. Aligning to a physical sector (or optimal
I/O) size boundary is recommended, or performance may be impacted.
WARNING: DOS-compatible mode is deprecated. It's strongly recommended to
switch off the mode (command 'c').
Command (m for help): p
Disk /dev/sdd: 2000.4 GB, 2000398934016 bytes
255 heads, 63 sectors/track, 243201 cylinders, total 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disk identifier: 0x594661ea
Device Boot Start End Blocks Id System
Command (m for help): n
Command action
e extended
p primary partition (1-4)
p
Partition number (1-4): 1
First sector (63-3907029167, default 64):
Using default value 64
Last sector, +sectors or +size{K,M,G} (64-3907029167, default 3907029167):
Using default value 3907029167
Command (m for help): p
Disk /dev/sdd: 2000.4 GB, 2000398934016 bytes
255 heads, 63 sectors/track, 243201 cylinders, total 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disk identifier: 0x594661ea
Device Boot Start End Blocks Id System
/dev/sdd1 64 3907029167 1953514552 83 Linux
Command (m for help): w
The partition table has been altered!
Calling ioctl() to re-read partition table.
Syncing disks.
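As a quick sanity check on the alignment warning printed by fdisk: both partitions start at sector 64 with 512-byte logical sectors, so the byte offset is 64 x 512 = 32768, which is a multiple of the 4096-byte physical sector size. A minimal sketch of that arithmetic follows; the function name check_alignment is made up for illustration and is not part of fdisk or any other tool.

```shell
#!/bin/sh
# Hypothetical helper (not part of any tool): report whether a partition
# start is aligned to the physical sector size.
#   $1 = first sector, $2 = logical sector size, $3 = physical sector size (bytes)
check_alignment() {
    offset=$(( $1 * $2 ))
    if [ $(( offset % $3 )) -eq 0 ]; then
        echo "aligned"
    else
        echo "misaligned"
    fi
}

check_alignment 64 512 4096   # start sector used in this post -> aligned
check_alignment 63 512 4096   # lowest start sector fdisk offered -> misaligned
```

Sector 63 gives 32256 bytes, which leaves a remainder of 3584 when divided by 4096, so a partition starting there would straddle physical sectors.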
Next, create the RAID array.*2
root@lachesis:~# mdadm -C /dev/md1 -l1 -n2 /dev/sdc1 /dev/sdd1
Then just wait for the build to finish.*3
root@lachesis:~# cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [multipath]
md1 : active raid1 sdd1[1] sdc1[0]
1953513392 blocks super 1.2 [2/2] [UU]
[>....................] resync = 0.1% (2376768/1953513392) finish=670.1min speed=48524K/sec
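The finish estimate can be roughly reproduced by hand: remaining blocks (in KiB) divided by the current speed (in KiB/s). The sketch below uses only the numbers shown in the resync line; the function name resync_eta_min is hypothetical, and the kernel computes its own estimate differently, so the two will not always agree to the last digit.

```shell
#!/bin/sh
# Hypothetical helper: estimate remaining resync time in minutes,
# truncated to one decimal place.
#   $1 = total blocks (KiB), $2 = blocks done (KiB), $3 = speed (KiB/s)
# Assumes the shell does 64-bit integer arithmetic (true for bash and
# dash on 64-bit Linux); the intermediate value exceeds 32 bits.
resync_eta_min() {
    tenths=$(( ($1 - $2) * 10 / ($3 * 60) ))
    echo "$(( tenths / 10 )).$(( tenths % 10 ))"
}

# Values taken from the resync line in this post.
resync_eta_min 1953513392 2376768 48524   # -> 670.1
```

At about 48 MB/s a 2 TB mirror needs a little over 11 hours, which is why the wait is so long.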
Twelve hours later I copied the original data over and rebooted... and what on earth are md126 and md127?
root@lachesis:/etc# cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [multipath]
md126 : inactive sdc1[0](S)
1953513528 blocks super 1.2
md127 : active (auto-read-only) raid1 sdd[1]
1953513424 blocks super 1.2 [2/1] [_U]
md0 : active raid1 sde1[0]
488383936 blocks [2/1] [U_]
unused devices:
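Reading this kind of output by eye gets tedious when it has to be checked after every reboot. Below is a minimal sketch that lists arrays which are inactive or running with a missing member. The function name degraded_arrays is made up for illustration, and the sample input is the output from this post with one entry per line.

```shell
#!/bin/sh
# Hypothetical helper: read /proc/mdstat-style text on stdin and print
# every array that is inactive or has a missing member.
degraded_arrays() {
    awk '
        /^md[0-9]+ :/ {
            name = $1
            if ($3 == "inactive") print name, "inactive"
            next
        }
        # Status line ending in a member map such as [_U] or [UU]
        /\[[U_]+\]$/ {
            if ($NF ~ /_/) print name, "degraded"
        }
    '
}

# Sample input copied from this post.
# On a live system: degraded_arrays < /proc/mdstat
degraded_arrays <<'EOF'
md126 : inactive sdc1[0](S)
1953513528 blocks super 1.2
md127 : active (auto-read-only) raid1 sdd[1]
1953513424 blocks super 1.2 [2/1] [_U]
md0 : active raid1 sde1[0]
488383936 blocks [2/1] [U_]
EOF
```

For the sample this reports md126 as inactive and both md127 and md0 as degraded, which matches the situation described: the new mirror has split into two pieces, and the old md0 is running on one disk.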
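Searching around, I found reports that mdadm 3.1.4 has a bug.
My environment has... 3.1.5.
Running "slackpkg update ; slackpkg upgrade mdadm" brought it up to 3.2.5, but nothing changed.
I also tried appending the output of "mdadm --detail --scan" to /etc/mdadm.conf, with no effect.
Thinking about what differs from the old RAID, the only things I could come up with were the metadata version and the capacity.
Nothing can be done about the capacity, so I rebuilt the array with the metadata version set to 0.9 and... bingo.
The 0.90 format apparently has limits on the number of volumes and on volume size,*4 but they are not a problem here.
mdadm -C /dev/md1 -l1 -n2 --metadata=0.9 /dev/sdc1 /dev/sdd1
After a reboot I confirmed that /dev/md1 was preserved, then waited half a day for the resync to finish.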
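To put a number on "not a problem here": the mdadm(8) man page describes the 0.90 format as limiting arrays to 28 component devices and, for RAID level 1 and above, component devices to 2 terabytes. Those figures come from the man page, not from this post. The sketch below assumes "2 terabytes" means 2 TiB (2^31 blocks of 1 KiB); that reading is an assumption, and the function name fits_metadata_090 is hypothetical.

```shell
#!/bin/sh
# Assumption: the 0.90 per-device limit is read as 2 TiB,
# i.e. 2 * 1024^3 blocks of 1 KiB.
LIMIT_KIB=$(( 2 * 1024 * 1024 * 1024 ))

# Hypothetical helper: $1 = component size in 1 KiB blocks
fits_metadata_090() {
    if [ "$1" -le "$LIMIT_KIB" ]; then
        echo "fits"
    else
        echo "too large"
    fi
}

# Partition size of /dev/sdc1 as reported by fdisk in this post.
fits_metadata_090 1953514552   # -> fits
```

The partition is about 1.82 TiB, so it stays under that limit, but a 3 TB drive would not.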
Finally, add the spare disk.
root@lachesis:~# mdadm /dev/md1 --add /dev/sdf1
mdadm: added /dev/sdf1
Check the RAID status, and done.
root@lachesis:~# cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [multipath]
md1 : active raid1 sdf1[2](S) sde1[1] sdc1[0]
1953513472 blocks [2/2] [UU]
unused devices: <none>
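Update, 2012/09/11
After a reboot, /dev/md127 appeared again.
It looks like the disk I added last as a spare has wandered off on its own.
Hmm...
Update, 2012/09/15
I turned the stray disk into a RAID1 array on its own.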
root@lachesis:~# mdadm -S /dev/md127
root@lachesis:~# mdadm -C /dev/md127 -l1 -n2 /dev/sde1 missing
After a reboot it comes up read-only.
Maybe because it was created in a degraded state, with one half missing?
root@lachesis:~# cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [multipath]
md127 : active (auto-read-only) raid1 sde1[0]
1953382336 blocks super 1.2 [2/1] [U_]
md1 : active raid1 sdd1[1] sdc1[0]
1953513472 blocks [2/2] [UU]
Force it to read-write and try initializing it.
root@lachesis:~# mdadm --readwrite /dev/md127
Update, 2012/09/17
Pull one disk out of /dev/md1, add it to /dev/md127, and resync.
root@lachesis:~# mdadm /dev/md1 --fail /dev/sdd1
mdadm: set /dev/sdd1 faulty in /dev/md1
root@lachesis:~# cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [multipath]
md127 : active raid1 sde1[0]
1953382336 blocks super 1.2 [2/1] [U_]
md1 : active raid1 sdd1[2](F) sdc1[0]
1953513472 blocks [2/1] [U_]
unused devices: <none>
root@lachesis:~# mdadm /dev/md1 --remove /dev/sdd1
mdadm: hot removed /dev/sdd1 from /dev/md1
root@lachesis:~# cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [multipath]
md127 : active raid1 sde1[0]
1953382336 blocks super 1.2 [2/1] [U_]
md1 : active raid1 sdc1[0]
1953513472 blocks [2/1] [U_]
unused devices: <none>
root@lachesis:~# mdadm /dev/md127 --add /dev/sdd1
mdadm: added /dev/sdd1
root@lachesis:~# cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [multipath]
md127 : active raid1 sdd1[2] sde1[0]
1953382336 blocks super 1.2 [2/1] [U_]
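[>....................] recovery = 0.0% (74496/1953382336) finish=2184.5min speed=14899K/sec
md1 : active raid1 sdc1[0]
1953513472 blocks [2/1] [U_]
unused devices: <none>
After a reboot... no good.
/dev/sde1, which had been in /dev/md127 from the start, is no longer visible; the array is back to a degraded, auto-read-only state with only /dev/sdd,*5 so nothing has changed.
I put the pulled disk back where it was and went back to scratching my head.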
root@lachesis:~# cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [multipath]
md127 : active (auto-read-only) raid1 sdd[1]
1953513424 blocks super 1.2 [2/1] [_U]
md1 : active raid1 sdc1[0]
1953513472 blocks [2/1] [U_]
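One detail visible in this output is that md127 lists sdd (the whole disk) as its member, not the partition sdd1 that was actually added. Whether that explains the behaviour is not established here, but such members are easy to overlook. The sketch below flags them; the function name whole_disk_members is made up, and it assumes sd-style device names where a partition name ends in a digit (so it would misjudge names like nvme0n1).

```shell
#!/bin/sh
# Hypothetical helper: read /proc/mdstat-style text on stdin and print
# "array member" for every member that looks like a whole disk.
# Assumption: sd-style names, where partitions end in a digit.
whole_disk_members() {
    awk '/^md[0-9]+ :/ {
        for (i = 3; i <= NF; i++) {
            # Member fields look like sdc1[0] or sdf1[2](S)
            if ($i ~ /^[a-z]+[0-9]*\[[0-9]+\]/) {
                dev = $i
                sub(/\[.*/, "", dev)
                if (dev !~ /[0-9]$/) print $1, dev
            }
        }
    }'
}

# Sample input copied from this post.
# On a live system: whole_disk_members < /proc/mdstat
whole_disk_members <<'EOF'
md127 : active (auto-read-only) raid1 sdd[1]
1953513424 blocks super 1.2 [2/1] [_U]
md1 : active raid1 sdc1[0]
1953513472 blocks [2/1] [U_]
EOF
```

For the sample this prints "md127 sdd" and nothing for md1, whose only member is a partition.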