====== zpool detach told me "no valid replicas" ======
My ZFS pool was getting cramped, so I had been swapping its HDDs one by one for larger ones with zpool replace.
$ zpool status zdata
pool: zdata
state: ONLINE
status: One or more devices has experienced an error resulting in data
corruption. Applications may be affected.
action: Restore the file in question if possible. Otherwise restore the
entire pool from backup.
see: http://illumos.org/msg/ZFS-8000-8A
scan: resilvered 8.82T in 25h57m with 2 errors on Tue Jan 23 22:47:38 2018
config:
NAME STATE READ WRITE CKSUM
zdata ONLINE 0 0 2
raidz1-0 ONLINE 0 0 4
ada3p1 ONLINE 0 0 0
ada2p1 ONLINE 0 0 0
replacing-2 ONLINE 0 0 0
ada1p1 ONLINE 0 0 0
da0p1 ONLINE 0 0 0
ada5p1 ONLINE 0 0 0
logs
mirror-1 ONLINE 0 0 0
ada6p5 ONLINE 0 0 0
ada7p5 ONLINE 0 0 0
errors: 2 data errors, use '-v' for a list
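Here replacing-2 shows ada1p1 being replaced by da0p1. A device that is still being replaced gets a (resilvering) marker, but as you can see, da0p1 above has none. So I assumed the process had finished, and when I tried to detach ada1p1…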
# zpool detach zdata ada1p1
cannot detach ada1p1: no valid replicas
…that is what I got back, and that is where the title of this entry comes from.
~~READMORE~~
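"No valid replicas", huh. In my seventh year of using ZFS, this is the first time I have seen this error. Deleting the corrupted files reported by ''errors: 2 data errors, use '-v' for a list'' did not help. Worse still: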
# zpool status -v zdata
pool: zdata
state: ONLINE
status: One or more devices has experienced an error resulting in data
corruption. Applications may be affected.
action: Restore the file in question if possible. Otherwise restore the
entire pool from backup.
see: http://illumos.org/msg/ZFS-8000-8A
scan: resilvered 8.82T in 25h57m with 2 errors on Tue Jan 23 22:47:38 2018
config:
NAME STATE READ WRITE CKSUM
zdata ONLINE 0 0 0
raidz1-0 ONLINE 0 0 0
ada3p1 ONLINE 0 0 0
ada2p1 ONLINE 0 0 0
replacing-2 ONLINE 0 0 0
ada1p1 ONLINE 0 0 0
da0p1 ONLINE 0 0 0
ada5p1 ONLINE 0 0 0
logs
mirror-1 ONLINE 0 0 0
ada6p5 ONLINE 0 0 0
ada7p5 ONLINE 0 0 0
errors: Permanent errors have been detected in the following files:
zdata/NFC/data/Decomo:<0x196090>
What on earth is ''zdata/NFC/data/Decomo:<0x196090>''? At this point the only thing left is to run zpool scrub and pray.
# zpool scrub zdata
$ zpool status zdata
pool: zdata
state: ONLINE
status: One or more devices has experienced an error resulting in data
corruption. Applications may be affected.
action: Restore the file in question if possible. Otherwise restore the
entire pool from backup.
see: http://illumos.org/msg/ZFS-8000-8A
scan: scrub in progress since Tue Jan 23 23:39:17 2018
100G scanned out of 18.1T at 427M/s, 12h15m to go
25.0G repaired, 0.54% done
config:
NAME STATE READ WRITE CKSUM
zdata ONLINE 0 0 0
raidz1-0 ONLINE 0 0 0
ada3p1 ONLINE 0 0 0
ada2p1 ONLINE 0 0 0
replacing-2 ONLINE 0 0 0
ada1p1 ONLINE 0 0 0
da0p1 ONLINE 0 0 0 (repairing)
ada5p1 ONLINE 0 0 0
logs
mirror-1 ONLINE 0 0 0
ada6p5 ONLINE 0 0 0
ada7p5 ONLINE 0 0 0
The repair has started. So what exactly was the resilvering during the replace for? I left it alone for a while and then checked the pool status:
$ zpool status zdata
pool: zdata
state: ONLINE
status: Some supported features are not enabled on the pool. The pool can
still be used, but some features are unavailable.
action: Enable all features using 'zpool upgrade'. Once this is done,
the pool may no longer be accessible by software that does not support
the features. See zpool-features(7) for details.
scan: scrub repaired 4.51T in 15h42m with 0 errors on Wed Jan 24 15:21:24 2018
config:
NAME STATE READ WRITE CKSUM
zdata ONLINE 0 0 0
raidz1-0 ONLINE 0 0 0
ada3p1 ONLINE 0 0 0
ada2p1 ONLINE 0 0 0
da0p1 ONLINE 0 0 0
ada5p1 ONLINE 0 0 0
logs
mirror-1 ONLINE 0 0 0
ada6p5 ONLINE 0 0 0
ada7p5 ONLINE 0 0 0
errors: No known data errors
The repair finished, and the device replacement was completed along with it!
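In hindsight, the missing (resilvering) marker was a poor signal that the replace had finished; the ''replacing-2'' vdev was still sitting in the config the whole time. A minimal sketch of a check for that, assuming the status layout shown in this entry (the function name ''replace_pending'' is my own invention, not a zpool feature):

```shell
#!/bin/sh
# replace_pending: hypothetical helper, not part of ZFS.
# Reads `zpool status` text on stdin and succeeds (exit 0) if a
# "replacing-N" vdev is still listed, i.e. the replace has not completed.
replace_pending() {
    grep -Eq '^[[:space:]]*replacing-[0-9]+[[:space:]]'
}

# Usage (pool name zdata as in this entry):
#   zpool status zdata | replace_pending && echo "replace still pending"
```

It only pattern-matches the text output, so treat it as a convenience check rather than anything authoritative.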
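What I learned this time:
  * With zpool replace, the old device is detached automatically once the new device has finished resilvering
  * If possible, run a scrub and check the pool's state before doing zpool replace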
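
The second point can be sketched as a small pre-flight check. This is only a sketch under my own assumptions: ''pool_is_clean'' is a made-up helper name, and it relies on the ''errors: No known data errors'' line that zpool status printed in the healthy output above.

```shell
#!/bin/sh
# pool_is_clean: hypothetical helper, not part of ZFS.
# Reads `zpool status` text on stdin and succeeds (exit 0) only if the
# status reports "errors: No known data errors".
pool_is_clean() {
    grep -q '^[[:space:]]*errors: No known data errors'
}

# Usage (pool name and devices as in this entry):
#   zpool scrub zdata                 # start a scrub, then wait until it finishes
#   zpool status zdata | pool_is_clean \
#       && zpool replace zdata ada1p1 da0p1
```

The scrub itself runs in the background, so the check is only meaningful after the ''scan:'' line reports that the scrub has completed.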