My ZFS pool was getting cramped, so I had been swapping its HDDs one at a time for larger ones with zpool replace.
$ zpool status zdata
  pool: zdata
 state: ONLINE
status: One or more devices has experienced an error resulting in data
        corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
        entire pool from backup.
   see: http://illumos.org/msg/ZFS-8000-8A
  scan: resilvered 8.82T in 25h57m with 2 errors on Tue Jan 23 22:47:38 2018
config:

        NAME             STATE     READ WRITE CKSUM
        zdata            ONLINE       0     0     2
          raidz1-0       ONLINE       0     0     4
            ada3p1       ONLINE       0     0     0
            ada2p1       ONLINE       0     0     0
            replacing-2  ONLINE       0     0     0
              ada1p1     ONLINE       0     0     0
              da0p1      ONLINE       0     0     0
            ada5p1       ONLINE       0     0     0
        logs
          mirror-1       ONLINE       0     0     0
            ada6p5       ONLINE       0     0     0
            ada7p5       ONLINE       0     0     0

errors: 2 data errors, use '-v' for a list
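As replacing-2 shows, I was in the middle of replacing ada1p1 with da0p1. A device that is still being replaced carries a (resilvering) note, but as you can see, da0p1 above has none. So I figured the job was done and tried to detach ada1p1, only to get…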
# zpool detach zdata ada1p1
cannot detach ada1p1: no valid replicas
…and that is where this entry's title comes from.
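In hindsight, the status output above already hinted that the replace was not really finished: the replacing-2 vdev was still listed, and the scan line reported 2 errors. Below is a rough sketch of a pre-detach check along those lines. The function name replace_state and its three labels are made up for illustration, and the patterns assume the zpool status layout shown in this post, which may differ between ZFS versions.

```shell
#!/bin/sh
# Hypothetical helper (not part of ZFS): reads `zpool status` text on stdin
# and prints a one-line guess about whether a replace has really completed.
replace_state() {
    awk '
        # A "replacing-N" vdev means old and new disks are still paired up.
        $1 ~ /^replacing-[0-9]+$/ { replacing = 1 }

        # e.g. "scan: resilvered 8.82T in 25h57m with 2 errors on ..."
        $1 == "scan:" && $2 == "resilvered" {
            for (i = 3; i + 2 <= NF; i++)
                if ($i == "with" && $(i + 2) == "errors")
                    errs = $(i + 1) + 0
        }

        END {
            if (replacing && errs > 0)
                print "stuck: resilver ended with " errs " errors"
            else if (replacing)
                print "pending: replacing vdev still present"
            else
                print "done: no replacing vdev"
        }
    '
}

# Usage (assumes the pool name from this post):
#   zpool status zdata | replace_state
```

Anything other than "done" would have told me not to bother with zpool detach yet.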
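"No valid replicas," it says… Seven years into using ZFS, this is the first time I have ever seen that error. Deleting the corrupted files reported under "errors: 2 data errors, use '-v' for a list" did not help either. On the contrary…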
# zpool status -v zdata
  pool: zdata
 state: ONLINE
status: One or more devices has experienced an error resulting in data
        corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
        entire pool from backup.
   see: http://illumos.org/msg/ZFS-8000-8A
  scan: resilvered 8.82T in 25h57m with 2 errors on Tue Jan 23 22:47:38 2018
config:

        NAME             STATE     READ WRITE CKSUM
        zdata            ONLINE       0     0     0
          raidz1-0       ONLINE       0     0     0
            ada3p1       ONLINE       0     0     0
            ada2p1       ONLINE       0     0     0
            replacing-2  ONLINE       0     0     0
              ada1p1     ONLINE       0     0     0
              da0p1      ONLINE       0     0     0
            ada5p1       ONLINE       0     0     0
        logs
          mirror-1       ONLINE       0     0     0
            ada6p5       ONLINE       0     0     0
            ada7p5       ONLINE       0     0     0

errors: Permanent errors have been detected in the following files:

        zdata/NFC/data/Decomo:<0x196090>
What on earth is zdata/NFC/data/Decomo:<0x196090> supposed to be? At this point the only thing left is to run zpool scrub and pray.
# zpool scrub zdata
$ zpool status zdata
  pool: zdata
 state: ONLINE
status: One or more devices has experienced an error resulting in data
        corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
        entire pool from backup.
   see: http://illumos.org/msg/ZFS-8000-8A
  scan: scrub in progress since Tue Jan 23 23:39:17 2018
        100G scanned out of 18.1T at 427M/s, 12h15m to go
        25.0G repaired, 0.54% done
config:

        NAME             STATE     READ WRITE CKSUM
        zdata            ONLINE       0     0     0
          raidz1-0       ONLINE       0     0     0
            ada3p1       ONLINE       0     0     0
            ada2p1       ONLINE       0     0     0
            replacing-2  ONLINE       0     0     0
              ada1p1     ONLINE       0     0     0
              da0p1      ONLINE       0     0     0  (repairing)
            ada5p1       ONLINE       0     0     0
        logs
          mirror-1       ONLINE       0     0     0
            ada6p5       ONLINE       0     0     0
            ada7p5       ONLINE       0     0     0
A repair has started. So what exactly was the resilvering during the replace for…? I left it alone for a while and then checked the pool again…
$ zpool status zdata
  pool: zdata
 state: ONLINE
status: Some supported features are not enabled on the pool. The pool can
        still be used, but some features are unavailable.
action: Enable all features using 'zpool upgrade'. Once this is done,
        the pool may no longer be accessible by software that does not support
        the features. See zpool-features(7) for details.
  scan: scrub repaired 4.51T in 15h42m with 0 errors on Wed Jan 24 15:21:24 2018
config:

        NAME          STATE     READ WRITE CKSUM
        zdata         ONLINE       0     0     0
          raidz1-0    ONLINE       0     0     0
            ada3p1    ONLINE       0     0     0
            ada2p1    ONLINE       0     0     0
            da0p1     ONLINE       0     0     0
            ada5p1    ONLINE       0     0     0
        logs
          mirror-1    ONLINE       0     0     0
            ada6p5    ONLINE       0     0     0
            ada7p5    ONLINE       0     0     0

errors: No known data errors
The repair finished, and the device replacement had been completed along with it!
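Rather than eyeballing the status output every so often, the waiting could be scripted. This is only a sketch under assumptions: scrub_state and wait_for_scrub are hypothetical names of my own, the 600-second interval is arbitrary, and the matching relies on the scan line wording seen in the outputs above.

```shell
#!/bin/sh
# Hypothetical helper: reads `zpool status` text on stdin and summarizes
# the scrub state from the "scan:" line.
scrub_state() {
    awk '
        # e.g. "scan: scrub in progress since Tue Jan 23 23:39:17 2018"
        $1 == "scan:" && $2 == "scrub" && $3 == "in" && $4 == "progress" {
            state = "running"
        }
        # e.g. "scan: scrub repaired 4.51T in 15h42m with 0 errors on ..."
        $1 == "scan:" && $2 == "scrub" && $3 == "repaired" {
            state = "finished"
            for (i = 4; i + 2 <= NF; i++)
                if ($i == "with" && $(i + 2) == "errors")
                    state = "finished errors=" ($(i + 1) + 0)
        }
        END { print (state == "" ? "unknown" : state) }
    '
}

# Polls the given pool until the scrub is no longer reported as running,
# then prints the final state. Not called here; it needs a real pool.
wait_for_scrub() {
    pool=$1
    while [ "$(zpool status "$pool" | scrub_state)" = "running" ]; do
        sleep 600
    done
    zpool status "$pool" | scrub_state
}

# Usage (assumes the pool name from this post):
#   zpool scrub zdata && wait_for_scrub zdata
```

In my case this would have ended with "finished errors=0", at which point the replacing-2 vdev was already gone.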
What I learned this time: