Watch the ashift when adding a ZFS slog, too

I ran zpool status and found the message "block size: 512B configured, 4096B native" in the output.

$ zpool status zhome
  pool: zhome
 state: ONLINE
status: One or more devices are configured to use a non-native block size.
        Expect reduced performance.
action: Replace affected devices with devices that support the
        configured block size, or migrate data to a properly configured
        pool.
  scan: resilvered 389G in 2h31m with 0 errors on Sun Mar 29 16:58:56 2015
config:

        NAME        STATE     READ WRITE CKSUM
        zhome       ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            ada0p1  ONLINE       0     0     0
            ada3p1  ONLINE       0     0     0
        logs
          mirror-1  ONLINE       0     0     0
            ada1p3  ONLINE       0     0     0  block size: 512B configured, 4096B native
            ada9p3  ONLINE       0     0     0  block size: 512B configured, 4096B native
        cache
          ada9p5    ONLINE       0     0     0

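That's the warning that a device with physical 4k sectors is being accessed in 512-byte units, so performance may be suffering. ZFS goes out of its way to report this, which is divine. Then again, it turns out the slog's block size isn't tied to the ashift of the pool's main vdevs...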
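When the status listing gets long, a small filter can pull out just the affected devices. This is a minimal sketch: the function name non_native_devs is made up here, and it assumes the annotation has exactly the "block size: ... configured, ... native" form shown above.

```shell
# Hypothetical helper: read `zpool status` output on stdin and print the
# device name, configured size, and native size for each annotated device.
non_native_devs() {
    # Only per-device lines carry "block size:" with a colon; the status
    # and action text at the top says "block size." or "block size,".
    awk '/block size:/ { sub(/,$/, "", $9); print $1, $8, $10 }'
}
```

It would be used as `zpool status zhome | non_native_devs`, which for the listing above should print lines such as "ada1p3 512B 4096B".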

So I removed the slog for the time being, used gnop as usual to present the partitions as 4k-sector devices, and recreated the slog.

$ sudo zpool remove zhome mirror-1 
$ zpool status zhome
  pool: zhome
 state: ONLINE
status: Some supported features are not enabled on the pool. The pool can
        still be used, but some features are unavailable.
action: Enable all features using 'zpool upgrade'. Once this is done,
        the pool may no longer be accessible by software that does not support
        the features. See zpool-features(7) for details.
  scan: resilvered 389G in 2h31m with 0 errors on Sun Mar 29 16:58:56 2015
config:
 
        NAME        STATE     READ WRITE CKSUM
        zhome       ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            ada0p1  ONLINE       0     0     0
            ada3p1  ONLINE       0     0     0
        cache
          ada9p5    ONLINE       0     0     0
 
errors: No known data errors
 
$ sudo gnop create -S 4096 /dev/ada1p3
$ sudo gnop create -S 4096 /dev/ada9p3
$ sudo zpool add zhome log mirror ada1p3.nop ada9p3.nop
$ zpool status zhome
  pool: zhome
 state: ONLINE
status: Some supported features are not enabled on the pool. The pool can
        still be used, but some features are unavailable.
action: Enable all features using 'zpool upgrade'. Once this is done,
        the pool may no longer be accessible by software that does not support
        the features. See zpool-features(7) for details.
  scan: resilvered 389G in 2h31m with 0 errors on Sun Mar 29 16:58:56 2015
config:
 
        NAME            STATE     READ WRITE CKSUM
        zhome           ONLINE       0     0     0
          mirror-0      ONLINE       0     0     0
            ada0p1      ONLINE       0     0     0
            ada3p1      ONLINE       0     0     0
        logs
          mirror-1      ONLINE       0     0     0
            ada1p3.nop  ONLINE       0     0     0
            ada9p3.nop  ONLINE       0     0     0
        cache
          ada9p5        ONLINE       0     0     0
 
errors: No known data errors

Just to be sure, I checked with zdb (only the relevant part of the output is shown).

children[1]:
    type: 'mirror'
    id: 1
    guid: 12556410678957656274
    metaslab_array: 52
    metaslab_shift: 26
    ashift: 12
    asize: 8585216000
    is_log: 1
    create_txg: 13992340
    children[0]:
        type: 'disk'
        id: 0
        guid: 10904489644541021375
        path: '/dev/ada1p3.nop'
        phys_path: '/dev/ada1p3.nop'
        whole_disk: 1
        create_txg: 13992340
    children[1]:
        type: 'disk'
        id: 1
        guid: 18335897789017456858
        path: '/dev/ada9p3.nop'
        phys_path: '/dev/ada9p3.nop'
        whole_disk: 1
        create_txg: 13992340

It says ashift: 12, so this should be fine. The nop devices should disappear on their own at the next reboot, so I'm leaving them alone.
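For reference, ashift is the base-2 exponent of the block size ZFS uses for a vdev, so 12 corresponds to 4096 bytes and 9 to 512 bytes. A quick sanity check of the arithmetic (the function name ashift_to_bytes is just an illustration):

```shell
# Convert an ashift value to a block size in bytes (2 to the power of ashift).
ashift_to_bytes() {
    echo $((1 << $1))
}

ashift_to_bytes 12   # prints 4096
ashift_to_bytes 9    # prints 512
```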

While I was at it, I checked my other pool too...

$ zpool status zdata
  pool: zdata
 state: ONLINE
status: One or more devices are configured to use a non-native block size.
        Expect reduced performance.
action: Replace affected devices with devices that support the
        configured block size, or migrate data to a properly configured
        pool.
  scan: scrub canceled on Sun Mar 22 10:40:27 2015
config:

        NAME         STATE     READ WRITE CKSUM
        zdata        ONLINE       0     0     0
          raidz1-0   ONLINE       0     0     0
            ada5p1   ONLINE       0     0     0
            ada10p1  ONLINE       0     0     0
            ada12p1  ONLINE       0     0     0
          raidz1-1   ONLINE       0     0     0
            ada14p1  ONLINE       0     0     0  block size: 512B configured, 4096B native
            ada8p1   ONLINE       0     0     0  block size: 512B configured, 4096B native
            ada15p1  ONLINE       0     0     0  block size: 512B configured, 4096B native
        logs
          mirror-2   ONLINE       0     0     0
            ada1p4   ONLINE       0     0     0  block size: 512B configured, 4096B native
            ada9p4   ONLINE       0     0     0  block size: 512B configured, 4096B native
        cache
          ada1p5     ONLINE       0     0     0
        spares
          ada2p1     AVAIL

errors: No known data errors

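Ohhh... raidz1-1, one half of the RAID-Z stripe, has splendidly ended up at ashift=9. I added raidz1-1 later without thinking about ashift, or rather without knowing that ashift is (apparently?) set per vdev, so of course it did. I thought I'd been careful about AFT all along, but I never expected a trap like this orz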
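Since ashift seems to be recorded per top-level vdev, it can help to list them all at once from the zdb config dump. The following is a rough sketch that assumes the dump has the same layout as the excerpt above (type, then id, then ashift for each top-level vdev). The function name vdev_ashifts is invented for this example, and feeding it from `zdb -C zdata` is my assumption about how the dump is obtained.

```shell
# Hypothetical helper: read a zdb config dump on stdin and print one line
# per vdev that records an ashift, in the form "<type>-<id> ashift=<n>".
vdev_ashifts() {
    awk -v q="'" '
        $1 == "type:"   { t = $2; gsub(q, "", t) }   # strip the quotes
        $1 == "id:"     { id = $2 }
        $1 == "ashift:" { print t "-" id, "ashift=" $2 }
    '
}
```

Used as `zdb -C zdata | vdev_ashifts`. The label is built from the type and id fields, so it may not match the name zpool status shows exactly (a RAID-Z vdev would likely come out as raidz-1 rather than raidz1-1).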

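Still, this is a problem. zdata is a pool of six 3TB HDDs, so I can't casually remove and recreate it the way I did with the slog. I considered adding an ashift=12 mirror to raidz1-1, but I suspect the mirror would just inherit raidz1-1's ashift. I don't even know whether raidz1-1 can be turned into a mirror in the first place. Am I stuck?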




  • blog/2015/2015-05-20.txt
  • Last updated: 2015-05-20 16:27
  • by Decomo