5 Replies Latest reply on Apr 12, 2018 5:44 PM by Cindys-Oracle

    vdev removal, poolwide checkpoints/snaps or improved recovery of damaged pools

    899664

      Solaris was and is the most feature-rich ZFS platform.

      Any chance we will see these new OpenZFS improvements in Solaris 11.4, as happened with LZ4 in the last release?

        • 1. Re: vdev removal, poolwide checkpoints/snaps or improved recovery of damaged pools
          Cindys-Oracle

          I agree!

           

          Solaris 11.4 beta has a robust set of ZFS features that you can read about in my blog here:

           

          https://blogs.oracle.com/zfs/oracle-solaris-114-data-management-features

           

          This is just a partial list of the high-level, visible features, and more is coming in the S11.4 beta refresh release.

           

          You can read about the S11.4 device removal features in this blog:

           

          Robert Milkowski's blog: ZFS: Device Removal

           

          It's not entirely clear to me that poolwide checkpoints are necessary in Solaris/ZFS land if their purpose is to make it easier to roll back during OS upgrades, since Solaris/ZFS supports bootable OS environments (BEs). You can either roll back to the BE snapshot, roll back the BE itself, or even mount the BE and copy out the files that you need. Let me know if I'm missing something.
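
          For example, the BE rollback workflow looks roughly like this: snapshot the current BE before the upgrade, turn that snapshot into a new BE if you need to go back, and activate it (or simply mount it and copy files out). The BE and snapshot names below are only illustrations:

          # beadm create solaris@pre-upgrade
          # beadm create -e solaris@pre-upgrade solaris-rollback
          # beadm activate solaris-rollback
          # beadm mount solaris-rollback /mnt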

           

          We continue to improve pool recovery and I covered this topic recently:

           

          https://blogs.oracle.com/zfs/zfs-data-and-pool-recovery

           

          We'll be updating blogs with new ZFS features when the S11.4 beta release is refreshed soon.

           

          Thanks, Cindy

          • 2. Re: vdev removal, poolwide checkpoints/snaps or improved recovery of damaged pools
            899664

            The poolwide checkpoint feature is much more than a BE.
            It captures the entire state of the pool: even if you add or destroy a vdev afterwards, you can go back to the checkpointed state during a pool import.
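
            For reference, the OpenZFS implementation drives this from the command line roughly as follows (pool and device names are only placeholders): take the checkpoint, make whatever changes you like, and then either rewind to the checkpoint at import time or discard it with zpool checkpoint -d once you decide to keep the changes.

            # zpool checkpoint tank
            # zpool add tank c9t9d0
            # zpool export tank
            # zpool import --rewind-to-checkpoint tank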

             

            Compare:

            https://www.illumos.org/issues/9166

            Pool recovery also seems to be new:
            https://www.delphix.com/de/node/1231

             

            By the way:
            OpenZFS cannot remove a vdev if there is a raidz[1-3] vdev in the pool, and
            once you have removed a vdev you cannot add a raidz[1-3] vdev afterwards.

             

            Does this limitation also exist in Solaris?

            • 3. Re: vdev removal, poolwide checkpoints/snaps or improved recovery of damaged pools
              Cindys-Oracle

              Fair enough.

               

              I see recovery tools developed by our support team that are not necessarily exposed externally.

               

              I'm unsure about your RAIDZ device removal question. Are you asking about removing and re-adding a top-level VDEV of 3 devices, or about being able to mix VDEV sizes and then correct that configuration? See the examples of various removal scenarios below.

               

              Thanks, Cindy

               

              # zpool status rzpool

                pool: rzpool

              state: ONLINE

                scan: none requested

              config:

               

                      NAME        STATE      READ WRITE CKSUM

                      rzpool      ONLINE        0    0    0

                        raidz1-0  ONLINE        0    0    0

                          c5t3d0  ONLINE        0    0    0

                          c2t7d0  ONLINE        0    0    0

                          c1t3d0  ONLINE        0    0    0

                        raidz1-1  ONLINE        0    0    0

                          c5t7d0  ONLINE        0    0    0

                          c2t3d0  ONLINE        0    0    0

                          c1t7d0  ONLINE        0    0    0

               

              errors: No known data errors

              # zpool remove rzpool raidz1-1

              # zpool status rzpool

                pool: rzpool

              state: ONLINE

                scan: resilvered 1K in 1s with 0 errors on Wed Apr 11 16:49:30 2018

               

              config:

               

                      NAME                      STATE      READ WRITE CKSUM

                      rzpool                    ONLINE        0    0    0

                        raidz1-0                ONLINE        0    0    0

                          c5t3d0                ONLINE        0    0    0

                          c2t7d0                ONLINE        0    0    0

                          c1t3d0                ONLINE        0    0    0

               

              errors: No known data errors

              # zpool add rzpool raidz c5t7d0 c2t3d0 c1t7d0

              # zpool status rzpool

                pool: rzpool

              state: ONLINE

                scan: resilvered 1K in 1s with 0 errors on Wed Apr 11 16:49:30 2018

               

              config:

               

                      NAME                      STATE      READ WRITE CKSUM

                      rzpool                    ONLINE        0    0    0

                        raidz1-0                ONLINE        0    0    0

                          c5t3d0                ONLINE        0    0    0

                          c2t7d0                ONLINE        0    0    0

                          c1t3d0                ONLINE        0    0    0

                        raidz1-2                ONLINE        0    0    0

                          c5t7d0                ONLINE        0    0    0

                          c2t3d0                ONLINE        0    0    0

                          c1t7d0                ONLINE        0    0    0

               

              errors: No known data errors

              # zpool remove rzpool raidz1-0 

              # zpool status rzpool

                pool: rzpool

              state: ONLINE

                scan: resilvered 28K in 1s with 0 errors on Wed Apr 11 17:01:12 2018

               

              config:

               

                      NAME                      STATE      READ WRITE CKSUM

                      rzpool                    ONLINE        0     0     0

                        raidz1-2                ONLINE        0     0     0

                          c5t7d0                ONLINE        0     0     0

                          c2t3d0                ONLINE        0     0     0

                          c1t7d0                ONLINE        0     0     0

               

              errors: No known data errors

              Add a raidz vdev with the wrong number of devices:

              # zpool add rzpool raidz c5t3d0 c2t7d0

              # zpool status rzpool

                pool: rzpool

              state: ONLINE

                scan: resilvered 28K in 1s with 0 errors on Wed Apr 11 17:01:12 2018

               

              config:

               

                      NAME                      STATE      READ WRITE CKSUM

                      rzpool                    ONLINE        0    0    0

                        raidz1-2                ONLINE        0    0    0

                          c5t7d0                ONLINE        0    0    0

                          c2t3d0                ONLINE        0    0    0

                          c1t7d0                ONLINE        0    0    0

                        raidz1-3                ONLINE        0    0    0

                          c5t3d0                ONLINE        0    0    0

                          c2t7d0                ONLINE        0    0    0

               

              errors: No known data errors

              # zpool remove rzpool raidz1-3

              # zpool status rzpool

                pool: rzpool

              state: ONLINE

                scan: resilvered 512 in 1s with 0 errors on Wed Apr 11 17:02:23 2018

               

              config:

               

                      NAME                      STATE      READ WRITE CKSUM

                      rzpool                    ONLINE        0    0    0

                        raidz1-2                ONLINE        0    0    0

                          c5t7d0                ONLINE        0    0    0

                          c2t3d0                ONLINE        0    0    0

                          c1t7d0                ONLINE        0    0    0

               

              errors: No known data errors

              • 4. Re: vdev removal, poolwide checkpoints/snaps or improved recovery of damaged pools
                899664

                Marvellous

                 

                OpenZFS, at least currently, lacks support for removing a basic or mirror vdev when a raidz[1-3] vdev is part of the pool, for removing a raidz[1-3] vdev at all, and for adding a raidz[1-3] vdev after, e.g., a basic/mirror vdev has been removed, which limits its use cases. Support for raidz[2-3] is expected in OpenZFS (but not raidz1).


                OpenZFS (e.g. OmniOS, which was the first to include this feature) also requires a re-mapping table, with a continuous small RAM need/reservation and a small performance degradation.

                 

                Is it the case that Solaris 11.4 does not suffer from this?
                At least in zpool status there is no hint of it (OpenZFS writes a note about a former vdev removal and the RAM it requires into the zpool status header, similar to the scan: note).

                 

                Update:
                A manual remap on OpenZFS can remove the remapping entries that are no longer needed.
                I suppose Solaris does this automatically?
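
                (In the OpenZFS bits that ship this, the remap subcommand appears to be run per dataset; a rough example, with a purely illustrative dataset name:)

                # zfs remap tank/data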

                • 5. Re: vdev removal, poolwide checkpoints/snaps or improved recovery of damaged pools
                  Cindys-Oracle

                  Yes, we do all the device remapping automatically.

                   

                  Thanks, Cindy