1 Reply. Latest reply: Nov 7, 2012 12:12 PM by 898481

    Solaris ZFS partition/write alignment

    898481
      On Solaris 11 with the latest patches, I ran some tests, but something is not clear to me.

      I partitioned three disks (512-byte sectors) as follows:

      0. c0t5000C5000568999Bd0 <SEAGATE-ST3600057SS-0008-558.91GB>
      /scsi_vhci/disk@g5000c5000568999b
      /dev/chassis/LSI-CORP-SAS2X36.500304800000007f/Slot_01/disk
      1. c0t5000C5000568B493d0 <SEAGATE-ST3600057SS-0008-558.91GB>
      /scsi_vhci/disk@g5000c5000568b493
      /dev/chassis/LSI-CORP-SAS2X36.500304800000007f/Slot_02/disk
      2. c0t5000C50005689AAFd0 <SEAGATE-ST3600057SS-0008-558.91GB>
      /scsi_vhci/disk@g5000c50005689aaf
      /dev/chassis/LSI-CORP-SAS2X36.500304800000007f/Slot_03/disk

      Partitioning data:

      Total disk sectors available: 1172107117 + 16384 (reserved sectors)

      Part  Tag         Flag  First Sector  Size      Last Sector
      0     usr         wm    256           558.90GB  1172107150
      1     unassigned  wm    0             0         0
      2     unassigned  wm    0             0         0
      3     unassigned  wm    0             0         0
      4     unassigned  wm    0             0         0
      5     unassigned  wm    0             0         0
      6     unassigned  wm    0             0         0
      7     unassigned  wm    0             0         0
      8     reserved    wm    1172107151    8.00MB    1172123534
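
As a quick sanity check on the layout above, the byte offset at which slice 0 starts can be computed from its first sector. This is only a minimal sketch: it assumes 512-byte sectors and an 8 KB target boundary, as used in the tests in this post, and the variable names are illustrative.

```shell
#!/bin/sh
# Check whether slice 0 starts on an 8 KB boundary.
# Assumptions: 512-byte sectors, 8 KB (8192-byte) alignment target.
first_sector=256
sector_size=512
boundary=8192

offset=$((first_sector * sector_size))   # byte offset of the slice start
remainder=$((offset % boundary))         # 0 means the slice start is aligned

echo "slice 0 starts at byte $offset, remainder $remainder"
```

With the values from the table this gives a start at byte 131072 and a remainder of 0, so the slice itself begins on an 8 KB boundary.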

      I then used the script from the article (http://www.oracle.com/technetwork/articles/systems-hardware-architecture/lun-alignment-163801.pdf) to monitor alignment.

      TEST 1: writing directly to disk partition
      dd if=/dev/zero of=/dev/rdsk/c0t5000C5000568999Bd0p0 bs=8k count=16384

      The aforementioned script reports correct alignment: all blocks are written on boundaries, with no misaligned writes.
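
For reference, the two properties being checked can be expressed as simple sector arithmetic. The helpers below are hypothetical and are not taken from the article's script; they assume 512-byte sectors (so one 8 KB block is 16 sectors) and that "aligned" means the starting sector is a multiple of 16.

```shell
#!/bin/sh
# Hypothetical helpers mirroring the two counters in the script's summary.
# Assumption: 512-byte sectors, so one 8 KB block is 16 sectors.
SECTORS_PER_8K=16

# Sector offset of an I/O's start from the previous 8 KB boundary (0 = aligned).
start_offset() {
    echo $(( $1 % SECTORS_PER_8K ))
}

# Prints "yes" when the I/O length (in sectors) is a whole number of 8 KB blocks.
is_8k_multiple() {
    if [ $(( $1 % SECTORS_PER_8K )) -eq 0 ]; then echo yes; else echo no; fi
}

start_offset 256      # start of slice 0
start_offset 259      # three sectors past a boundary
is_8k_multiple 16     # one 8 KB write
is_8k_multiple 8      # one 4 KB write
```

The four calls print 0, 3, yes and no respectively.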

      TEST 2: writing to a ZFS file system

      Created a zpool with:
      zpool create -f -O recordsize=8k SAS_Pool_1 raidz c0t5000C5000568999Bd0 c0t5000C5000568B493d0 c0t5000C50005689AAFd0

      And wrote a file:

      dd if=/dev/zero of=/SAS_Pool_1/test bs=8k count=16384
      The aforementioned script reports that most writes (90%) are unaligned (2, 3, 4, 5 blocks).

      I also tried creating the ZFS pool from partitions instead of whole disks (i.e. c0t5000C5000568999Bd0p0), but the result does not change: most writes are unaligned.

      I checked ashift and it is 9, which is correct for 512-byte sectors.
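
The relationship between ashift and sector size is a power of two, which the sketch below spells out. It assumes only the value 9 reported in this post; the variable names are illustrative.

```shell
#!/bin/sh
# ashift is the base-2 logarithm of the smallest I/O size ZFS issues to a vdev.
ashift=9
min_io=$((1 << ashift))       # 2^9 = 512 bytes
per_8k=$((8192 / min_io))     # number of such sectors in one 8 KB record

echo "ashift=$ashift -> ${min_io}-byte sectors, $per_8k sectors per 8 KB"
```

So with ashift=9 an 8 KB record spans 16 sectors, and ZFS is free to start an I/O on any 512-byte boundary.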

      Why does writing a file in 8 KB blocks to a ZFS file system with an 8 KB stripe (recordsize=8k) issue unaligned writes to the disk?
      I understand that ZFS uses a variable block size, but in theory the blocks written here are 8k, which is also the stripe size (8k). I also disabled compression and dedup, but the result is still strange:

      value ------------- Distribution ------------- count
      -1 | 0
      0 |@@@@@@ 1519
      1 |@@@@@ 1356
      2 |@@@@@@@@@@ 2694
      4 |@@@@@@@@@@@@@@@@@@@ 5117
      8 | 0

      45.091028214 seconds elapsed

      10672 IOs issued
      9140 IOs misaligned, 3 block offset 1701 byte offset
      10030 IOs non-multiple of 8KB
      93 Percent non-8k IOs


      I also tried with mirrors and nothing changes. The other strange thing is that 93% of the I/Os are not 8k. Why? dd is writing 8k, and compression and dedup are off.

      Thanks for any suggestion/clarification

      Edited by: user3484288 on 7-nov-2012 9.14

      Edited by: user3484288 on 7-nov-2012 9.19
        • 1. Re: Solaris ZFS partition/write alignment
          898481
          I also ran another test with a zvol.

          Created a volume with an 8k block size (volblocksize): zfs create -V20G -o volblocksize=8k SAS_Pool_1/LUN_0

          And then issued the dd to the volume: dd if=/dev/zero of=/dev/zvol/rdsk/SAS_Pool_1/LUN_0 bs=8k count=1024000

          Here things get better, but they are still odd (50% non-8k, 30% misaligned):

          IO misalignment Count


          value ------------- Distribution ------------- count
          -1 | 0
          0 |@@@@@@@@@@@@@@@@@@@@@@@ 36266
          1 |@@@@ 6760
          2 |@@ 3619
          4 |@@@@@@@@@@ 16325
          8 | 0

          64.200860134 seconds elapsed

          62962 IOs issued
          26681 IOs misaligned, 1 block offset 802 byte offset
          32268 IOs non-multiple of 8KB
          51 Percent non-8k IOs
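
The zvol figures can be cross-checked the same way, this time also summing the histogram buckets. This is a sketch only: treating bucket 0 as the count of aligned I/Os is an assumption about how the script builds its distribution.

```shell
#!/bin/sh
# Counters and histogram buckets copied from the zvol run above.
total=62962
misaligned=26681
non_8k=32268
bucket0=36266                                      # assumed: I/Os with offset 0
bucket_sum=$((bucket0 + 6760 + 3619 + 16325))      # buckets 0, 1, 2, 4

echo "misaligned:      $((misaligned * 100 / total))%"
echo "non-8k:          $((non_8k * 100 / total))%"
echo "histogram total: $bucket_sum"
echo "bucket 0 share:  $((bucket0 * 100 / bucket_sum))%"
```

The histogram total comes out slightly higher than the "IOs issued" counter, so the two outputs were probably not sampled at exactly the same moment.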

          Edited by: user3484288 on 7-nov-2012 10.12