7 Replies Latest reply: Apr 17, 2013 5:29 PM by 933584

    ZFS Read performance vs write

    933584
      We noticed an issue with read performance while trying to keep our LTO5 tape fed at the speed they recommend (100 MB/s). Our average speed was about 90 MB/s and it bounced around a lot. We have 6 Gb/s 10K RPM drives in a four-drive striped mirror (RAID 10). Each drive should be able to sustain well over 90 MB/s, and when reading from a RAID 10 of those drives we expected about 400 MB/s.

      The odd part is that we can write to the drives at around 370 MB/s, which is about right, and writes are normally the slower direction. So I am confused why reads are stuck below 100 MB/s.

      We watched iostat on the drives and noticed that when writing a file, all four drives light up to their max write speed, approximately 160 to 190 MB/s, giving us combined speeds of around 370 MB/s. However, when reading a file, iostat only shows one drive at a time being accessed, at a slow rate. Given that we are striped across two mirrors, you would think at least two drives would be lit at minimum, yet it seems like it reads a little from drive 1, then a little from drive 2, etc.
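
      (We were watching per-drive activity with something like iostat -Mx sd2 sd3 sd4 sd5 1; actual iostat samples are further down in this thread.)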

      In testing we disabled every attribute we could think of in ZFS to increase speed (checksum, log, compression, etc., roughly along the lines of the commands below), but none of them increased read speed.
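
      (For reference, the sort of commands involved were along these lines; the dataset name here is only an example and these are test settings, not recommendations:)

      zfs set checksum=off tank/somedataset
      zfs set compression=off tank/somedataset
      zfs set sync=disabled tank/somedataset
      zfs set logbias=throughput tank/somedataset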

      This is not a slow machine: a Core i7 with 64 GB of RAM. In testing on several other Solaris machines, all 11.1 boxes exhibited this issue (I don't have a non-11.1 box to test with).

      Anyone else seen this behavior?

      Thanks,
      Tom
        • 1. Re: ZFS Read performance vs write
          Nik
          Hi.


          Please show the output of:

          zpool status

          What command do you use for the read?
          Also run iostat -xnz 2 4 while reading the data and show that output.

          Regards.
          • 2. Re: ZFS Read performance vs write
            933584
            Here are the reads. I'll post the write speeds after this. This is the backup server we are struggling to feed the LTO5 from at a proper speed. It does not show the "rippled read" pattern of the other servers, but each drive still reads very slowly.
            zpool status
              pool: rpool
            state: ONLINE
              scan: resilvered 24.7M in 0h0m with 0 errors on Thu Mar 14 19:13:47 2013
            config:
            
                    NAME        STATE     READ WRITE CKSUM
                    rpool       ONLINE       0     0     0
                      mirror-0  ONLINE       0     0     0
                        c8t0d0  ONLINE       0     0     0
                        c8t1d0  ONLINE       0     0     0
            
            errors: No known data errors
            
              pool: tank
            state: ONLINE
              scan: none requested
            config:
            
                    NAME        STATE     READ WRITE CKSUM
                    tank        ONLINE       0     0     0
                      mirror-0  ONLINE       0     0     0
                        c8t2d0  ONLINE       0     0     0
                        c8t3d0  ONLINE       0     0     0
                      mirror-1  ONLINE       0     0     0
                        c8t4d0  ONLINE       0     0     0
                        c8t5d0  ONLINE       0     0     0
            The read command, against a 424 GB file:
            dd if=/tank/backup/scratch/backup/rsync\@zfs-auto-snap_weekly-2013-04-13-2302.gz of=/dev/null
            The dd above completes at about 105 MB/s.

            dd straight from one drive is 180 MB/s.

            Sample of iostat -Mx sd1 sd2 sd3 sd4 sd5 1 (pretty much ~25-30 MB/s per drive during the dd):
                             extended device statistics
            device    r/s    w/s   Mr/s   Mw/s wait actv  svc_t  %w  %b
            sd2      39.0    0.0   24.0    0.0  0.0  1.0   26.9   0  24
            sd3      46.0    0.0   24.0    0.0  0.0  0.8   17.4   0  19
            sd4      39.0    0.0   23.9    0.0  0.0  1.3   34.5   0  30
            sd5      47.0    0.0   24.8    0.0  0.0  1.5   32.1   0  32
                             extended device statistics
            device    r/s    w/s   Mr/s   Mw/s wait actv  svc_t  %w  %b
            sd2      62.0    0.0   33.4    0.0  0.0  1.4   22.7   0  33
            sd3      62.0    0.0   30.8    0.0  0.0  1.4   22.3   0  30
            sd4      54.0    0.0   31.2    0.0  0.0  1.8   33.1   0  39
            sd5      54.0    0.0   32.1    0.0  0.0  1.6   30.0   0  39
                             extended device statistics
            device    r/s    w/s   Mr/s   Mw/s wait actv  svc_t  %w  %b
            sd2      47.0    0.0   24.0    0.0  0.0  0.8   17.4   0  23
            sd3      45.0    0.0   24.1    0.0  0.0  1.2   26.3   0  26
            sd4      50.0    0.0   26.4    0.0  0.0  1.6   31.5   0  34
            sd5      48.0    0.0   22.9    0.0  0.0  1.1   23.5   0  30
                             extended device statistics
            device    r/s    w/s   Mr/s   Mw/s wait actv  svc_t  %w  %b
            sd2      59.0    0.0   32.0    0.0  0.0  1.2   20.7   0  31
            sd3      58.0    0.0   32.6    0.0  0.0  1.3   22.3   0  31
            sd4      52.0    0.0   31.9    0.0  0.0  1.4   27.8   0  37
            sd5      65.0    0.0   32.1    0.0  0.0  2.0   31.5   0  43
            
            zfs get all tank/backup/scratch
            NAME                 PROPERTY              VALUE                  SOURCE
            tank/backup/scratch  aclinherit            restricted             default
            tank/backup/scratch  aclmode               discard                default
            tank/backup/scratch  atime                 on                     default
            tank/backup/scratch  available             97.8G                  -
            tank/backup/scratch  canmount              on                     default
            tank/backup/scratch  casesensitivity       mixed                  -
            tank/backup/scratch  checksum              on                     default
            tank/backup/scratch  compression           off                    local
            tank/backup/scratch  compressratio         1.00x                  -
            tank/backup/scratch  copies                1                      default
            tank/backup/scratch  creation              Tue Apr  9 14:26 2013  -
            tank/backup/scratch  dedup                 off                    default
            tank/backup/scratch  devices               on                     default
            tank/backup/scratch  encryption            off                    -
            tank/backup/scratch  exec                  on                     default
            tank/backup/scratch  keychangedate         -                      default
            tank/backup/scratch  keysource             none                   default
            tank/backup/scratch  keystatus             none                   -
            tank/backup/scratch  logbias               throughput             inherited from tank
            tank/backup/scratch  mlslabel              none                   -
            tank/backup/scratch  mounted               yes                    -
            tank/backup/scratch  mountpoint            /tank/backup/scratch   default
            tank/backup/scratch  multilevel            off                    -
            tank/backup/scratch  nbmand                off                    default
            tank/backup/scratch  normalization         none                   -
            tank/backup/scratch  primarycache          all                    default
            tank/backup/scratch  quota                 none                   default
            tank/backup/scratch  readonly              off                    inherited from tank
            tank/backup/scratch  recordsize            128K                   default
            tank/backup/scratch  referenced            1.03T                  -
            tank/backup/scratch  refquota              none                   default
            tank/backup/scratch  refreservation        none                   default
            tank/backup/scratch  rekeydate             -                      default
            tank/backup/scratch  reservation           none                   default
            tank/backup/scratch  rstchown              on                     default
            tank/backup/scratch  secondarycache        all                    default
            tank/backup/scratch  setuid                on                     default
            tank/backup/scratch  shadow                none                   -
            tank/backup/scratch  share.*               ...                    default
            tank/backup/scratch  snapdir               hidden                 default
            tank/backup/scratch  sync                  standard               default
            tank/backup/scratch  type                  filesystem             -
            tank/backup/scratch  used                  1.03T                  -
            tank/backup/scratch  usedbychildren        0                      -
            tank/backup/scratch  usedbydataset         1.03T                  -
            tank/backup/scratch  usedbyrefreservation  0                      -
            tank/backup/scratch  usedbysnapshots       0                      -
            tank/backup/scratch  utf8only              off                    -
            tank/backup/scratch  version               6                      -
            tank/backup/scratch  vscan                 off                    default
            tank/backup/scratch  xattr                 on                     default
            tank/backup/scratch  zoned                 off                    default
            • 3. Re: ZFS Read performance vs writes
              Nik
              Hi.

              By default, dd uses a block size of 512 bytes, which is very small and slow.
              Try adding bs=128k to the dd command:

              dd if=/tank/backup/scratch/backup/rsync\@zfs-auto-snap_weekly-2013-04-13-2302.gz of=/dev/null bs=128k
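
              (To compare, you could also try a couple of larger block sizes against the same file, for example:)

              dd if=/tank/backup/scratch/backup/rsync\@zfs-auto-snap_weekly-2013-04-13-2302.gz of=/dev/null bs=512k
              dd if=/tank/backup/scratch/backup/rsync\@zfs-auto-snap_weekly-2013-04-13-2302.gz of=/dev/null bs=1024k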


              And show the results.

              Regards.
              • 4. Re: ZFS Read performance vs writes
                933584
                Thanks. It seems to have improved things a little, but it's still less than the write speed and half the actual read potential of the drives. dd mirrors what we see speed-wise when tarring to the tape.

                I'll try some more real-world tests, like tarballing to /dev/null and timing it, roughly along the lines of the command below.
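
                (Probably something along these lines; the directory is just an example, and dd's record count plus the elapsed time gives the effective MB/s:)

                time tar cf - /tank/backup/scratch/backup | dd of=/dev/null bs=128k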

                Here is the run with the higher block size:
                dd if=/tank/backup/scratch/backup/rsync\@zfs-auto-snap_weekly-2013-04-13-2302.gz of=/dev/null bs=128k
                
                device    r/s    w/s   Mr/s   Mw/s wait actv  svc_t  %w  %b
                sd2     147.9    0.0   72.1    0.0  2.5  0.9   22.8  82  88
                sd3     106.9    0.0   69.9    0.0  4.4  1.0   50.4  99  99
                sd4     146.9    0.0   72.6    0.0  2.8  0.9   25.1  87  93
                sd5     168.9    0.0   70.1    0.0  3.5  0.9   26.0  87  93
                                 extended device statistics
                device    r/s    w/s   Mr/s   Mw/s wait actv  svc_t  %w  %b
                sd2     133.1    0.0   67.8    0.0  2.2  0.9   23.4  79  91
                sd3     163.1    0.0   71.7    0.0  2.2  0.9   19.0  79  86
                sd4     162.1    0.0   68.2    0.0  2.7  0.9   22.3  83  89
                sd5     155.1    0.0   66.5    0.0  2.8  0.9   23.9  85  94
                                 extended device statistics
                device    r/s    w/s   Mr/s   Mw/s wait actv  svc_t  %w  %b
                sd2     150.0    0.0   60.2    0.0  2.2  0.8   19.9  74  83
                sd3     136.0    0.0   59.7    0.0  1.8  0.8   19.3  68  78
                sd4     152.0    0.0   61.4    0.0  2.1  0.8   19.1  72  79
                sd5     145.0    0.0   62.5    0.0  3.3  1.0   29.7  95  99
                                 extended device statistics
                device    r/s    w/s   Mr/s   Mw/s wait actv  svc_t  %w  %b
                sd2     146.0    0.0   69.5    0.0  2.1  0.9   20.6  76  88
                sd3     154.0    0.0   68.8    0.0  2.9  0.9   24.9  88  92
                sd4     151.0    0.0   69.8    0.0  2.6  0.9   23.2  84  90
                sd5     165.0    0.0   69.6    0.0  3.2  0.9   24.9  88  92
                                 extended device statistics
                device    r/s    w/s   Mr/s   Mw/s wait actv  svc_t  %w  %b
                sd2     154.0    0.0   70.2    0.0  2.3  0.9   20.3  78  86
                sd3     156.0    0.0   70.9    0.0  2.2  0.8   19.2  79  82
                sd4     131.0    0.0   70.1    0.0  2.0  0.9   21.9  75  86
                sd5     156.0    0.0   70.2    0.0  3.4  1.0   27.9  95  96
                Here is the write performance to the array. We had two concurrent dds, each reading from an individual drive in rpool and dumping to a file in tank (roughly the commands sketched below).
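
                (Roughly like this, run in parallel; the raw device paths and output file names are illustrative, not the exact ones used:)

                dd if=/dev/rdsk/c8t0d0s0 of=/tank/backup/scratch/ddtest1 bs=128k &
                dd if=/dev/rdsk/c8t1d0s0 of=/tank/backup/scratch/ddtest2 bs=128k &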

                As you can see, write performance is much higher.
                                 extended device statistics
                device    r/s    w/s   Mr/s   Mw/s wait actv  svc_t  %w  %b
                sd2       0.0  152.0    0.0  115.6  9.0  1.0   65.7 100  99
                sd3       0.0  155.0    0.0  119.3  9.0  1.0   64.4 100  99
                sd4       0.0  146.0    0.0  146.0  9.0  1.0   68.4 100  99
                sd5       0.0  153.0    0.0  153.0  9.0  1.0   65.3 100  99
                                 extended device statistics
                device    r/s    w/s   Mr/s   Mw/s wait actv  svc_t  %w  %b
                sd2       3.0  146.0    0.0   82.3  7.2  0.9   54.2  90  90
                sd3       6.0  142.0    0.0   82.3  6.1  1.8   53.2  67  91
                sd4       6.0  101.0    0.0   61.8  3.7  0.4   38.6  44  45
                sd5      10.0  105.0    0.1   61.6  3.6  0.5   35.0  47  46
                                 extended device statistics
                device    r/s    w/s   Mr/s   Mw/s wait actv  svc_t  %w  %b
                sd2       1.0   96.9    0.0   72.6  4.1  0.5   46.5  50  46
                sd3       2.0   92.9    0.0   69.6  4.4  0.5   51.2  89  47
                sd4       1.0   76.9    0.0   63.9  4.0  0.4   57.2  45  44
                sd5       5.0   78.9    0.0   66.2  4.0  0.4   53.6  48  45
                                 extended device statistics
                device    r/s    w/s   Mr/s   Mw/s wait actv  svc_t  %w  %b
                sd2       0.0  125.2    0.0  108.1  9.0  1.0   79.8 100  99
                sd3       0.0  134.2    0.0  116.6  9.0  1.0   74.4 100  99
                sd4       0.0  179.3    0.0  159.2  9.0  1.0   55.7 100  99
                sd5       0.0  179.3    0.0  158.2  9.0  1.0   55.7 100  99
                • 5. Re: ZFS Read performance vs writes
                  Nik
                  Hi.

                  According to the provided information, pool read and write performance actually look about the same.

                  For write: ~300 MB/s (150 MB/s x 2); the other two disks write the same mirrored copy of that data, so they do not add throughput.
                  For read: ~280 MB/s (70 MB/s x 4).


                  You can also use:
                  zpool iostat 5
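
                  (For a per-vdev/per-disk breakdown, you can also try something like:)

                  zpool iostat -v tank 5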

                  For writes, ZFS lays data out in contiguous blocks (sequential records), so they stream well.
                  If the data was built up through incremental snapshots, reading it back may turn into random reads and decrease performance.

                  Regards.
                  • 6. Re: ZFS Read performance vs writes
                    933584
                     The write test was being fed by only two drives, so it was probably not running at its full potential. Here are filebench results, again showing reads at easily half the speed of writes.

                     Note: the columns don't line up well; the first four columns are per-drive write speeds (Mw/s) and the last four are read speeds (Mr/s).
                    Filebench Version 1.4.9.1
                    12917: 0.000: Allocated 126MB of shared memory
                    filebench> load singlestreamread
                    12917: 15.861: Single Stream Read Version 3.0 personality successfully loaded
                    12917: 15.861: Usage: set $dir=<dir>
                    12917: 15.861:        set $filesize=<size>    defaults to 5368709120
                    12917: 15.861:        set $nthreads=<value>   defaults to 1
                    12917: 15.861:        set $iosize=<value> defaults to 1048576
                    12917: 15.861:        run runtime (e.g. run 60)
                    12917: 15.861: This workload needs 5368709120 of disk space by default
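
                     (For reference, a singlestreamread run is normally pointed at a directory and started roughly like this; the directory and runtime here are placeholders, not the exact values used:)

                     filebench> set $dir=/tank/backup/scratch
                     filebench> run 60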
                    
                    sd2 Mw/s  sd3 Mw/s  sd4 Mw/s  sd5 Mw/s  sd2 Mr/s  sd3 Mr/s  sd4 Mr/s  sd5 Mr/s
                    0     0     0     0     0     0     0     0
                    23.6     0     0     0     0     0     0     0
                    132.7     58.2     57.8     61.8     0     0     0     0
                    148.7     118.2     118.6     114.6     0     0     0     0
                    91.7     137     144.5     130.1     0     0     0     0
                    132.7     83.2     73.6     87     0     0     0     0
                    108.8     129     178     157     0     0     0     0
                    55.6     113.4     95.4     117.4     0     0     0     0
                    61     55.1     25.3     22.7     0     0.3     0.2     0.2
                    76.3     62.1     186.5     183.8     0     0     0     0
                    1     76.2     50.9     50.6     0     0     0     0
                    0     0     81.5     89.8     0     0     0     0
                    2.3     0     53.3     50.8     0.1     0     0     0
                    119.8     2.3     6.8     8.8     0     0.1     0     0
                    107.6     117.7     144.1     142.6     0     0     0     0
                    47.3     97.9     123.1     124.3     0     0     0     0
                    136.5     59.1     7     7.4     0     0     0     0.1
                    95.4     104.8     92.7     92.3     0     0     0     0
                    5.7     105.9     66.4     64.8     0     0     0     0
                    23.4     27     78.7     78.2     0     0     0     0
                    119.1     20.5     24.1     24.7     0     0     0     0
                    96.9     93.5     129.2     123.5     0     0     0     0
                    53.8     122.6     101.6     97.9     0     0     0     0
                    143.7     56.5     39     50.2     0     0     0     0
                    105     143.1     146.7     142.1     0     0     0     0
                    106.9     101.9     110.7     107     0     0     0     0
                    66.2     107.9     105.9     104.9     0     0     0     0
                    0.6     69     58     67.3     0     0     0     0
                    0.2     0.6     0.4     0.4     0     0     0     0
                    0     0.2     0.5     0.5     0     0     0     0
                    0     0     0     0     0     0     0     0
                    0     0     0     0     0     0     0     0
                    0     0     0     0     60.1     0     0     0
                    0     0     0     0     96.2     60.9     58.1     59.8
                    0     0     0     0     73.3     88.6     98.4     94.9
                    0     0     0     0     71.1     70.8     72.1     72.3
                    0     0     0     0     42.2     64.1     68.3     66.8
                    0.1     0     0     0     31.3     54.5     54.9     51.5
                    0.4     0.1     0     0     3.4     29.8     56.8     62.9
                    0     0.4     0.8     0.8     60.9     4.9     22.5     22.2
                    0     0     0     0     71.7     56.7     46.4     51.4
                    0     0     0     0     68.8     56.6     88.9     87.5
                    0     0     0     0     54     45.1     90     90.4
                    0     0     0     0     40.1     44.5     49.4     46.8
                    0     0     0     0     31.3     41.5     42     42.5
                    0     0     0     0     20.1     34.8     23.1     22.8
                    0     0     0     0     16.3     21.7     8.7     10.1
                    0     0     0     0     28.1     19     13     9.9
                    0     0     0     0     66.6     22.4     11.6     16.1
                    0     0     0     0     48.9     70.9     60.6     67.6
                    0     0     0     0     20.6     44     41     39.4
                    0     0     0     0     18     19.1     21.1     21.1
                    0     0     0     0     35.6     19.2     19.6     21.1
                    0     0     0     0     59.7     34.4     37.3     39.3
                    0     0     0     0     77.6     58.6     63.4     59.6
                    0     0     0     0     65.1     77.9     73.5     77.9
                    0     0     0     0     52     65.6     58.6     72
                    0     0     0     0     28.7     51.9     51.4     47.9
                    0     0     0     0     53.4     28.7     29.7     33.8
                    0     0     0     0     77.2     53.9     50.4     54.8
                    0     0     0     0     48.3     75.3     74.8     76.7
                    0     0     0     0     33.5     52     57.3     50.2
                    0     0     0     0     42.5     30.5     30.6     30.6
                    0     0     0     0     61.6     35.8     41.4     34.8
                    0     0     0     0     66.9     62.1     71.9     67.3
                    0     0     0     0     57.5     77.4     56.8     56.2
                    0     0     0     0     42.5     63.6     49.5     36.7
                    0     0     0     0     47.9     55.4     61.3     65.6
                    0     0     0     0     41     42.5     60.5     56.4
                    0     0     0     0     23.9     25.2     47     48.4
                    0     0     0     0     22.4     17     32.6     27.9
                    0     0     0     0     69     17.6     20.4     27.1
                    0     0     0     0     34     60.9     55.4     51.4
                    0     0     0     0     22.3     26.8     25.1     23.8
                    0     0     0     0     46.9     24.2     20.8     19.4
                    0     0     0     0     52.9     43.7     38     40.1
                    0     0     0     0     69.7     61.7     75.6     60.2
                    0     0     0     0     45.1     76.8     49.3     47.5
                    0     0     0     0     51.7     46.6     29.7     30.8
                    0     0     0     0     44.5     54     24.6     22.6
                    0     0     0     0     26     50.9     40.8     44.3
                    0     0     0     0     62.1     34.8     34.8     36.9
                    0     0     0     0     70.9     55     55.7     55.7
                    0     0     0     0     57.1     56     67     57.4
                    0     0     0     0     69.4     42.5     70.5     77.3
                    0     0     0     0     48.9     37.1     81.3     78.6
                    0     0     0     0     31.9     44.2     65.9     59.7
                    0     0     0     0     25.1     47.9     47.1     54
                    0     0     0     0     36.3     30.6     38.1     33
                    0     0     0     0     63.7     48.9     29.9     47.2
                    0     0     0     0     36.3     50.5     54.1     52.2
                    0     0     0     0     48.6     34.9     26.4     28.8
                    0     0     0     0     18.8     45.7     46.5     53.7
                    0     0     0     0     0.5     13.5     12.7     21.8
                    0     0     0     0     1.1     0.5     0.4     0.6
                    0     0     0     0     0.8     1.2     0.8     1.3
                    0     0     0     0     0.7     0.6     0.5     0.6
                    0     0     0     0     0.9     0.6     0.6     0.7
                    0     0     0     0     0.4     0.7     0.7     0.8
                    0     0     0     0     0.2     0.4     0.5     0.5
                    0     0     0     0     0.1     0.2     0.3     0.2
                    0     0     0     0     0     0.1     0.2     0.1
                    0     0     0     0     0     0     0     0
                    0     0     0     0     0.4     0     0     0
                         0     0     0          0.1     0.5     0.5
                    • 7. Re: ZFS Read performance vs writes
                      933584
                       Ahh, I see what you are getting at. Yeah, I guess my colleagues and I are just surprised that read performance from a striped mirror is only slightly better than a single drive.

                      Thanks for taking the time to look over the data!