
ZFS Read performance vs write

933584 Newbie
We noticed an issue with read performance while trying to keep our LTO5 tape drive fed at the speed the vendor recommends (100 MB/s). Our average speed was about 90 MB/s and it bounced around a lot. We have 6 Gb/s 10K drives in a 4-drive RAID 0+1 (striped mirror). Each drive should be able to sustain well over 90 MB/s, and when reading from a RAID 10 of those drives we expected about 400 MB/s.

The odd part is that we can write to the drives at around 370 MB/s, which is about right, and writes are normally the slower direction. So I am confused why reads are stuck below 100 MB/s.

We watched the iostat of the drives and noticed that when writing a file, all four drives light up to their max write speed, approximately 160 to 190 MB/s, giving us a combined speed of around 370 MB/s. However, when reading a file, iostat shows only one drive at a time being accessed, at a slow rate. Given that we are striped across two mirrors, you would think at least two drives would be lit at minimum, yet it seems like it reads a little from drive 1, then a little from drive 2, etc.

In testing we disabled every attribute we could think of in ZFS to increase speed: checksum, log, compression, etc. None of them increased read speed.
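
(For anyone curious, the kind of commands we ran were roughly like the following; these are illustrative only, and not every setting was left in this state:)

# illustrative only -- dataset properties we toggled while testing
zfs set compression=off tank/backup/scratch
zfs set checksum=off tank/backup/scratch
zfs set logbias=throughput tank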

This is not a slow machine: Core i7 with 64 GB of RAM. In testing on several other Solaris machines, all 11.1 boxes exhibited this issue. (I don't have a non-11.1 box to test.)

Anyone else seen this behavior?

Thanks,
Tom
  • 1. Re: ZFS Read performance vs write
    Nik Expert
    Hi.


    Please show:

    zpool status

    What command do you use for the read?
    Also show iostat -xnz 2 4 while reading the data.

    Regards.
  • 2. Re: ZFS Read performance vs write
    933584 Newbie
    Here are the reads; I'll post the write speeds after this. This is the backup server we are struggling to feed the LTO5 at a proper speed. It does not show the "rippled read" pattern of the other servers, but each drive still reads very slowly.
    zpool status
      pool: rpool
    state: ONLINE
      scan: resilvered 24.7M in 0h0m with 0 errors on Thu Mar 14 19:13:47 2013
    config:
    
            NAME        STATE     READ WRITE CKSUM
            rpool       ONLINE       0     0     0
              mirror-0  ONLINE       0     0     0
                c8t0d0  ONLINE       0     0     0
                c8t1d0  ONLINE       0     0     0
    
    errors: No known data errors
    
      pool: tank
    state: ONLINE
      scan: none requested
    config:
    
            NAME        STATE     READ WRITE CKSUM
            tank        ONLINE       0     0     0
              mirror-0  ONLINE       0     0     0
                c8t2d0  ONLINE       0     0     0
                c8t3d0  ONLINE       0     0     0
              mirror-1  ONLINE       0     0     0
                c8t4d0  ONLINE       0     0     0
                c8t5d0  ONLINE       0     0     0
    Read command, reading a 424GB file:
    dd if=/tank/backup/scratch/backup/rsync\@zfs-auto-snap_weekly-2013-04-13-2302.gz of=/dev/null
    The dd above completes at about 105 MB/s.

    dd straight from one drive reads at 180 MB/s.
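
    (A raw read from a single disk can be checked with something like the following; the device path and count here are just examples:)

    # example only -- raw sequential read from one disk; device path is illustrative
    dd if=/dev/rdsk/c8t2d0s0 of=/dev/null bs=128k count=40000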

    A sample of iostat -Mx sd1 sd2 sd3 sd4 sd5 1 during the dd, showing roughly 25-30 MB/s per drive:
                     extended device statistics
    device    r/s    w/s   Mr/s   Mw/s wait actv  svc_t  %w  %b
    sd2      39.0    0.0   24.0    0.0  0.0  1.0   26.9   0  24
    sd3      46.0    0.0   24.0    0.0  0.0  0.8   17.4   0  19
    sd4      39.0    0.0   23.9    0.0  0.0  1.3   34.5   0  30
    sd5      47.0    0.0   24.8    0.0  0.0  1.5   32.1   0  32
                     extended device statistics
    device    r/s    w/s   Mr/s   Mw/s wait actv  svc_t  %w  %b
    sd2      62.0    0.0   33.4    0.0  0.0  1.4   22.7   0  33
    sd3      62.0    0.0   30.8    0.0  0.0  1.4   22.3   0  30
    sd4      54.0    0.0   31.2    0.0  0.0  1.8   33.1   0  39
    sd5      54.0    0.0   32.1    0.0  0.0  1.6   30.0   0  39
                     extended device statistics
    device    r/s    w/s   Mr/s   Mw/s wait actv  svc_t  %w  %b
    sd2      47.0    0.0   24.0    0.0  0.0  0.8   17.4   0  23
    sd3      45.0    0.0   24.1    0.0  0.0  1.2   26.3   0  26
    sd4      50.0    0.0   26.4    0.0  0.0  1.6   31.5   0  34
    sd5      48.0    0.0   22.9    0.0  0.0  1.1   23.5   0  30
                     extended device statistics
    device    r/s    w/s   Mr/s   Mw/s wait actv  svc_t  %w  %b
    sd2      59.0    0.0   32.0    0.0  0.0  1.2   20.7   0  31
    sd3      58.0    0.0   32.6    0.0  0.0  1.3   22.3   0  31
    sd4      52.0    0.0   31.9    0.0  0.0  1.4   27.8   0  37
    sd5      65.0    0.0   32.1    0.0  0.0  2.0   31.5   0  43
    
    zfs get all tank/backup/scratch
    NAME                 PROPERTY              VALUE                  SOURCE
    tank/backup/scratch  aclinherit            restricted             default
    tank/backup/scratch  aclmode               discard                default
    tank/backup/scratch  atime                 on                     default
    tank/backup/scratch  available             97.8G                  -
    tank/backup/scratch  canmount              on                     default
    tank/backup/scratch  casesensitivity       mixed                  -
    tank/backup/scratch  checksum              on                     default
    tank/backup/scratch  compression           off                    local
    tank/backup/scratch  compressratio         1.00x                  -
    tank/backup/scratch  copies                1                      default
    tank/backup/scratch  creation              Tue Apr  9 14:26 2013  -
    tank/backup/scratch  dedup                 off                    default
    tank/backup/scratch  devices               on                     default
    tank/backup/scratch  encryption            off                    -
    tank/backup/scratch  exec                  on                     default
    tank/backup/scratch  keychangedate         -                      default
    tank/backup/scratch  keysource             none                   default
    tank/backup/scratch  keystatus             none                   -
    tank/backup/scratch  logbias               throughput             inherited from tank
    tank/backup/scratch  mlslabel              none                   -
    tank/backup/scratch  mounted               yes                    -
    tank/backup/scratch  mountpoint            /tank/backup/scratch   default
    tank/backup/scratch  multilevel            off                    -
    tank/backup/scratch  nbmand                off                    default
    tank/backup/scratch  normalization         none                   -
    tank/backup/scratch  primarycache          all                    default
    tank/backup/scratch  quota                 none                   default
    tank/backup/scratch  readonly              off                    inherited from tank
    tank/backup/scratch  recordsize            128K                   default
    tank/backup/scratch  referenced            1.03T                  -
    tank/backup/scratch  refquota              none                   default
    tank/backup/scratch  refreservation        none                   default
    tank/backup/scratch  rekeydate             -                      default
    tank/backup/scratch  reservation           none                   default
    tank/backup/scratch  rstchown              on                     default
    tank/backup/scratch  secondarycache        all                    default
    tank/backup/scratch  setuid                on                     default
    tank/backup/scratch  shadow                none                   -
    tank/backup/scratch  share.*               ...                    default
    tank/backup/scratch  snapdir               hidden                 default
    tank/backup/scratch  sync                  standard               default
    tank/backup/scratch  type                  filesystem             -
    tank/backup/scratch  used                  1.03T                  -
    tank/backup/scratch  usedbychildren        0                      -
    tank/backup/scratch  usedbydataset         1.03T                  -
    tank/backup/scratch  usedbyrefreservation  0                      -
    tank/backup/scratch  usedbysnapshots       0                      -
    tank/backup/scratch  utf8only              off                    -
    tank/backup/scratch  version               6                      -
    tank/backup/scratch  vscan                 off                    default
    tank/backup/scratch  xattr                 on                     default
    tank/backup/scratch  zoned                 off                    default
  • 3. Re: ZFS Read performance vs writes
    Nik Expert
    Hi.

    By default dd uses a block size of 512 bytes, which is very small and slow.
    Try adding the parameter bs=128k to dd:

    dd if=/tank/backup/scratch/backup/rsync\@zfs-auto-snap_weekly-2013-04-13-2302.gz of=/dev/null bs=128k
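
    A larger block size such as bs=1024k may also be worth trying, for example:

    dd if=/tank/backup/scratch/backup/rsync\@zfs-auto-snap_weekly-2013-04-13-2302.gz of=/dev/null bs=1024k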


    Then show the results.

    Regards.
  • 4. Re: ZFS Read performance vs writes
    933584 Newbie
    Thanks. It seems to have improved it a little, but it's still less than the write speed and half the actual read potential of the drives. dd is mirroring what we see speed-wise when tarring to the tape.

    I'll try some more real-world tests like tarballing to /dev/null and timing it.

    Here is the run with the higher block size:
    dd if=/tank/backup/scratch/backup/rsync\@zfs-auto-snap_weekly-2013-04-13-2302.gz of=/dev/null bs=128k
    
    device    r/s    w/s   Mr/s   Mw/s wait actv  svc_t  %w  %b
    sd2     147.9    0.0   72.1    0.0  2.5  0.9   22.8  82  88
    sd3     106.9    0.0   69.9    0.0  4.4  1.0   50.4  99  99
    sd4     146.9    0.0   72.6    0.0  2.8  0.9   25.1  87  93
    sd5     168.9    0.0   70.1    0.0  3.5  0.9   26.0  87  93
                     extended device statistics
    device    r/s    w/s   Mr/s   Mw/s wait actv  svc_t  %w  %b
    sd2     133.1    0.0   67.8    0.0  2.2  0.9   23.4  79  91
    sd3     163.1    0.0   71.7    0.0  2.2  0.9   19.0  79  86
    sd4     162.1    0.0   68.2    0.0  2.7  0.9   22.3  83  89
    sd5     155.1    0.0   66.5    0.0  2.8  0.9   23.9  85  94
                     extended device statistics
    device    r/s    w/s   Mr/s   Mw/s wait actv  svc_t  %w  %b
    sd2     150.0    0.0   60.2    0.0  2.2  0.8   19.9  74  83
    sd3     136.0    0.0   59.7    0.0  1.8  0.8   19.3  68  78
    sd4     152.0    0.0   61.4    0.0  2.1  0.8   19.1  72  79
    sd5     145.0    0.0   62.5    0.0  3.3  1.0   29.7  95  99
                     extended device statistics
    device    r/s    w/s   Mr/s   Mw/s wait actv  svc_t  %w  %b
    sd2     146.0    0.0   69.5    0.0  2.1  0.9   20.6  76  88
    sd3     154.0    0.0   68.8    0.0  2.9  0.9   24.9  88  92
    sd4     151.0    0.0   69.8    0.0  2.6  0.9   23.2  84  90
    sd5     165.0    0.0   69.6    0.0  3.2  0.9   24.9  88  92
                     extended device statistics
    device    r/s    w/s   Mr/s   Mw/s wait actv  svc_t  %w  %b
    sd2     154.0    0.0   70.2    0.0  2.3  0.9   20.3  78  86
    sd3     156.0    0.0   70.9    0.0  2.2  0.8   19.2  79  82
    sd4     131.0    0.0   70.1    0.0  2.0  0.9   21.9  75  86
    sd5     156.0    0.0   70.2    0.0  3.4  1.0   27.9  95  96
    Here is the write performance to the array. We had two concurrent dd processes, each reading from one of the drives in rpool and dumping to files in tank.

    As you can see, write performance is much higher.
                     extended device statistics
    device    r/s    w/s   Mr/s   Mw/s wait actv  svc_t  %w  %b
    sd2       0.0  152.0    0.0  115.6  9.0  1.0   65.7 100  99
    sd3       0.0  155.0    0.0  119.3  9.0  1.0   64.4 100  99
    sd4       0.0  146.0    0.0  146.0  9.0  1.0   68.4 100  99
    sd5       0.0  153.0    0.0  153.0  9.0  1.0   65.3 100  99
                     extended device statistics
    device    r/s    w/s   Mr/s   Mw/s wait actv  svc_t  %w  %b
    sd2       3.0  146.0    0.0   82.3  7.2  0.9   54.2  90  90
    sd3       6.0  142.0    0.0   82.3  6.1  1.8   53.2  67  91
    sd4       6.0  101.0    0.0   61.8  3.7  0.4   38.6  44  45
    sd5      10.0  105.0    0.1   61.6  3.6  0.5   35.0  47  46
                     extended device statistics
    device    r/s    w/s   Mr/s   Mw/s wait actv  svc_t  %w  %b
    sd2       1.0   96.9    0.0   72.6  4.1  0.5   46.5  50  46
    sd3       2.0   92.9    0.0   69.6  4.4  0.5   51.2  89  47
    sd4       1.0   76.9    0.0   63.9  4.0  0.4   57.2  45  44
    sd5       5.0   78.9    0.0   66.2  4.0  0.4   53.6  48  45
                     extended device statistics
    device    r/s    w/s   Mr/s   Mw/s wait actv  svc_t  %w  %b
    sd2       0.0  125.2    0.0  108.1  9.0  1.0   79.8 100  99
    sd3       0.0  134.2    0.0  116.6  9.0  1.0   74.4 100  99
    sd4       0.0  179.3    0.0  159.2  9.0  1.0   55.7 100  99
    sd5       0.0  179.3    0.0  158.2  9.0  1.0   55.7 100  99
  • 5. Re: ZFS Read performance vs writes
    Nik Expert
    Hi.

    According to the provided information, pool performance for read and write looks about the same.

    For write: ~300 MB/s (150 * 2; the other two disks write the same mirrored data)
    For read: ~280 MB/s (70 * 4)


    You can also watch pool-level throughput with:

    zpool iostat 5

    For writes, ZFS uses contiguous blocks (sequential layout).
    If you use incremental snapshots, the data may end up fragmented, so reads become random and performance drops.
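
    To check whether snapshots are involved for that dataset, you can list them, for example:

    zfs list -t snapshot -r tank/backup/scratch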

    Regards.
  • 6. Re: ZFS Read performance vs writes
    933584 Newbie
    The write was being fed by only two drives, so it was probably not running at its full potential. Here are filebench results, again showing reads being easily half the speed of writes.

    Note: the columns don't line up well; the first four columns are write throughput (Mw/s) and the last four are read throughput (Mr/s).
    Filebench Version 1.4.9.1
    12917: 0.000: Allocated 126MB of shared memory
    filebench> load singlestreamread
    12917: 15.861: Single Stream Read Version 3.0 personality successfully loaded
    12917: 15.861: Usage: set $dir=<dir>
    12917: 15.861:        set $filesize=<size>    defaults to 5368709120
    12917: 15.861:        set $nthreads=<value>   defaults to 1
    12917: 15.861:        set $iosize=<value> defaults to 1048576
    12917: 15.861:        run runtime (e.g. run 60)
    12917: 15.861: This workload needs 5368709120 of disk space by default
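
    (The run itself was presumably along these lines; the target directory and runtime below are assumptions, not copied from the output:)

    filebench> set $dir=/tank/backup/scratch
    filebench> run 60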
    
    sd2 Mw/s  sd3 Mw/s  sd4 Mw/s  sd5 Mw/s  sd2 Mr/s  sd3 Mr/s  sd4 Mr/s  sd5 Mr/s
    0     0     0     0     0     0     0     0
    23.6     0     0     0     0     0     0     0
    132.7     58.2     57.8     61.8     0     0     0     0
    148.7     118.2     118.6     114.6     0     0     0     0
    91.7     137     144.5     130.1     0     0     0     0
    132.7     83.2     73.6     87     0     0     0     0
    108.8     129     178     157     0     0     0     0
    55.6     113.4     95.4     117.4     0     0     0     0
    61     55.1     25.3     22.7     0     0.3     0.2     0.2
    76.3     62.1     186.5     183.8     0     0     0     0
    1     76.2     50.9     50.6     0     0     0     0
    0     0     81.5     89.8     0     0     0     0
    2.3     0     53.3     50.8     0.1     0     0     0
    119.8     2.3     6.8     8.8     0     0.1     0     0
    107.6     117.7     144.1     142.6     0     0     0     0
    47.3     97.9     123.1     124.3     0     0     0     0
    136.5     59.1     7     7.4     0     0     0     0.1
    95.4     104.8     92.7     92.3     0     0     0     0
    5.7     105.9     66.4     64.8     0     0     0     0
    23.4     27     78.7     78.2     0     0     0     0
    119.1     20.5     24.1     24.7     0     0     0     0
    96.9     93.5     129.2     123.5     0     0     0     0
    53.8     122.6     101.6     97.9     0     0     0     0
    143.7     56.5     39     50.2     0     0     0     0
    105     143.1     146.7     142.1     0     0     0     0
    106.9     101.9     110.7     107     0     0     0     0
    66.2     107.9     105.9     104.9     0     0     0     0
    0.6     69     58     67.3     0     0     0     0
    0.2     0.6     0.4     0.4     0     0     0     0
    0     0.2     0.5     0.5     0     0     0     0
    0     0     0     0     0     0     0     0
    0     0     0     0     0     0     0     0
    0     0     0     0     60.1     0     0     0
    0     0     0     0     96.2     60.9     58.1     59.8
    0     0     0     0     73.3     88.6     98.4     94.9
    0     0     0     0     71.1     70.8     72.1     72.3
    0     0     0     0     42.2     64.1     68.3     66.8
    0.1     0     0     0     31.3     54.5     54.9     51.5
    0.4     0.1     0     0     3.4     29.8     56.8     62.9
    0     0.4     0.8     0.8     60.9     4.9     22.5     22.2
    0     0     0     0     71.7     56.7     46.4     51.4
    0     0     0     0     68.8     56.6     88.9     87.5
    0     0     0     0     54     45.1     90     90.4
    0     0     0     0     40.1     44.5     49.4     46.8
    0     0     0     0     31.3     41.5     42     42.5
    0     0     0     0     20.1     34.8     23.1     22.8
    0     0     0     0     16.3     21.7     8.7     10.1
    0     0     0     0     28.1     19     13     9.9
    0     0     0     0     66.6     22.4     11.6     16.1
    0     0     0     0     48.9     70.9     60.6     67.6
    0     0     0     0     20.6     44     41     39.4
    0     0     0     0     18     19.1     21.1     21.1
    0     0     0     0     35.6     19.2     19.6     21.1
    0     0     0     0     59.7     34.4     37.3     39.3
    0     0     0     0     77.6     58.6     63.4     59.6
    0     0     0     0     65.1     77.9     73.5     77.9
    0     0     0     0     52     65.6     58.6     72
    0     0     0     0     28.7     51.9     51.4     47.9
    0     0     0     0     53.4     28.7     29.7     33.8
    0     0     0     0     77.2     53.9     50.4     54.8
    0     0     0     0     48.3     75.3     74.8     76.7
    0     0     0     0     33.5     52     57.3     50.2
    0     0     0     0     42.5     30.5     30.6     30.6
    0     0     0     0     61.6     35.8     41.4     34.8
    0     0     0     0     66.9     62.1     71.9     67.3
    0     0     0     0     57.5     77.4     56.8     56.2
    0     0     0     0     42.5     63.6     49.5     36.7
    0     0     0     0     47.9     55.4     61.3     65.6
    0     0     0     0     41     42.5     60.5     56.4
    0     0     0     0     23.9     25.2     47     48.4
    0     0     0     0     22.4     17     32.6     27.9
    0     0     0     0     69     17.6     20.4     27.1
    0     0     0     0     34     60.9     55.4     51.4
    0     0     0     0     22.3     26.8     25.1     23.8
    0     0     0     0     46.9     24.2     20.8     19.4
    0     0     0     0     52.9     43.7     38     40.1
    0     0     0     0     69.7     61.7     75.6     60.2
    0     0     0     0     45.1     76.8     49.3     47.5
    0     0     0     0     51.7     46.6     29.7     30.8
    0     0     0     0     44.5     54     24.6     22.6
    0     0     0     0     26     50.9     40.8     44.3
    0     0     0     0     62.1     34.8     34.8     36.9
    0     0     0     0     70.9     55     55.7     55.7
    0     0     0     0     57.1     56     67     57.4
    0     0     0     0     69.4     42.5     70.5     77.3
    0     0     0     0     48.9     37.1     81.3     78.6
    0     0     0     0     31.9     44.2     65.9     59.7
    0     0     0     0     25.1     47.9     47.1     54
    0     0     0     0     36.3     30.6     38.1     33
    0     0     0     0     63.7     48.9     29.9     47.2
    0     0     0     0     36.3     50.5     54.1     52.2
    0     0     0     0     48.6     34.9     26.4     28.8
    0     0     0     0     18.8     45.7     46.5     53.7
    0     0     0     0     0.5     13.5     12.7     21.8
    0     0     0     0     1.1     0.5     0.4     0.6
    0     0     0     0     0.8     1.2     0.8     1.3
    0     0     0     0     0.7     0.6     0.5     0.6
    0     0     0     0     0.9     0.6     0.6     0.7
    0     0     0     0     0.4     0.7     0.7     0.8
    0     0     0     0     0.2     0.4     0.5     0.5
    0     0     0     0     0.1     0.2     0.3     0.2
    0     0     0     0     0     0.1     0.2     0.1
    0     0     0     0     0     0     0     0
    0     0     0     0     0.4     0     0     0
         0     0     0          0.1     0.5     0.5
  • 7. Re: ZFS Read performance vs writes
    933584 Newbie
    Ahh, I see what you are getting at. Yeah, I guess my colleagues and I are just surprised that read performance for a striped mirror is only slightly better than a single drive.

    Thanks for taking the time to look over the data!
