3 Replies Latest reply on May 17, 2018 12:10 PM by Scott S.

    reflink and performance

    DHelios

      Hi, I have a question about the reflink feature that appeared in a recent release. As I understand it, reflink is related to the Deduplication 2 feature. Here is an example:

      root@znstor:/dpool1/shares/fs1# dd if=/dev/urandom of=/dpool1/shares/fs1/file1 bs=10000000 count=1000
      0+1000 records in
      0+1000 records out
      root@znstor:/dpool1/shares/fs1# ls -lh
      total 167617
      -rw-r--r--   1 root     root        127M Mar  2 23:53 file1
      
      
      # look at zpool dedup property
      root@znstor:/dpool1/shares/fs1# zpool list dpool1
      NAME     SIZE  ALLOC   FREE  CAP  DEDUP  HEALTH  ALTROOT
      dpool1  31.8G   134M  31.6G   0%  1.00x  ONLINE  -
      
      

       

      root@znstor:/dpool1/shares/fs1# for i in {1..1000}; do cp -z file1 file1_${i};done
      
      root@znstor:/dpool1/shares/fs1# zpool list dpool1
      NAME     SIZE  ALLOC   FREE  CAP     DEDUP  HEALTH  ALTROOT
      dpool1  31.8G   269M  31.5G   0%  1001.00x  ONLINE  -
      root@znstor:/dpool1/shares/fs1#
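      For intuition on the 1001.00x figure above: zpool's DEDUP ratio is logical (referenced) data divided by physically allocated data, and the original file plus its 1000 clones all reference the same blocks. A rough back-of-the-envelope check (the 127M file size is taken from the ls output above):

```shell
# DEDUP ratio = copies referencing the blocks / physical copies stored.
copies=1001          # original file plus 1000 reflink clones
file_mib=127         # size of file1 as shown by ls -lh
logical=$((copies * file_mib))
echo "logical data: ${logical} MiB"
echo "physical data: ${file_mib} MiB -> ratio ${copies}.00x"
```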
      

       

      Based on previous experience: if I use a small record size, for example 8k, about 320 bytes of memory are needed to track each dedup block. When we run short of memory, we see performance degradation.

      I would appreciate it if somebody could post a link explaining how these features impact performance and memory allocation/usage.
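      A quick sketch of the memory math behind the 320-bytes-per-block concern (the per-entry size is the poster's figure, not an official constant):

```shell
# Estimate core memory needed for the DDT if every block is unique,
# assuming ~320 bytes of RAM per DDT entry (figure from the post above).
pool_bytes=$((1024 ** 4))            # 1 TiB of data
recordsize=$((8 * 1024))             # 8K records
entries=$((pool_bytes / recordsize))
ddt_mib=$((entries * 320 / 1024 / 1024))
echo "${entries} DDT entries -> ~${ddt_mib} MiB of RAM"
```

At an 8K record size, a terabyte of unique data already needs tens of gigabytes of DDT, which is why small record sizes plus limited memory hurt.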

        • 1. Re: reflink and performance
          Robert Milkowski

          Yes, it does seem to be using DDT.

          The main difference, I think, is that only writes to the cloned files go through the DDT; all other writes to the pool (assuming dedup=off) do not, which is a different scenario from enabling pool-wide dedup. Additionally, writing to a file requires its related metadata anyway, so any performance impact should be fairly low.

           

          The added benefit of reflink versus per-dataset dedup is that you don't actually have to read all the data blocks in order to clone a file (once the first clone has been created and the DDT is populated), nor write all the blocks again only to dedup them. So the performance overhead (both I/O and CPU) should be much lower with file-level clones than with pool-level dedup when cloning files.

           

          I'm just guessing here though.

          • 2. Re: reflink and performance
            Scott S.

            It would be good to run some tests. In theory, with a larger record size like 1M the DDT ratio (pool-wide dedup=on) would be lower, since the probability of a block being dedupable is lower; 64K, for example, would give a higher ratio. Of course you've also now got meta devices, which have just been added. So the DEDUP field can be misleading now that you have both file-based DDT and pool-wide DDT reported as a single pool-wide metric, while at the same time dedup is set as a per-filesystem property.
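            To make the record-size point concrete: for the same amount of dedupable data, a larger record size means far fewer DDT entries to track (and, as noted, a lower chance that any given block matches another). An illustrative comparison with hypothetical sizes:

```shell
# Number of DDT entries needed to track 100 GiB of data at
# different recordsize settings (illustrative arithmetic only).
data=$((100 * 1024 ** 3))
for rs in 8192 65536 1048576; do
  printf 'recordsize=%-8s -> %s DDT entries\n' "$rs" "$((data / rs))"
done
```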

             

            I've been wondering about measuring performance differences between DDT1 vs DDT2 myself.

             

            - "I'm just guessing here though." sounds more like educated guessing.

            • 3. Re: reflink and performance
              Scott S.

              Perhaps one way to observe this is by monitoring kstat or kstat2 metrics. Here is what I see that may be related (I have not enabled any dedup yet in this example):

               

              # kstat2 -p | grep ddt

              kstat:/misc/zfs/arcstats/0;ddt_bufs     0

              kstat:/misc/zfs/arcstats/0;ddt_hits     0

              kstat:/misc/zfs/arcstats/0;ddt_lsize    0

              kstat:/misc/zfs/arcstats/0;ddt_misses   0

              kstat:/misc/zfs/arcstats/0;ddt_raw_size 0

              kstat:/misc/zfs/arcstats/0;ddt_size     0

               

              # kstat2 -p | grep dedup

              kstat:/vm/unix/dedup;bytes_deduped      0

              kstat:/vm/unix/dedup;crtime     199023864401

              kstat:/vm/unix/dedup;dedup_page_failed[0]   0

              kstat:/vm/unix/dedup;dedup_page_failed[1]   0

              kstat:/vm/unix/dedup;dedup_page_failed[2]   0

              kstat:/vm/unix/dedup;dedup_pages_inuse[0]   0

              kstat:/vm/unix/dedup;dedup_pages_inuse[1]   0

              kstat:/vm/unix/dedup;dedup_pages_inuse[2]   0

              kstat:/vm/unix/dedup;dedup_pagesize[0]      0

              kstat:/vm/unix/dedup;dedup_pagesize[1]      0

              kstat:/vm/unix/dedup;dedup_pagesize[2]      0

              kstat:/vm/unix/dedup;hash_found[0]  0

              kstat:/vm/unix/dedup;hash_found[1]  0

              kstat:/vm/unix/dedup;hash_found[2]  0

              kstat:/vm/unix/dedup;hash_misses[0] 0

              kstat:/vm/unix/dedup;hash_misses[1] 0

              kstat:/vm/unix/dedup;hash_misses[2] 0

              kstat:/vm/unix/dedup;pages_hashed[0]        0

              kstat:/vm/unix/dedup;pages_hashed[1]        0

              kstat:/vm/unix/dedup;pages_hashed[2]        0

              kstat:/vm/unix/dedup;snaptime   18587472573621

              kstat:/vm/unix/dedup;vnode_AVL_hits     0

              kstat:/vm/unix/dedup;vnode_AVL_misses   0