This discussion is archived
10 Replies Latest reply: Dec 11, 2012 7:18 AM by KirkMcNeilFJ

How to count number of files on zfs filesystem

KirkMcNeilFJ Newbie
Hi all,
Is there a way to count the number of files on a zfs filesystem, similar to how "df -o i /ufs_filesystem" works? I am looking for a way to do this without using find, as I suspect there are millions of files on a particular zfs filesystem and that this is sometimes causing slow performance.

Thanks.
  • 1. Re: How to count number of files on zfs filesystem
    abrante Pro
    Good question!

    What about

    df -g

    .. or is that Solaris 11?

    .7/M.
  • 2. Re: How to count number of files on zfs filesystem
    cindys Pro
    Good question.

    I'm still doing some research, haven't found a good solution yet,
    but this seems to work:

    # cd /pond/cdata
    # find . -name "*" | wc -l
    1785

    I can't comment on the performance in a very large file system though.

    Thanks, Cindy
  • 3. Re: How to count number of files on zfs filesystem
    bigdelboy Pro
    Just to comment: I am not sure that having millions (or even tens of millions) of files on a zfs filesystem will in itself lead to bad performance.

    However, attempting to access a lot of files may lead to bad performance (e.g. attempting to run a file-based backup). And if the filesystem is overfull (e.g. 80%?) or there is little space for the zfs cache, then that might also cause performance problems. Simply trying to read too many times or too much, or getting contention with other activity, may also be a problem.

    I am of course open to comments ....
  • 4. Re: How to count number of files on zfs filesystem
    Nik Expert
    Hi.

    df -t <mount_point> shows the number of free i-nodes and the total number of i-nodes, so you can calculate how many i-nodes (files) are used.

    df -t <mount_point> | awk ' { if ( NR==1) F=$(NF-1) ; if ( NR==2) print $(NF-1) - F }'
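
For anyone following along, the one-liner above can be unpacked as below. This is only a sketch: the function name used_files is made up for illustration, and it assumes the two-line Solaris "df -t" layout that is posted later in this thread (free count on line 1, total on line 2, each as the second-to-last field). It reads the df output on stdin so it can be tried against pasted text.

```shell
#!/bin/sh
# used_files: hypothetical helper. Reads Solaris-style `df -t` output on stdin
# and prints "total files" minus "free files".
used_files() {
    awk '
        NR == 1 { free = $(NF - 1) }          # line 1: "... N blocks FREE files"
        NR == 2 { print $(NF - 1) - free }    # line 2: "total: N blocks TOTAL files"
    '
}

# Sample text copied from the `df -t /test` output posted later in this thread.
sample='/test (pool1 ): 1621803 blocks 1621803 files
total: 14321664 blocks 1621826 files'

printf '%s\n' "$sample" | used_files    # prints 23
```

On a live Solaris system the equivalent call would be "df -t /test | used_files".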


    Regards.
  • 5. Re: How to count number of files on zfs filesystem
    KirkMcNeilFJ Newbie
    The initial research I did pointed to people using df -g and doing the same calculation with free and total inodes. As far as I know zfs does not use inodes, so I'm very skeptical about using something that reports used and free inodes on a zfs filesystem. Below is the output of a quick test I did using the df -t method. It is close, but can anyone account for the discrepancy of 3? If it varies like this on a small test, maybe it will vary more on the actual file system.


    bash-3.00# zfs list
    NAME USED AVAIL REFER MOUNTPOINT
    pool1 6.06G 792M 6.06G /test

    bash-3.00# find /test -name '*' | wc -l
    20
    bash-3.00# df -t /test | awk ' { if ( NR==1) F=$(NF-1) ; if ( NR==2) print $(NF-1) - F }'
    23

    bash-3.00# find /test | nl
    1 /test
    2 /test/testme2
    3 /test/file006
    4 /test/testme
    5 /test/file012
    6 /test/file005
    7 /test/file008
    8 /test/file002
    9 /test/file011
    10 /test/t03
    11 /test/file013
    12 /test/t01
    13 /test/file007
    14 /test/file3
    15 /test/file014
    16 /test/file010
    17 /test/t02
    18 /test/file003
    19 /test/file009
    20 /test/file004
    bash-3.00#

    bash-3.00# df -t /test
    /test (pool1 ): 1621803 blocks 1621803 files
    total: 14321664 blocks 1621826 files
    bash-3.00#
  • 6. Re: How to count number of files on zfs filesystem
    Nik Expert
    Hi.
    I tested these commands on a larger FS (more than 1000 files).
    Yes, the result is not the same as find, but the difference is not so much (~10).
    It was run on a production FS, so some files were created and some were removed in the meantime. So it depends on the reason for the test.

    find - can run for a long time. It can give an incorrect result for hard links or soft links (but it depends on what you want as a result).

    df - works very fast. It requires some additional small tests with snapshots (how they change the number of used i-nodes).
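
To illustrate the hard-link point: find counts every path, so a file with two names is counted twice. A rough sketch of the difference follows. The function names are hypothetical, only regular files are considered, and it assumes file names without embedded newlines.

```shell
#!/bin/sh
# count_paths: number of regular-file paths under a directory
# (what `find ... | wc -l` would see for files).
count_paths() {
    find "$1" -type f | awk 'END { print NR }'
}

# count_inodes: number of distinct inode numbers among those paths,
# so a file with several hard links is counted once.
count_inodes() {
    find "$1" -type f -exec ls -di {} + | awk '{ print $1 }' | sort -u | awk 'END { print NR }'
}

# Demonstration in a scratch directory: one file plus one hard link to it.
demo=$(mktemp -d)
echo data > "$demo/original"
ln "$demo/original" "$demo/hardlink"

count_paths  "$demo"    # prints 2
count_inodes "$demo"    # prints 1
rm -rf "$demo"
```

Which of the two numbers is the "right" one depends on what you want as a result, as noted above.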

    Regards.
  • 7. Re: How to count number of files on zfs filesystem
    KirkMcNeilFJ Newbie
    Just created 6000+ small 1 meg files for testing. The df -t method seems to hold up nicely.

    bash-3.00# zfs list
    NAME USED AVAIL REFER MOUNTPOINT
    pool1 6.09G 758M 6.09G /test

    bash-3.00# pwd
    /test

    bash-3.00# find . | nl
    ...
    ...
    ...
    6210 ./file3834
    6211 ./file3419
    6212 ./file3843
    6213 ./file2206
    6214 ./file4694
    6215 ./file6034
    6216 ./file2392
    6217 ./file4700
    6218 ./file4777
    6219 ./file6043
    6220 ./file2271
    6221 ./file183
    6222 ./file4799

    bash-3.00# df -t /test | awk ' { if ( NR==1) F=$(NF-1) ; if ( NR==2) print $(NF-1) - F }'
    6225
    bash-3.00#

    df -t
    bash-3.00# df -t /test
    /test (pool1 ): 1553191 blocks 1553191 files
    total: 14321664 blocks *1559416* files

    I suppose the real test would be to see if I can create more than 1.5 million small files and see how the df -t method holds up. I will post some more feedback when I do.
  • 8. Re: How to count number of files on zfs filesystem
    cindys Pro
    Good discussion so far but I would like to summarize a few points:

    1. ZFS can handle millions of files. We have build servers with this load and they perform fine.
    If you have millions of files in one directory and are trying to scan them, then yes, that config
    probably won't perform well.

    2. ZFS file systems are not directly tied to disk space particularly in a pool built with whole
    disks so you generally don't suffer from full file systems. You might suffer from full pools.
    Pool capacity greater than 80% can be a problem.

    If a file system is constrained by a quota or someone using their full reservation, then it
    can be full. Or, if you have a pool that is created on one slice and there's only 1 file system in
    the pool, then yes the file system could be full, but really, the pool is full.

    3. Although the df -t with the awk syntax works to identify the number of files, neither du
    nor df has been modified to handle all ZFS descendent file systems, snapshots and so on.
    I'm not sure you could depend on these commands to accurately account for all space,
    unless the file system just included files.

    4. ZFS doesn't use the term inode internally to represent a file, it uses some kind of object
    name. In the external file system, though an inode still represents a ZFS file, size, and
    so on and can be displayed with the ls command and similar.

    My simple testing was like this:

    1. Identify number of files in /usr/include:

    # cd /usr/include
    # find . -name "*" | wc -l
    7929

    2. Create a new ZFS file system and copy over /usr/include to the file system:

    # zfs create tank/data
    # find /usr/include | cpio -pdmu /tank/data
    189072 blocks
    # cd /tank/data
    I think the slight discrepancy in the total number is due to directory entries.
    # find . -name "*" | wc -l
    7931
    # df -t /tank/data | awk ' { if ( NR==1) F=$(NF-1) ; if ( NR==2) print $(NF-1) - F }'
    7938

    If I add new files to this file system, you can see the inode numbers increment. This
    is only because this file system was new and no files are being removed. Inode numbers
    can be recycled:

    # cp /usr/dict/words /tank/data/file.1
    # ls -i /tank/data/file.1
    7939 /tank/data/file.1
    # cp /usr/dict/words /tank/data/file.2
    # ls -i /tank/data/file.2
    7940

    You can review our recommendations about full pools and using other commands besides
    du and df to identify ZFS space allocation here:

    http://docs.oracle.com/cd/E23823_01/html/819-5461/practice-1.html#scrolltoc

    Thanks, Cindy
  • 9. Re: How to count number of files on zfs filesystem
    KirkMcNeilFJ Newbie
    So I have finished 90% of my testing and I have accepted _df -t /filesystem | awk ' { if ( NR==1) F=$(NF-1) ; if ( NR==2) print $(NF-1) - F }'_ as acceptable in the absence of a known built-in zfs method. My main concern was the reduction in available files in the df -t output as more files were added. I used a one-liner for loop to create empty files, to conserve space, so I would have a better chance of seeing what happens when the available files reach 0.

    root@fj-sol11:/zfstest/dir4# df -t /zfstest | awk ' { if ( NR==1) F=$(NF-1) ; if ( NR==2) print $(NF-1) - F }'
    _5133680_
    root@fj-sol11:/zfstest/dir4# df -t /zfstest
    /zfstest (pool1 ): 7237508 blocks *7237508* files
    total: 10257408 blocks 12372310 files
    root@fj-sol11:/zfstest/dir4#

    root@fj-sol11:/zfstest/dir7# df -t /zfstest | awk ' { if ( NR==1) F=$(NF-1) ; if ( NR==2) print $(NF-1) - F }'
    _6742772_
    root@fj-sol11:/zfstest/dir7# df -t /zfstest
    /zfstest (pool1 ): 6619533 blocks *6619533* files
    total: 10257408 blocks 13362305 files

    root@fj-sol11:/zfstest/dir7# df -t /zfstest | awk ' { if ( NR==1) F=$(NF-1) ; if ( NR==2) print $(NF-1) - F }'
    _7271716_
    root@fj-sol11:/zfstest/dir7# df -t /zfstest
    /zfstest (pool1 ): 6445809 blocks *6445809* files
    total: 10257408 blocks 13717010 files

    root@fj-sol11:/zfstest# df -t /zfstest | awk ' { if ( NR==1) F=$(NF-1) ; if ( NR==2) print $(NF-1) - F }'
    _12359601_
    root@fj-sol11:/zfstest# df -t /zfstest
    /zfstest (pool1 ): 4494264 blocks *4494264* files
    total: 10257408 blocks 16853865 files

    I noticed the total files kept increasing, and creating another 4 million files (4494264) after the above example was going to take more time than I had, after already creating 12 million plus ( _12359601_ ), which took 2 days on a slow machine on and off (mostly on). If anyone has any idea for creating them quicker than "touch filename$loop" in a for loop, let me know :)
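
One possible answer to the question above: most of the time in a "touch filename$loop" loop probably goes into starting one touch process per file. A sketch that hands the names to touch in large batches via xargs is below. The helper name make_empty_files is made up, and the speed-up is an expectation, not something measured on the machine in this thread.

```shell
#!/bin/sh
# make_empty_files DIR COUNT: hypothetical helper that creates file1..fileCOUNT in DIR.
make_empty_files() {
    (
        cd "$1" || exit 1
        i=1
        while [ "$i" -le "$2" ]; do
            echo "file$i"        # echo is a shell builtin, so no process is started here
            i=$((i + 1))
        done | xargs touch       # xargs passes many names to each touch invocation
    )
}

# Demonstration in a scratch directory.
demo=$(mktemp -d)
make_empty_files "$demo" 1000
ls "$demo" | awk 'END { print NR }'    # prints 1000
rm -rf "$demo"
```

The loop runs in a subshell so the caller's working directory and variables are left alone. The $((...)) arithmetic needs bash or another POSIX shell, which matches the bash prompts used in this thread.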

    In the end I decided to use a really small file system (100 MB) on a virtual machine to test what happens as the free files approach 0. Turns out it never does ... it somehow increased.

    bash-3.00# df -t /smalltest/
    /smalltest (smalltest ): 31451 blocks *31451* files
    total: 112640 blocks 278542 files
    bash-3.00# pwd
    /smalltest
    bash-3.00# mkdir dir4
    bash-3.00# cd dir4
    bash-3.00# for arg in {1..47084}; do touch file$arg; done <--- I created 47084 files here, more than the free listed above ( *31451* )
    bash-3.00# zfs list smalltest
    NAME USED AVAIL REFER MOUNTPOINT
    smalltest 47.3M 7.67M 46.9M /smalltest
    bash-3.00# df -t /smalltest/
    /smalltest (smalltest ): 15710 blocks *15710* files
    total: 112640 blocks 309887 files
    bash-3.00#
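
The two df -t /smalltest/ readings above can be reconciled with a little arithmetic. The sketch below uses only the numbers posted (the variable names are made up). The derived "used" count grows by roughly the number of files created (47084 files plus the dir4 directory), while the "free files" figure falls by far less. In every df -t sample in this thread the free "files" figure is identical to the free "blocks" figure, which suggests it is an estimate based on free space and not a fixed inode budget. That interpretation is an assumption, not something confirmed in this thread.

```shell
#!/bin/sh
# Figures copied from the two `df -t /smalltest/` readings above.
free_before=31451
total_before=278542
free_after=15710
total_after=309887

# "Used" is derived the same way as in the awk one-liner: total minus free.
used_before=$((total_before - free_before))
used_after=$((total_after - free_after))

echo "used before:    $used_before"                     # 247091
echo "used after:     $used_after"                      # 294177
echo "growth in used: $((used_after - used_before))"    # 47086
echo "drop in free:   $((free_before - free_after))"    # 15741
```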

    The other 10% of my testing will be to see what happens when I try to run a find on 12 million plus files and pipe it to wc -l :)
  • 10. Re: How to count number of files on zfs filesystem
    KirkMcNeilFJ Newbie
    Testing complete, on my system it took 4 hrs plus to do a count using find.

    root@fj-sol11:~# timex find /zfstest | wc -l

    real 4:18:02.61
    user 1:37.78
    sys 29:04.48

    *12365908*

    root@fj-sol11:~# timex df -t /zfstest | awk ' { if ( NR==1) F=$(NF-1) ; if ( NR==2) print $(NF-1) - F }'
    *12365914*

    real 0.02
    user 0.00
    sys 0.00
