10 Replies Latest reply: Dec 11, 2012 9:18 AM by KirkMcNeilFJ

    How to count number of files on zfs filesystem

    KirkMcNeilFJ
      Hi all,
      Is there a way to count the number of files on a ZFS file system, similar to how "df -o i /ufs_filesystem" works on UFS? I am looking for a way to do this without using find, as I suspect there are millions of files on one particular ZFS file system and that this is sometimes causing slow performance.

      Thanks.
        • 1. Re: How to count number of files on zfs filesystem
          abrante
          Good question!

          What about

          df -g

          .. or is that Solaris 11?

          .7/M.
          • 2. Re: How to count number of files on zfs filesystem
            Cindys-Oracle
            Good question.

            I'm still doing some research, haven't found a good solution yet,
            but this seems to work:

            # cd /pond/cdata
            # find . -name "*" | wc -l
            1785

            I can't comment on the performance in a very large file system though.

            Thanks, Cindy
            • 3. Re: How to count number of files on zfs filesystem
              bigdelboy
              Just to comment: I am not sure that having millions (or even tens of millions) of files on a ZFS file system will in itself lead to bad performance.

              However, attempting to access a lot of files may lead to bad performance (e.g. attempting to run a file-based backup). And if the file system is overfull (e.g. 80%?), or there is little space for the ZFS cache, that might also cause performance problems. Simply trying to read too many times or too much, or contention from elsewhere, may also be a problem.

              I am of course open to comments ....
              • 4. Re: How to count number of files on zfs filesystem
                Nik
                Hi.

                df -t <mount_point> shows the number of free i-nodes and the total number of i-nodes, so you can calculate how many i-nodes (files) are in use:

                df -t <mount_point> | awk ' { if ( NR==1) F=$(NF-1) ; if ( NR==2) print $(NF-1) - F }'
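
The one-liner above can be unpacked into a small shell function. This is only a sketch: it assumes the two-line layout that df -t prints for a ZFS mount point (free counts on the first line, totals on the second), and the function name used_files_from_df_t is hypothetical. It reads the df output on stdin so it can also be tried on saved output.

```shell
# Sketch: compute "used files" from two-line `df -t` output read on stdin.
# Assumption: line 1 ends with "<free> files", line 2 ends with "<total> files".
used_files_from_df_t() {
    awk '
        NR == 1 { free  = $(NF - 1) }   # second-to-last field of line 1: free file count
        NR == 2 { total = $(NF - 1) }   # second-to-last field of line 2: total file count
        END     { print total - free }  # used = total - free
    '
}

# Usage on a live system: df -t /mount_point | used_files_from_df_t
```

Taking the second-to-last field rather than a fixed column number keeps the parse independent of how many words the mount point and pool name occupy at the start of the first line.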


                Regards.
                • 5. Re: How to count number of files on zfs filesystem
                  KirkMcNeilFJ
                  The initial research I did pointed to people using df -g and doing the same calculation with free and total inodes. As far as I know, ZFS does not use inodes, so I'm very skeptical about using something that reports used and free inodes on a ZFS file system. Below is the output of a quick test I did using the df -t method. It is close, but can anyone account for the discrepancy of 3? If it varies like this on a small test, maybe it will vary more on the actual file system.


                  bash-3.00# zfs list
                  NAME USED AVAIL REFER MOUNTPOINT
                  pool1 6.06G 792M 6.06G /test

                  bash-3.00# find /test -name '*' | wc -l
                  20
                  bash-3.00# df -t /test | awk ' { if ( NR==1) F=$(NF-1) ; if ( NR==2) print $(NF-1) - F }'
                  23

                  bash-3.00# find /test | nl
                  1 /test
                  2 /test/testme2
                  3 /test/file006
                  4 /test/testme
                  5 /test/file012
                  6 /test/file005
                  7 /test/file008
                  8 /test/file002
                  9 /test/file011
                  10 /test/t03
                  11 /test/file013
                  12 /test/t01
                  13 /test/file007
                  14 /test/file3
                  15 /test/file014
                  16 /test/file010
                  17 /test/t02
                  18 /test/file003
                  19 /test/file009
                  20 /test/file004
                  bash-3.00#

                  bash-3.00# df -t /test
                  /test (pool1 ): 1621803 blocks 1621803 files
                  total: 14321664 blocks 1621826 files
                  bash-3.00#
                  • 6. Re: How to count number of files on zfs filesystem
                    Nik
                    Hi.
                    I tested these commands on a larger FS (more than 1000 files).
                    Yes, the result is not the same as find, but the difference is not large (~10).
                    It was run on a production FS, so some files were being created and some removed during the test. So it depends on the purpose of the test.

                    find can run for a long time. It can give an incorrect result for hard links or soft links (but that depends on what you want as the result).

                    df works very fast. It needs some additional small tests with snapshots (how they change the number of used i-nodes).
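
To illustrate the hard-link point, here is a minimal sketch that compares the number of paths find visits with the number of distinct inode numbers behind them. The helper names are hypothetical, -xdev keeps find on a single file system (inode numbers are only unique within one), and the inode count breaks on file names that contain newlines.

```shell
# Sketch: directory entries seen by find versus distinct inode numbers.
count_entries() {
    # Every path find visits, without crossing into other file systems.
    find "$1" -xdev -print | wc -l | tr -d ' '
}

count_unique_inodes() {
    # ls -di prints "<inode> <path>"; keep the inode column and de-duplicate.
    find "$1" -xdev -exec ls -di {} + | awk '{ print $1 }' | sort -u | wc -l | tr -d ' '
}

# Usage: count_entries /some/dir; count_unique_inodes /some/dir
```

A file with two hard links shows up twice in the first count and once in the second, so the gap between the two numbers is the number of extra links.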

                    Regards.
                    • 7. Re: How to count number of files on zfs filesystem
                      KirkMcNeilFJ
                      Just created 6000+ small 1 meg files for testing. The df -t method seems to hold up nicely.

                      bash-3.00# zfs list
                      NAME USED AVAIL REFER MOUNTPOINT
                      pool1 6.09G 758M 6.09G /test

                      bash-3.00# pwd
                      /test

                      bash-3.00# find . | nl
                      ...
                      ...
                      ...
                      6210 ./file3834
                      6211 ./file3419
                      6212 ./file3843
                      6213 ./file2206
                      6214 ./file4694
                      6215 ./file6034
                      6216 ./file2392
                      6217 ./file4700
                      6218 ./file4777
                      6219 ./file6043
                      6220 ./file2271
                      6221 ./file183
                      6222 ./file4799

                      bash-3.00# df -t /test | awk ' { if ( NR==1) F=$(NF-1) ; if ( NR==2) print $(NF-1) - F }'
                      6225
                      bash-3.00#

                      df -t
                      bash-3.00# df -t /test
                      /test (pool1 ): 1553191 blocks 1553191 files
                      total: 14321664 blocks *1559416* files

                      I suppose the real test would be to create more than 1.5 million small files and see how the df -t method holds up. I will post more feedback when I do.
                      • 8. Re: How to count number of files on zfs filesystem
                        Cindys-Oracle
                        Good discussion so far but I would like to summarize a few points:

                        1. ZFS can handle millions of files. We have build servers with this load and they perform fine.
                        If you have millions of files in one directory and are trying to scan them, then yes, that config
                        probably won't perform well.

                        2. ZFS file systems are not directly tied to disk space particularly in a pool built with whole
                        disks so you generally don't suffer from full file systems. You might suffer from full pools.
                        Pool capacity greater than 80% can be a problem.

                        If a file system is constrained by a quota or someone using their full reservation, then it
                        can be full. Or, if you have a pool that is created on one slice and there's only 1 file system in
                        the pool, then yes the file system could be full, but really, the pool is full.

                        3. Although the df -t with the awk syntax works to identify the number of files, neither du
                        nor df has been modified to handle all ZFS descendant file systems, snapshots and so on.
                        I'm not sure you could depend on these commands to accurately account for all space,
                        unless the file system just included files.

                        4. ZFS doesn't use the term inode internally to represent a file; it uses some kind of object
                        name. In the external file system, though, an inode still represents a ZFS file, its size, and
                        so on, and can be displayed with the ls command and similar.

                        My simple testing was like this:

                        1. Identify number of files in /usr/include:

                        # cd /usr/include
                        # find . -name "*" | wc -l
                        7929

                        2. Create a new ZFS file system and copy over /usr/include to the file system:

                        # zfs create tank/data
                        # find /usr/include | cpio -pdmu /tank/data
                        189072 blocks
                        # cd /tank/data
                        # find . -name "*" | wc -l
                        7931
                        # df -t /tank/data | awk ' { if ( NR==1) F=$(NF-1) ; if ( NR==2) print $(NF-1) - F }'
                        7938

                        I think the slight discrepancy in the total number is due to directory entries.

                        If I add new files to this file system, you can see the inode numbers increment. This
                        is only because this file system was new and no files are being removed. Inode numbers
                        can be recycled:

                        # cp /usr/dict/words /tank/data/file.1
                        # ls -i /tank/data/file.1
                        7939 /tank/data/file.1
                        # cp /usr/dict/words /tank/data/file.2
                        # ls -i /tank/data/file.2
                        7940 /tank/data/file.2

                        You can review our recommendations about full pools and using other commands besides
                        du and df to identify ZFS space allocation here:

                        http://docs.oracle.com/cd/E23823_01/html/819-5461/practice-1.html#scrolltoc
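
As a rough sketch of checking the 80% pool capacity guideline from a script: the function below reads "<pool><TAB><capacity>%" lines on stdin and prints the pools above a limit. That input shape is what `zpool list -H -o name,capacity` is expected to print, but treat it as an assumption and check it on your release; the function name pools_over_capacity is hypothetical.

```shell
# Sketch: flag pools whose capacity exceeds a threshold (80% by default).
# Input on stdin: one "<pool><TAB><capacity>%" line per pool.
pools_over_capacity() {
    limit=${1:-80}
    awk -v limit="$limit" '
        {
            cap = $2
            sub(/%$/, "", cap)          # strip the trailing percent sign
            if (cap + 0 > limit + 0)    # force a numeric comparison
                print $1, cap "%"
        }
    '
}

# Usage on a live system: zpool list -H -o name,capacity | pools_over_capacity 80
```

Reading from stdin keeps the parsing separate from the zpool call, so the filter can be tried on saved or hand-written input first.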

                        Thanks, Cindy
                        • 9. Re: How to count number of files on zfs filesystem
                          KirkMcNeilFJ
                          So I have finished 90% of my testing and I have accepted _df -t /filesystem | awk ' { if ( NR==1) F=$(NF-1) ; if ( NR==2) print $(NF-1) - F }'_ as good enough in the absence of a known built-in ZFS method. My main concern was how the available file count in the df -t output drops as more files are added. I used a one-liner for loop to create empty files, to conserve space, so I would have a better chance of seeing what happens if the available files reached 0.

                          root@fj-sol11:/zfstest/dir4# df -t /zfstest | awk ' { if ( NR==1) F=$(NF-1) ; if ( NR==2) print $(NF-1) - F }'
                          _5133680_
                          root@fj-sol11:/zfstest/dir4# df -t /zfstest
                          /zfstest (pool1 ): 7237508 blocks *7237508* files
                          total: 10257408 blocks 12372310 files
                          root@fj-sol11:/zfstest/dir4#

                          root@fj-sol11:/zfstest/dir7# df -t /zfstest | awk ' { if ( NR==1) F=$(NF-1) ; if ( NR==2) print $(NF-1) - F }'
                          _6742772_
                          root@fj-sol11:/zfstest/dir7# df -t /zfstest
                          /zfstest (pool1 ): 6619533 blocks *6619533* files
                          total: 10257408 blocks 13362305 files

                          root@fj-sol11:/zfstest/dir7# df -t /zfstest | awk ' { if ( NR==1) F=$(NF-1) ; if ( NR==2) print $(NF-1) - F }'
                          _7271716_
                          root@fj-sol11:/zfstest/dir7# df -t /zfstest
                          /zfstest (pool1 ): 6445809 blocks *6445809* files
                          total: 10257408 blocks 13717010 files

                          root@fj-sol11:/zfstest# df -t /zfstest | awk ' { if ( NR==1) F=$(NF-1) ; if ( NR==2) print $(NF-1) - F }'
                          _12359601_
                          root@fj-sol11:/zfstest# df -t /zfstest
                          /zfstest (pool1 ): 4494264 blocks *4494264* files
                          total: 10257408 blocks 16853865 files

                          I noticed that the total file count kept increasing. Creating the roughly 4 million more files (4494264) needed to use up the free count in the last example would have taken more time than I had: the 12 million plus ( _12359601_ ) already created took 2 days, on and off (mostly on), on a slow machine. If anyone knows a quicker way of creating them than "touch filename$loop" in a for loop, let me know :)
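
One possible answer to the question above, as an untimed sketch: a loop that creates each file with a builtin redirection instead of starting a separate touch process per file. The helper name make_empty_files is hypothetical, and the arithmetic expansion assumes bash or another POSIX shell (not the legacy Bourne shell). Unlike touch, ": > file" truncates a file that already exists.

```shell
# Sketch: create N empty files without forking one process per file.
make_empty_files() {
    dir=$1
    count=$2
    i=1
    while [ "$i" -le "$count" ]; do
        : > "$dir/file$i"      # builtin no-op plus redirection creates the file
        i=$((i + 1))
    done
}

# Usage: make_empty_files /path/to/test/dir 100000
```

Whether this is meaningfully faster depends on how much of the time goes to process start-up rather than to the file system itself, which was not measured here.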

                          In the end I decided to use a really small file system (100 MB) on a virtual machine to test what happens as the free files approach 0. Turns out it never does ... the total somehow increased.

                          bash-3.00# df -t /smalltest/
                          /smalltest (smalltest ): 31451 blocks *31451* files
                          total: 112640 blocks 278542 files
                          bash-3.00# pwd
                          /smalltest
                          bash-3.00# mkdir dir4
                          bash-3.00# cd dir4
                          bash-3.00# for arg in {1..47084}; do touch file$arg; done <--- I created 47084 files here, more than the free count listed above ( *31451* )
                          bash-3.00# zfs list smalltest
                          NAME USED AVAIL REFER MOUNTPOINT
                          smalltest 47.3M 7.67M 46.9M /smalltest
                          bash-3.00# df -t /smalltest/
                          /smalltest (smalltest ): 15710 blocks *15710* files
                          total: 112640 blocks 309887 files
                          bash-3.00#
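
The two df -t snapshots above are consistent with the total being recomputed as used plus free, with the free file count tracking the free block count (the two free figures are identical in every df -t output in this thread). A small arithmetic sketch using only the numbers printed above; the helper name used_files is hypothetical:

```shell
# Sketch: used = total - free, applied to the two /smalltest snapshots.
used_files() {
    # $1 = total files, $2 = free files, as printed by df -t
    echo $(( $1 - $2 ))
}

before=$(used_files 278542 31451)   # snapshot before the touch loop
after=$(used_files 309887 15710)    # snapshot after the touch loop
echo "used before: $before"
echo "used after:  $after"
echo "new objects: $(( after - before ))"
```

This works out to 247091 used before and 294177 after, a rise of 47086, which is close to the 47084 files and one directory created in between. The used count grows as expected even though the total is not fixed.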

                          The other 10% of my testing will be to see what happens when I run a find on 12 million plus files and pipe it to wc -l :)
                          • 10. Re: How to count number of files on zfs filesystem
                            KirkMcNeilFJ
                            Testing complete. On my system it took over 4 hours to do a count using find.

                            root@fj-sol11:~# timex find /zfstest | wc -l

                            real 4:18:02.61
                            user 1:37.78
                            sys 29:04.48

                            *12365908*

                            root@fj-sol11:~# timex df -t /zfstest | awk ' { if ( NR==1) F=$(NF-1) ; if ( NR==2) print $(NF-1) - F }'
                            *12365914*

                            real 0.02
                            user 0.00
                            sys 0.00