arg list too long, on MV command with directory holding 30.000+ files

807567
    I'm trying to move all the files in a given directory to another directory. The special thing about it is that this directory holds a lot of files, in this case somewhat more than 30,000. The command runs for a while, then reports "arg list too long" and stops. No files are copied at all!
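    For illustration, a command of this form triggers the error (paths here are hypothetical):

    mv /export/data/src/* /export/data/dst/
    # The shell expands the * into 30,000+ file names before mv ever runs,
    # so the argument list handed to exec() exceeds the system limit.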
      • 1. Re: arg list too long, on MV command with directory holding 30.000+ files
        807567
        You can solve this problem by moving a smaller number of files at a time. The script below does this for you.

        1. Create a script with the contents below the ##### line.
        2. Edit the DSTDIR and SRCDIR variables to match your destination and source directories.
        3. Save the script and chmod +x it.
        4. Execute it, giving the number of files to move in a single step as the first argument, for example: ./script.sh 500 (moves 500 files from SRCDIR to DSTDIR).
        5. Execute the script several times until all files have been moved to the new directory.

        ####################################################



        #!/bin/sh

        # Edit these two variables to match your destination and source directories
        DSTDIR=/home/mef/nisbackup/test/
        SRCDIR=/home/mef/nisbackup/src/

        if [ -n "$1" ]
        then
            # List the source directory (one name per line, skipping . and ..)
            # and keep only the last $1 entries for this run
            SALUT=`/usr/bin/ls -A -1 "$SRCDIR" | tail -$1`

            for FILE in $SALUT
            do
                /usr/bin/mv "$SRCDIR/$FILE" "$DSTDIR"
            done
        else
            echo "usage: $0 chunk_size"
            exit 1
        fi
        • 2. Re: arg list too long, on MV command with directory holding 30.000+ files
          807567
          You can try something like this:
          tar -cf - * | (cd ../dst; tar -xf -)
          For this example, cd into your source directory. The command in the sub-shell assumes your destination directory is ../dst; modify this to fit your environment.
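          A concrete walk-through of this approach, with assumed directories /export/data/src and /export/data/dst; using . instead of * keeps the shell from expanding the file list into arguments at all:

          cd /export/data/src
          tar -cf - . | (cd /export/data/dst && tar -xf -)
          # remove the originals only after verifying the copy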
          • 3. Re: arg list too long, on MV command with directory holding 30.000+ files
            807567
            Thanks,
            I'll have a look at both proposed solutions


            Any chance that Solaris may fix this bug?
            • 4. Re: arg list too long, on MV command with directory holding 30.000+ files
              807567
              This is not a bug, but rather a feature.

              Every UNIX shell has its argument list limit.
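              You can check the limit on a given system, for example:

              getconf ARG_MAX    # maximum bytes of arguments (plus environment) allowed by exec()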
              • 5. Re: arg list too long, on MV command with directory holding 30.000+ files
                807567
                99480 wrote:
                I'm trying to move all the files in a given directory to another directory. The special thing about it is that this directory holds a lot of files, in this case somewhat more than 30,000. The command runs for a while, then reports "arg list too long" and stops. No files are copied at all!
                Are there no "old school" system administrators left? :-)

                Assume that the source directory is /export/home/ and the new directory is /local/home
                # cd /export
                # find home -print | cpio -pdmv /local/
                # pwd
                /export
                # rm -r home
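                (Here cpio runs in pass mode: -p copies the files directly into the target tree, -d creates directories as needed, -m preserves modification times, and -v lists each file as it is copied.)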

                Cheers,

                • 6. Re: arg list too long, on MV command with directory holding 30.000+ files
                  800381
                  Assuming no subdirectories, you can also use the "find" command:
                  find . -exec mv {} /dest/dir \;
                  And it might even work for subdirectories, but I'm not going to guarantee it.
                  • 7. Re: arg list too long, on MV command with directory holding 30.000+ files
                    807567
                    AndrewHenle wrote:
                    Assuming no subdirectories, you can also use the "find" command:
                    find . -exec mv {} /dest/dir \;
                    And it might even work for subdirectories, but I'm not going to guarantee it.
                    When you use find -exec you fork a new process for each file, which is quite CPU-intensive. Not much different from my find in practical use.

                    Cheers,
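                    For comparison, a batching variant (a sketch with hypothetical paths) that keeps the argument list small without one exec per file; note that a plain ls | xargs pipeline assumes file names without whitespace:

                    cd /export/data/src
                    ls -A | xargs -n 500 sh -c 'mv "$@" /export/data/dst/' sh
                    # xargs hands up to 500 names per invocation to the small wrapper shell, which runs one mv for the batch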
                    • 8. Re: arg list too long, on MV command with directory holding 30.000+ files
                      800381
                      John.Kotches wrote:
                      AndrewHenle wrote:
                      Assuming no subdirectories, you can also use the "find" command:
                      find . -exec mv {} /dest/dir \;
                      And it might even work for subdirectories, but I'm not going to guarantee it.
                      When you use find -exec you fork a new process for each file, which is quite CPU-intensive. Not much different from my find in practical use.

                      Cheers,
                      I think there are three practical differences.

                      First, by using cpio to do a copy and then coming back to delete the original files, if the source and target directories are located on the same file system there has to be enough free space to hold an extra copy of the entire directory. With mv, if the source and target directories happen to be on the same file system, a simple rename() is all that is needed to move each file, which uses minimal CPU because the file's contents never move on the physical disk; an actual copy of the file data is unnecessary in that case. So yes, while a fork() and a rename() per file have some overhead, that is less than a child process recreating each file and copying its data over to the new copy. And if the source and target directories are on separate disks, the CPU cost of creating the files and copying the data dwarfs the impact of a fork() per file anyway.

                      Second, by accomplishing the task of moving the files with one command, there's no opportunity to mistakenly type "rm -f -r *" in the wrong location.

                      Third, your method will work for directory trees. I'm not so sure about mine.
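                      The rename() point is easy to see in practice (sketch; hypothetical paths on the same file system):

                      ls -i /export/data/src/somefile          # note the inode number
                      mv /export/data/src/somefile /export/data/dst/
                      ls -i /export/data/dst/somefile          # same inode: only the directory entry changed, no data was copied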
                      • 9. Re: arg list too long, on MV command with directory holding 30.000+ files
                        807567
                        AndrewHenle wrote:
                        I think there are three practical differences.

                        First, by using cpio to do a copy and then coming back to delete the original files, if the source and target directories are located on the same file system there has to be enough free space to hold an extra copy of the entire directory. With mv, if the source and target directories happen to be on the same file system, a simple rename() is all that is needed to move each file, which uses minimal CPU because the file's contents never move on the physical disk; an actual copy of the file data is unnecessary in that case. So yes, while a fork() and a rename() per file have some overhead, that is less than a child process recreating each file and copying its data over to the new copy. And if the source and target directories are on separate disks, the CPU cost of creating the files and copying the data dwarfs the impact of a fork() per file anyway.
                        This depends on the number of files... If we're talking about a few files, yes. 30K files? Well, that's a little different.
                        Second, by accomplishing the task of moving the files with one command, there's no opportunity to mistakenly type "rm -f -r *" in the wrong location.
                        Which is why I suggested the pwd first ;-) Having two copies of the data also acts as a safety net.

                        Third, your method will work for directory trees. I'm not so sure about mine.
                        I never use mv for bulk moves involving subtrees. I use either cpio or cp -pr, whichever suits the task better. Given the size of modern disk hardware, unless it's a "stupid big" move I don't worry about a temporary second copy.
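                        For example (hypothetical paths):

                        # recursive copy preserving modes and timestamps; remove the source only after verifying
                        cp -pr /export/data/src /local/data/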
                        • 10. Re: arg list too long, on MV command with directory holding 30.000+ files
                          807567
                          I need to have a look at the latest updates; I'm still hitting the same bug. I'm also trying to see how many files are in the directory, but the "ls" command fails as well. I don't consider this a feature, I think it is a severe bug ... a cp or mv command not working is something I consider pretty nasty. Now I'm having issues with directories containing 120,000 files. If the system commands cannot handle that many files, the system should not allow that many to be created in a directory in the first place. Of course, the limitation would become more obvious then. Anyway, I'm considering a compression tool to work around this bug.
                          • 11. Re: arg list too long, on MV command with directory holding 30.000+ files
                            user4994457
                            99480 wrote:
                            I need to have a look at the latest updates; I'm still hitting the same bug. I'm also trying to see how many files are in the directory, but the "ls" command fails as well.
                            How is 'ls' being executed, and what error are you getting? You may want to use 'ls -f', otherwise ls will try to sort the filenames, and that's not a good idea when you have many thousands of files.
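                            For example, to count the entries without sorting (directory path is hypothetical):

                            cd /export/data/src
                            ls -f | wc -l    # -f lists in directory order with no sort; the count includes . and ..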
                            I don't consider this a feature, I think it is a severe bug
                            It's not a bug. There are things the system is not supposed to be able to do. This is one of them.
                            ... a cp or mv command not working is something I consider pretty nasty.

                            Now I'm having issues with directories containing 120,000 files. If the system commands cannot handle that many files, the system should not allow that many to be created in a directory in the first place. Of course, the limitation would become more obvious then. Anyway, I'm considering a compression tool to work around this bug.
                            There is a difference between the limit on the number of files in a directory and the limit on the size of a process's argument list. They are not related.

                            Argument list size failures are documented for exec() functions.
                            # man execve
                            [...]
                                 The exec functions will fail if:
                            
                                 E2BIG The number of bytes in the new process's argument list
                                       is  greater  than  the system-imposed limit of ARG_MAX
                                       bytes. The argument list limit is sum of the  size  of
                                       the  argument  list plus the size of the environment's
                                       exported shell variables.
                            # grep ARG_MAX /usr/include/limits.h
                            * ARG_MAX is calculated as follows:
                            #define ARGMAX32 1048320 /* max length of args to exec 32-bit program */
                            #define ARGMAX64 2096640 /* max length of args to exec 64-bit program */
                            #define ARG_MAX ARGMAX64 /* max length of arguments to exec */
                            #define ARG_MAX ARGMAX32 /* max length of arguments to exec */
                            #define POSIXARG_MAX 4096
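                            A rough way to see how close a given directory comes to that limit (sketch; path is hypothetical):

                            cd /export/data/src
                            ls | wc -c    # approximate bytes that a shell-expanded * would pass as arguments (newlines stand in for the NUL terminators)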

                            --
                            Darren