This discussion is archived
10 Replies Latest reply: Mar 18, 2008 7:51 AM by 807601

File comparison

807601 Newbie
Hey all,

I need to compare about 500 .html files to each other, and they all need to be totally unique.
The file system looks like this:
-Main Directory
    - Sub1Directory
        -Sub2Directory
            -file.html
        -Sub2Directory
            -file.html
    - Sub1Directory
        -Sub2Directory
            -file.html
         -Sub2Directory
           -file.html
All the files under a Sub1Directory need to be compared to each other.
          try{
               Iterator<File> it = list.iterator();
               String current = "";
               int startIndex = 0;
               while(it.hasNext()){
                    File key = (File)it.next();
                    writer.write("Comparing" + key.getParent() + "\n");
                    Object[] st1 = getFileContents(key);
                    int i = startIndex;
                    while(key.getParent().contains(current) && i < list.size()){
                         writer.write("To        " + list.get(i).getParent() + "\n");
                         Object[] st2 = getFileContents(list.get(i));
                         if(!key.getParent().equals(list.get(i).getParent())){
                              if(Arrays.equals(st1, st2)){
                                   System.out.println(key.getParent() + "/" + key.getName());
                                   System.out.println("equals" + list.get(i).getParent() + "/" + list.get(i).getName());
                              }
                         }
                         i++;
                    }
                    startIndex++;
               }
          }
          catch(Exception e){}
The list in this code contains all of the index.html files inside Sub1Directory.

There are 2 problems:
1 - it is very, very slow
2 - I just don't trust the code, because it says every file is unique, but I know that isn't true

help would be appreciated

Greetz

Grad_
  • 1. Re: File comparison
    807601 Newbie
    private Object[] getFileContents(File f){
              ArrayList<String> list = new ArrayList<String>();
    
              try{
                   BufferedReader reader = new BufferedReader(new FileReader(f));
    
                   String inputLine = "";
    
                   while((inputLine = reader.readLine()) != null){
                        list.add(inputLine);
                   }
    
    
              }
              catch(Exception e){
    
              }
    
              return list.toArray();
         }
    This is the code for the getFileContents method.
  • 2. Re: File comparison
    807601 Newbie
    You can compare the sizes of the files by looking at:
    myFile.length()
    If they have different sizes, then they are different.
    Furthermore, you can start by collecting all the file sizes and sorting them, then compare the contents only of files that have the same size.
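
    A minimal sketch of that idea, assuming Java 8 or later and a list of files that has already been collected. The names SizeGrouping, groupBySize and candidateGroups are made up for illustration and are not from the original code:

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class SizeGrouping {
    // Groups the files by their length in bytes (TreeMap keeps the sizes sorted).
    static Map<Long, List<File>> groupBySize(List<File> files) {
        Map<Long, List<File>> bySize = new TreeMap<>();
        for (File f : files) {
            bySize.computeIfAbsent(f.length(), k -> new ArrayList<>()).add(f);
        }
        return bySize;
    }

    // Keeps only the groups with two or more files; a file whose size
    // is unique cannot be a duplicate of any other file.
    static List<List<File>> candidateGroups(List<File> files) {
        List<List<File>> result = new ArrayList<>();
        for (List<File> group : groupBySize(files).values()) {
            if (group.size() > 1) {
                result.add(group);
            }
        }
        return result;
    }
}
```

    Only the files inside each returned group would then need their contents read and compared.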

    Regards,
  • 3. Re: File comparison
    807601 Newbie
    The file lengths are allowed to be the same; it is the content of the files that needs to be unique!

    So if file #1 has a length of 500 and contains only the letter "a", and file #2 has a length of 500 but one letter is a "b", they are not equal.
  • 4. Re: File comparison
    807601 Newbie
    Read carefully what Nightman150 wrote...
  • 5. Re: File comparison
    807601 Newbie
    I meant that if two files have different sizes then they are different, so there is no need to read their contents and compare them.
  • 6. Re: File comparison
    807601 Newbie
    So
    if(file1.length() == file2.length())
        System.out.println("files are completely the same");
    would this be correct?
  • 7. Re: File comparison
    807601 Newbie
    No, I meant this:
    if(file1.length() != file2.length())
        System.out.println("files are completely different");
    
    if(file1.length() == file2.length()) {
        System.out.println("files may be the same");
    
    // continue here, look at content to compare files
    
    }
    Edited by: Nightman150 on Mar 18, 2008 3:01 PM
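
    One way the "continue here" step could look, as a sketch only. It assumes Java 7 or later for try-with-resources, and it compares raw bytes instead of the line-by-line comparison used in getFileContents, so it also treats differing line endings as a difference. FileCompare and sameContent are hypothetical names:

```java
import java.io.BufferedInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

class FileCompare {
    // Returns true only if both files have the same length and the same bytes.
    static boolean sameContent(File a, File b) throws IOException {
        // Cheap check first: different sizes can never mean equal content.
        if (a.length() != b.length()) {
            return false;
        }
        // Same size: read both streams side by side and stop at the first mismatch.
        try (InputStream in1 = new BufferedInputStream(new FileInputStream(a));
             InputStream in2 = new BufferedInputStream(new FileInputStream(b))) {
            int x;
            while ((x = in1.read()) != -1) {
                if (x != in2.read()) {
                    return false;
                }
            }
            // The first stream is exhausted; the second must be as well.
            return in2.read() == -1;
        }
    }
}
```

    Both streams are closed when the try block exits, and the IOException is passed to the caller instead of being swallowed.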
  • 8. Re: File comparison
    807601 Newbie
    Thanks Nightman150, this will really speed up the program!
    I can handle it from here :)
  • 9. Re: File comparison
    807601 Newbie
    I'm a bit puzzled by your code. For instance where does current get updated? Looks to me as if key.getParent().contains(current) is always testing if it contains "".
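
    A tiny sketch to show the point: String.contains returns true for the empty string, so that loop condition never filters anything. The class name and the paths are made up for illustration:

```java
class EmptyContains {
    // Mirrors the loop condition from the first post:
    // does the parent path contain 'current'?
    static boolean parentMatches(String parent, String current) {
        return parent.contains(current);
    }

    public static void main(String[] args) {
        // 'current' is never reassigned in the original loop, so it stays "".
        System.out.println(parentMatches("/Main/Sub1/Sub2", ""));      // true
        System.out.println(parentMatches("/Main/OtherSub1/Sub2", "")); // true
    }
}
```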
  • 10. Re: File comparison
    807601 Newbie
    private void compareFile(ArrayList<File> list){
              try{
                   Object[] array = list.toArray();
                   Arrays.sort(array);
                   int startIndex = 0;
                   while(startIndex < array.length){
                        File compareFile = (File)array[startIndex];
                        int i = startIndex;
                        while(i < array.length){
                             File compareToFile = (File)array[i];
                             if(compareFile.length() == compareToFile.length()){
                                  if(!compareFile.getParent().equals(compareToFile.getParent())){
                                       Object[] st1 = getFileContents(compareFile);
                                       Object[] st2 = getFileContents(compareToFile);                                   
                                       if(Arrays.equals(st1, st2)){
                                            System.out.println(compareFile.getParent().substring(compareFile.getParent().lastIndexOf("/"))+".equals"+(compareToFile.getParent().substring(compareToFile.getParent().lastIndexOf("/"))));
                                       }
                                  }                              
                             }
                             i++;
                        }
                        startIndex++;
                   }
              }
              catch(Exception e){}
         }

         private Object[] getFileContents(File f){
              ArrayList<String> list = new ArrayList<String>();
              try{
                   BufferedReader reader = new BufferedReader(new FileReader(f));

                   String inputLine = "";

                   while((inputLine = reader.readLine()) != null){
                        list.add(inputLine);
                   }
              }
              catch(Exception e){}
              return list.toArray();
         }
    This is my updated code; it is quite fast and it seems to work.
    
    thanks again Nightman150