7 Replies Latest reply: Nov 14, 2012 5:16 AM by darke

    algorithm design

    814889
      I have to write a program that compares n files, where each file contains key,value lines. I want to compare all the files and get every line whose key is common to them, for example:

      file1        file2           file3         file4
      tim,chase    tom,someting    tom,wright    chase,w
      tom,jerry    vinay,b         sachin,b      tom,m

      The output would be:
      tom,chase
      tom,jerry
      tom,wright
      tom,m

      What would be the optimal algorithm for this task?

      My logic :

      Start

      Read each line from a file and store it in memory as an array (e.g. array1[0] = "A", array1[1] = "B", and so on).

      Since there are 4 files, I create 4 arrays, array1 to array4. Each of them holds the contents of its corresponding file.

      Now I compare the first word in the first array with the first word in the second array.

      Then I compare the first word in the first array with the second word in the second array, and so on until the end of the second array.

      I continue this until the last word in the last array.

      Whenever I find a match, I add it to an ArrayList.
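
A minimal sketch of the pairwise comparison described in these steps, for illustration only. It assumes that the "first word" of a line is the key before the comma and that a "match" means the key also occurs in at least one other file; the post does not state either. The class and method names are hypothetical, and the file contents are passed in as lists of lines rather than read from disk.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class PairwiseKeyCompare {
    // Returns the key part of a "key,value" line.
    static String keyOf(String line) {
        return line.split(",", 2)[0];
    }

    // True if the key appears in any file other than the one at index self.
    static boolean keyInAnotherFile(String key, List<List<String>> files, int self) {
        for (int j = 0; j < files.size(); j++) {
            if (j == self) {
                continue;
            }
            for (String other : files.get(j)) {
                if (keyOf(other).equals(key)) {
                    return true;
                }
            }
        }
        return false;
    }

    // Collects, in file order, every line whose key also occurs in another file.
    static List<String> matchingLines(List<List<String>> files) {
        List<String> matches = new ArrayList<>();
        for (int i = 0; i < files.size(); i++) {
            for (String line : files.get(i)) {
                if (keyInAnotherFile(keyOf(line), files, i)) {
                    matches.add(line);
                }
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        // Example data taken from the post (assumed to be one list per file).
        List<List<String>> files = Arrays.asList(
            Arrays.asList("tim,chase", "tom,jerry"),
            Arrays.asList("tom,someting", "vinay,b"),
            Arrays.asList("tom,wright", "sachin,b"),
            Arrays.asList("chase,w", "tom,m"));
        System.out.println(matchingLines(files));
    }
}
```

Every line is compared against every line of every other file, so the work grows roughly with the square of the total number of lines.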


      Can anybody suggest a better design, a complete design, or an alternative? Or is this good enough?
        • 1. Re: algorithm design
          jschellSomeoneStoleMyAlias
          The explanation of the problem is not clear.

          Provide an example in which file1 and file2 are shown separately and have different contents.
          For each file provide the lines that match and lines that do not match.

          Then provide an explanation of what a match means and/or what a non-match means.
          And don't use the same character (comma) both for the input and output. Doing so confuses what is just a separator and what is part of the data.
          • 2. Re: algorithm design
            814889
            file1
            tim,chase
            tom,jerry

            file2
            tom,someting
            vinay,b


            file3
            tom,wright
            sachin,b

            file4

            chase,w
            tom,m


            Basically, all files contain key,value pair entries.


            I want all common keys, with their values, in a file. In this case the common key is tom.

            Desired output:


            tom,jerry
            tom,wright
            tom,m
            • 3. Re: algorithm design
              darke
              You probably just need a single data structure: a Map<String,List<String>>.

              Read file1 and populate the map with its values.

              So the map looks like
              tim -> chase
              tom -> jerry
              after reading file1 in your example.

              Then read files 2 to N. If a file contains a key that is not in the map, ignore that pair. Otherwise, add the value to the list referenced by the key.

              So after reading file2, the map will contain
              tim -> chase
              tom -> jerry,someting
              and the key vinay is ignored.

              After you have read all the files, discard the map entries whose list is not of size N.
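
The steps in this reply could be sketched roughly as follows. This is only an illustration under stated assumptions: each key occurs at most once per file (otherwise a list of size N no longer proves the key is in every file), the lines are passed in as lists rather than read from disk, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class KeysInEveryFile {
    // files.get(0) holds the lines of file1, files.get(1) those of file2, etc.
    // Assumption: a key occurs at most once per file.
    static Map<String, List<String>> collect(List<List<String>> files) {
        Map<String, List<String>> map = new LinkedHashMap<>();
        if (files.isEmpty()) {
            return map;
        }
        // Step 1: populate the map from file1.
        for (String line : files.get(0)) {
            String[] kv = line.split(",", 2);
            if (kv.length == 2) {
                map.computeIfAbsent(kv[0], k -> new ArrayList<>()).add(kv[1]);
            }
        }
        // Step 2: for files 2..N, add values only for keys already in the map.
        for (int i = 1; i < files.size(); i++) {
            for (String line : files.get(i)) {
                String[] kv = line.split(",", 2);
                if (kv.length < 2) {
                    continue; // skip lines without a comma
                }
                List<String> values = map.get(kv[0]);
                if (values != null) {
                    values.add(kv[1]);
                }
            }
        }
        // Step 3: keep only the keys that collected one value per file.
        int n = files.size();
        map.values().removeIf(values -> values.size() != n);
        return map;
    }

    public static void main(String[] args) {
        List<List<String>> files = Arrays.asList(
            Arrays.asList("tim,chase", "tom,jerry"),
            Arrays.asList("tom,someting", "vinay,b"),
            Arrays.asList("tom,wright", "sachin,b"),
            Arrays.asList("chase,w", "tom,m"));
        System.out.println(collect(files));
    }
}
```

If keys can repeat within a file, tracking the set of file indexes per key instead of relying on the list size would avoid false positives.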
              • 4. Re: algorithm design
                814889
                I think this is quite incomplete. My intention is to get all common keys present in all 4 files. Suppose I ignore the vinay key as you mentioned; what if I add a 5th file tomorrow and it contains vinay? Then this will not work.
                • 5. Re: algorithm design
                  darke
                  Unless you change file1, file3 and file4 to include the key vinay, it won't be common to all files. Or I don't understand your requirement :)

                  What do you need:

                  Keys that occur in every file, i.e. common to all files?
                  OR
                  Keys that occur in more than 1 file?
                  • 6. Re: algorithm design
                    814889
                    Keys that occur in more than 1 file :)

                    Edited by: Vicky on Nov 13, 2012 10:15 PM
                    • 7. Re: algorithm design
                      darke
                      So don't ignore any pairs, and in the final step discard only the entries whose list is of size 1, i.e. keys occurring in only 1 file.
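
A rough sketch of this revised variant, again only as an illustration. It assumes that each key occurs at most once per file (so a list of size 1 means the key was seen in a single file) and that the lines are passed in as lists rather than read from disk; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class KeysInMultipleFiles {
    // Assumption: a key occurs at most once per file.
    static Map<String, List<String>> collect(List<List<String>> files) {
        Map<String, List<String>> map = new LinkedHashMap<>();
        // Record every pair from every file; nothing is ignored.
        for (List<String> file : files) {
            for (String line : file) {
                String[] kv = line.split(",", 2);
                if (kv.length == 2) {
                    map.computeIfAbsent(kv[0], k -> new ArrayList<>()).add(kv[1]);
                }
            }
        }
        // Final step: drop keys that were seen only once.
        map.values().removeIf(values -> values.size() < 2);
        return map;
    }

    public static void main(String[] args) {
        List<List<String>> files = Arrays.asList(
            Arrays.asList("tim,chase", "tom,jerry"),
            Arrays.asList("tom,someting", "vinay,b"),
            Arrays.asList("tom,wright", "sachin,b"),
            Arrays.asList("chase,w", "tom,m"));
        // Print one "key,value" line per surviving pair.
        collect(files).forEach((key, values) ->
            values.forEach(value -> System.out.println(key + "," + value)));
    }
}
```

Each line is visited once, and a fifth file that contains vinay would make that key appear in the result without any change to the code.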