5 Replies Latest reply: Feb 19, 2010 6:04 AM by 843853 RSS

    Psuedo code to java, Finding a pattern in a suffix array?

      Hey I hope this is the right place for this....

      I found the following psuedo code online, dealing with searching for a pattern in a suffix array:
      find(Pattern P in SuffixArray A):
        i = 0
        lo = 0, hi = length(A)
        for 0<=i<length(P):
          Binary search for x,y
          where P=S[A[j]+i] for lo<=x<=j<y<=hi
      lo = x, hi = y
      return {A[lo],A[lo+1],...,A[hi-1]}
      and I am trying to get relocate it to java, but im slighty stuck.  Not I can work out the simple stuff, and I have a binarysearch method, top example here (http://leepoint.net/notes-java/algorithms/searching/binarysearch.html)
      If possible can someone walk me through the steps in the psuedo code in the for loop?                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        
        • 1. Re: Psuedo code to java, Finding a pattern in a suffix array?
          satory wrote:
          Not I can work out the simple stuff, and I have a binarysearch method
          If possible can someone walk me through the steps in the psuedo code in the for loop?
          If you can work out some of the "stuff", what parts can you not work out? Where did you get stuck?
          • 2. Re: Psuedo code to java, Finding a pattern in a suffix array?
            Sorry I actually should of pointed out the bits I needed help for...

            This is the line that has caught me:
            where P=S[A[j]+i] for lo<=x<=j<y<=hi
            • 3. Re: Psuedo code to java, Finding a pattern in a suffix array?
              Looks like the code is from:
              Do you understand how the 'Mississippi' example works without thinking about Java code?

              Do you have code to create the suffix array? You'll need that first.
              where P=S[A[j]+i] for lo<=x<=j<y<=hi
              This just says find x and y where this condition is true and implies that x and y are as far apart as possible.
              You are looking for suffixes in the suffix array between lo and hi where the i'th character of the suffix is equal to the i'th character of the pattern P. x is the smallest index and y is the largest index where this is true. Any j between x and y will also satisfy this condition.
              You could try implementing this part with a single for loop (check the condition for each j between lo and hi) first, then add a binary search.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            
              • 4. Re: Psuedo code to java, Finding a pattern in a suffix array?
                Yeah I have the code to create a sorted suffix array...

                Okay with regards to the Mississippi example I understand what it returns what it does, though the working escapes me.

                I have actually tried to write down on the steps in the mississippi example but whats x and y? I actually though I was okay with that line as it was the next one that baffled me more, as I start to think about it I have no idea what x or y is.

                In the example how does he/she work out that i only appears in a mississippi suffix array 3 times? I could probably get away with running a loop through the array looking at the first char in the string and find the hi/lo values there, ie 0 and 3 in the example, but thats not great, is it? especially in regards to big suffix arrays?
                • 5. Re: Psuedo code to java, Finding a pattern in a suffix array?
                  If i occurs in indices from 0 to 3, then x is 0 and y is 3.

                  Instead of checking each suffix, you can exclude half the remaining possibilities using a binary search. Instead of looking for a whole string like your binary search example though, you are looking for the lowest or highest index where a character appears.

                  If i occurs as a first character between 0 and 3, you can find this by considering only the following indexes: