11 Replies Latest reply on Feb 20, 2009 5:51 PM by 843785

    counting consecutive strings

    843785
      Here's the thing:
      I have a list of words, followed by "/" and its type. So that looks like "work/vb". I'm only interested in the type, so I split on "/", and keep the "vb". If I now feed the program a text (words followed by their type), all I keep is something like: "vb-sd-ap-n-adj" etc. (the "-" isn't in there, that's just for readability here.)

      Now I want to count the total number of occurrences of every particular type, so for the line above, the output should be vb:1, sd:1, ap:1.....
      And then I want to count the occurrences of every transition. So say I have three types, 1, 2 and 3, I want to count the occurrences of: (1-1, 1-2, 1-3)(2-1, 2-2, 2-3)(3-1, 3-2, 3-3). Problem is I have a little more than three types; there are 26. It's a nasty and highly inefficient job if I have to write out all the possibilities (26*26).
      Does anyone have any ideas on how to do this in a generative way, with a loop or something? I could imagine a for loop comes in handy; I want a loop to go through all the consecutive strings and count every type, and then I want to count every transition from X to Z, where X can be any type and Z can be any type.
      Would be really cool if someone can tell me how to do this in an efficient way!

      cheers,
        • 1. Re: counting consecutive strings
          800282
          Igor_Pavlove wrote:
          Here's the thing:
          I have a list of words, followed by "/" and its type. So that looks like "work/vb". I'm only interested in the type, so I split on "/", and keep the "vb". If I now feed the program a text (words followed by their type), all I keep is something like: "vb-sd-ap-n-adj" etc. (the "-" isn't in there, that's just for readability here.)

          Now I want to count the total number of occurrences of every particular type, so for the line above, the output should be vb:1, sd:1, ap:1.....
          Have a look at the java.util.Map interface.
          Its implementations store a KEY (your type) together with a VALUE (the number of occurrences of that type).
          And then I want to count the occurrences of every transition. So say I have three types, 1, 2 and 3, I want to count the occurrences of: (1-1, 1-2, 1-3)(2-1, 2-2, 2-3)(3-1, 3-2, 3-3). Problem is I have a little more than three types; there are 26. It's a nasty and highly inefficient job if I have to write out all the possibilities (26*26).
          A double for-statement will do:
          int[] types = ...
          for(int i = 0; i < types.length; i++) {
            for(int j = 0; j < types.length; j++) {
              System.out.println("Combination: "+types[i]+" "+types[j]);
            }
          }
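          The double loop above lists every possible combination, but counting the transitions that actually occur needs only one pass over the sequence, since each adjacent pair is one transition. A minimal sketch, assuming the types have already been extracted into a list; the class name TransitionCounter and the "from-to" key format are made up for illustration:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class TransitionCounter {

    // Counts each adjacent pair (previous type, next type) in the sequence.
    // Pairs that never occur are simply absent from the map.
    // Assumption: type names do not contain "-" themselves.
    static Map<String, Integer> countTransitions(List<String> types) {
        Map<String, Integer> counts = new TreeMap<String, Integer>();
        for (int i = 1; i < types.size(); i++) {
            String key = types.get(i - 1) + "-" + types.get(i);
            Integer old = counts.get(key);
            counts.put(key, old == null ? 1 : old + 1);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Types taken from "the/at dog/nn sat/v on/in the/at porch/nn"
        List<String> types = Arrays.asList("at", "nn", "v", "in", "at", "nn");
        System.out.println(countTransitions(types));
    }
}
```

          For the sample sequence, the only transition seen twice is at-nn; combinations such as nn-at never appear in the map, which is why none of the 26*26 cases has to be written out by hand.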
          • 2. Re: counting consecutive strings
            800308
            I'd just use the brute force method (as you're doing) and an int[][] matrix.

            26 * 26 = 676 is pretty tiny in modern computing terms.
               FROM               10                  20
              0 1 2 3 4 5 6 7 8 9   1 2 3 4 5 6 7 8 9   1 2 3 4 5
            T 1
            O 2
              3                                           2
              4
              5
              6
              7
              8
              9
            1
            0 1
              2
              3
              4           1
              5
              6
              7
              8
              9
            2
            0 1
              2
              3
              4
              5
              6
            
            From 6 to 14 occurred once.
            From 22 to 3 occurred twice.
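            The matrix idea above could be sketched roughly like this. The four-entry type list is a stand-in for the real 26 tags, the class and method names are invented for the example, and the array is indexed as counts[from][to]:

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class TransitionMatrix {

    // Builds counts[from][to] for a sequence of types.
    // 'allTypes' fixes the row/column order; every type in 'sequence'
    // is assumed to appear in it (an unknown type would throw here).
    static int[][] build(List<String> allTypes, List<String> sequence) {
        Map<String, Integer> index = new HashMap<String, Integer>();
        for (int i = 0; i < allTypes.size(); i++) {
            index.put(allTypes.get(i), i);
        }
        int[][] counts = new int[allTypes.size()][allTypes.size()];
        for (int i = 1; i < sequence.size(); i++) {
            int from = index.get(sequence.get(i - 1));
            int to = index.get(sequence.get(i));
            counts[from][to]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> allTypes = Arrays.asList("at", "nn", "v", "in");
        List<String> sequence = Arrays.asList("at", "nn", "v", "in", "at", "nn");
        int[][] counts = build(allTypes, sequence);
        // Print only the transitions that actually occurred.
        for (int from = 0; from < counts.length; from++) {
            for (int to = 0; to < counts.length; to++) {
                if (counts[from][to] > 0) {
                    System.out.println(allTypes.get(from) + " -> " + allTypes.get(to)
                            + ": " + counts[from][to]);
                }
            }
        }
    }
}
```

            The lookup map only translates a type name into a row/column number; the counting itself is a single increment per adjacent pair.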
            • 3. Re: counting consecutive strings
              843785
              Ok, thanks for the directions. Trying to figure it out, but need some more help.
              At the moment I've scraped this together:

              if (words.contains(" ")) {
                  String[] array = words.split(" ");//delete any whitespace

                  for (int i = 0; i < array.length; i++) {
                      jenna = array[i];//jenna now holds "word/type"
                      String hugh = null;

                      if (jenna.contains("/")) {
                          String[] anotherArray = jenna.split("\\/");//delete the "word/"
                          String jameson = anotherArray[1];
                          Set<String> types = Collections.unmodifiableSet(new HashSet<String>(Arrays.asList(anotherArray[1])));// types now should be a list holding all types in chrono order??

                          int k = 0;
                          int l = 0;
                          for (int j = 0; j < types.size(); j++) {
                              while ((k = jameson.indexOf(jameson, k)) != -1) {//cycle through types, if string equals itself, go on and keep track of the count
                                  ++k;
                                  ++l;
                                  type.put(jameson, Integer.toString(l));//map the string to its count
                              }

                              hugh = type.get(jameson);//get the count of the string
                          }

                          textArea.append(jameson + "-" + hugh + newline);
                          textField.selectAll();
                      }
                  }
              }
              I'd say this is kind of what I should do, but it gives me for the sample text the/at dog/nn sat/v on/in the/at porch/nn: 
              at-1
              nn-1
              v-1
              in-1
              at-1
              nn-1.
              
              Seems like I'm close, but as I said, need a little more help with it.
              • 4. Re: counting consecutive strings
                800282
                What input are you using? And what output did you expect?
                • 5. Re: counting consecutive strings
                  843785
                  input: the/at dog/nn sat/v on/in the/at porch/nn.

                  output:
                  at-2
                  nn-2
                  v-1
                  in-1
                  • 6. Re: counting consecutive strings
                    800282
                    Igor_Pavlove wrote:
                    input: the/at dog/nn sat/v on/in the/at porch/nn.

                    output:
                    at-2
                    nn-2
                    v-1
                    in-1
                    Okay, like I said: use a Map for this.
                    import java.util.Map;
                    import java.util.TreeMap;

                    public class Test {

                        // Counts how often each type (the part after "/") occurs in the input.
                        private static Map<String, Integer> getMap(String input) {
                            Map<String, Integer> map = new TreeMap<String, Integer>();
                            for(String s: input.trim().split("\\s+")) {
                                String[] parts = s.split("/");
                                if(parts.length < 2) continue; // no type after "/": skip instead of throwing an AIOOBE
                                s = parts[1];
                                Integer oldValue = map.remove(s);
                                map.put(s, oldValue == null ? 1 : oldValue+1);
                            }
                            return map;
                        }

                        public static void main(String[] arg) {
                            String input = "the/at dog/nn sat/v on/in the/at porch/nn";
                            Map<String, Integer> map = getMap(input);
                            System.out.println("map = "+map);
                        }
                    }
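                    On Java 8 or later (newer than this thread), the remove-then-put step can be collapsed into a single Map.merge call. A sketch of the same counting method under that assumption; the class name MergeCount is invented here:

```java
import java.util.Map;
import java.util.TreeMap;

class MergeCount {

    static Map<String, Integer> getMap(String input) {
        Map<String, Integer> map = new TreeMap<>();
        for (String s : input.trim().split("\\s+")) {
            String[] parts = s.split("/");
            if (parts.length < 2) {
                continue; // token has no "/type" part, skip it
            }
            // Stores 1 if the key is absent, otherwise adds 1 to the stored count.
            map.merge(parts[1], 1, Integer::sum);
        }
        return map;
    }

    public static void main(String[] args) {
        System.out.println(getMap("the/at dog/nn sat/v on/in the/at porch/nn"));
    }
}
```

                    Because a TreeMap keeps its keys sorted, the printed order is alphabetical by type rather than the order of first appearance.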
                    • 7. Re: counting consecutive strings
                      843785
                      Aight, thanks a lot!
                      • 8. Re: counting consecutive strings
                        843785
                        Sorry that the questions keep coming, but what if I'd like to put everything into one method instead of two? I think, in my case, it's better to solve the whole thing locally instead of calling a separate method from the main method.

                        So I've tried editing it a bit, but Eclipse comes up with an error I don't understand. What I've got at the moment:
                        for (int i = 0; i < array.length; i++) {
                            jenna = array[i];

                            Map<String, Integer> map = new TreeMap<String, Integer>();
                            if (jenna.contains("/")) {
                                String[] anotherArray = jenna.split("\\/");
                                String jameson = anotherArray[1];
                                Integer oldValue = map.remove(jameson);
                                map.put(jameson, oldValue == null ? 1 : oldValue+1);

                                Map<String, Integer> mapped = map(jameson);// here it says: The method map(String) is undefined for the type myLittleProject

                                textArea.append("map = " + map + newline);
                            }
                        }
                        Why is the method map(String) undefined for my class? And can I solve this in some way?
                        • 9. Re: counting consecutive strings
                          800282
                          Igor_Pavlove wrote:
                          Sorry for the questions keep coming up, but what if I'd like to put the stuff into one method instead of two? I think, in my case, it's better to solve the whole stuff locally, and not call the particular method in the main method.
                          Not sure what you mean exactly, but one method should be responsible for just one particular task. It shouldn't do two things.
                          So I've tried editing it a bit, but eclipse comes up with an error I don't understand. What I've got at the moment:
                          for (int i = 0; i < array.length; i++) {
                              jenna = array[i];

                              Map<String, Integer> map = new TreeMap<String, Integer>();
                              if (jenna.contains("/")) {
                                  String[] anotherArray = jenna.split("\\/");
                          You don't need to escape the "/".
                                  String jameson = anotherArray[1];
                                  Integer oldValue = map.remove(jameson);
                                  map.put(jameson, oldValue == null ? 1 : oldValue+1);

                                  Map<String, Integer> mapped = map(jameson);// here it says: The method map(String) is undefined for the type myLittleProject
                          Can you post this map(...) method as well?
                          • 10. Re: counting consecutive strings
                            843785
                            I'm trying to figure out exactly which part does what, so far I've understood that
                            Map<String, Integer> map = new TreeMap<String, Integer>(); // this creates the Map named "map"
                            if (jenna.contains("/")) { // check if the input string contains "/"
                                String[] anotherArray = jenna.split("/"); // split it on "/"
                                String jameson = anotherArray[1]; // take the second part of the array, could be rewritten as String a = jenna.split("/")[1] if I'm right
                                Integer oldValue = map.remove(jameson); // creates the int OldValue and removes the mapping for this key from the map if it is present
                                map.put(jameson, oldValue == null ? 1 : oldValue+1); // puts the string jameson and the count in the map, if count is zero, make it one, else, add 1 to the counting

                                Map<String, Integer> mapped = map(jameson); // this gets the count for a particular string (jameson)

                                textArea.append("map = " + mapped + newline); // printing
                            Am I correct?

                            And the thing is, there is no map(....) method. (which makes it likely that it's not defined...) But why do I need to do that in a separate method? I'm quite confused with that.

                            And thanks anyway so far man, you're really helping me out.
                            • 11. Re: counting consecutive strings
                              843785
                              i.e., I thought the map(jameson) in the line
                              Map<String, Integer> mapped = map(jameson);
                              would work kind of like a map.get(jameson) call, but after messing around with .get a little, it obviously doesn't. That's why I thought it didn't need a separate method...
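                              For what it's worth, the compiler reads map(jameson) as a call to a method named map, and the class has no such method; map is a variable, so looking up a count on it is written map.get(jameson). The other catch in the loop posted earlier is that the Map is created inside the for loop, so every word starts from an empty map and every count comes out as 1. A sketch of a single-method version with those two points changed; the class and method names are hypothetical, and the result is returned as a String instead of being appended to the textArea:

```java
import java.util.Map;
import java.util.TreeMap;

class InlineCount {

    static String report(String words) {
        // Created once, before the loop, so counts accumulate across words.
        Map<String, Integer> map = new TreeMap<String, Integer>();
        String[] array = words.trim().split("\\s+");
        for (int i = 0; i < array.length; i++) {
            String jenna = array[i];
            if (jenna.contains("/")) {
                String[] parts = jenna.split("/");
                if (parts.length < 2) {
                    continue; // "word/" with nothing after the slash
                }
                String jameson = parts[1];
                Integer oldValue = map.get(jameson); // look up the current count
                map.put(jameson, oldValue == null ? 1 : oldValue + 1);
            }
        }
        // Build one "type-count" line per type once all words have been seen.
        StringBuilder out = new StringBuilder();
        for (Map.Entry<String, Integer> e : map.entrySet()) {
            out.append(e.getKey()).append("-").append(e.getValue()).append("\n");
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(report("the/at dog/nn sat/v on/in the/at porch/nn"));
    }
}
```

                              Building the output after the loop, rather than appending inside it, is what gives one line per type with its final count.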