10 Replies Latest reply: Jan 22, 2009 7:43 PM by 807588

    Regular Expression issue

    807588
      Hi all,

      I'm debugging a portion of code that I didn't write, in an application that sometimes crashes with an OutOfMemoryError.

      I think I have already found what's causing it. During the execution of a method it uses this regex:
      regex = "(\\{|\\}|\\[|\\]|\\~|\\||\\\\|\\?);";
      which doesn't do what it was supposed to do. So I replaced it with the correct one:
      regex = "\\{|\\}|\\[|\\]|\\~|\\||\\\\|\\?";
      and tested the method, and bang... OutOfMemoryError. Given this, I just need an example string that the first regex matches, in code like this:
      int found = 0;
      String regex = "(\\{|\\}|\\[|\\]|\\~|\\||\\\\|\\?);";
      Pattern p = Pattern.compile(regex);
      Matcher m  = p.matcher(message);
      
      //found should be != 0
      while(m.find())     { found++; }
      Many thanks in advance!
      Andre
        • 1. Re: Regular Expression issue
          796440
          First, what exactly are you trying to do with that regex?
          • 2. Re: Regular Expression issue
            796440
            If my eyes do not deceive me,
            "(\\{|\\}|\\[|\\]|\\~|\\||\\\\|\\?);"
            can be replaced by
            "[\\]\\[{}~|\\\\?];"
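The claim above, that the alternation and the character class are interchangeable, can be checked with a small sketch. The class name, the helper `count`, and the sample strings are made up for illustration; only the two patterns come from the thread.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EquivalentPatterns {
    // The original alternation and the character-class form from the thread.
    static final String ALTERNATION = "(\\{|\\}|\\[|\\]|\\~|\\||\\\\|\\?);";
    static final String CHAR_CLASS = "[\\]\\[{}~|\\\\?];";

    // Counts non-overlapping matches of regex in message.
    static int count(String regex, String message) {
        Matcher m = Pattern.compile(regex).matcher(message);
        int found = 0;
        while (m.find()) {
            found++;
        }
        return found;
    }

    public static void main(String[] args) {
        // Sample inputs are invented for this demo.
        String[] samples = { "a{;b];c", "no specials", "{}[]", "\\;?;~;|;" };
        for (String s : samples) {
            System.out.printf("%-14s alternation=%d charClass=%d%n",
                    s, count(ALTERNATION, s), count(CHAR_CLASS, s));
        }
    }
}
```

Both patterns should report the same count for every sample. Note that "{}[]" gives 0 with either form, because none of its characters is followed by a semicolon.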
            • 3. Re: Regular Expression issue
              807588
              I didn't write this regex, but its intent is to match any occurrence of one of the following characters: { } [ ] ~ | \ ? but it doesn't match any of those.

              The regex
              "(\\{|\\}|\\[|\\]|\\~|\\||\\\\|\\?);"
              is defined in a properties file, so my guess is that whoever wrote it probably copied and pasted not only the regex but also the surrounding (...); and the property became
              # Properties file app.properties
              regex=(\\{|\\}|\\[|\\]|\\~|\\||\\\\|\\?);
              Now I know that the regex is incorrect and does not serve its purpose, but when it does match something it causes an OutOfMemoryError, due to a horrible loop in the code. I already have the solution to this problem, but I need to prove that the regex was causing it by making it match something, and I have no idea what could possibly match the regex.
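As a rough illustration of the properties-file point above: `java.util.Properties` treats a backslash as an escape character, so the doubled backslashes in the file collapse to single ones on load, and the stray parentheses and semicolon become part of the pattern. The sketch below reads the property line from an in-memory string instead of app.properties; the class name and the helper `loadRegex` are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

class PropertyRegexDemo {
    // Java source doubles every backslash again, so this literal holds the
    // file line from the post: regex=(\\{|\\}|\\[|\\]|\\~|\\||\\\\|\\?);
    static final String FILE_TEXT =
            "regex=(\\\\{|\\\\}|\\\\[|\\\\]|\\\\~|\\\\||\\\\\\\\|\\\\?);\n";

    // Loads the "regex" key from properties-file text.
    static String loadRegex(String fileText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(fileText));
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return props.getProperty("regex");
    }

    public static void main(String[] args) {
        String regex = loadRegex(FILE_TEXT);
        System.out.println(regex);               // the pattern as the regex engine sees it
        System.out.println("];".matches(regex)); // bracket followed by semicolon
        System.out.println("]".matches(regex));  // bracket alone
    }
}
```

The loaded value is the same pattern as the Java literal quoted in the thread, so the trailing semicolon from the file is required in the input for any match.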
              • 4. Re: Regular Expression issue
                796440
                alopes wrote:
                Now I know that the regex is incorrect and does not serve its purpose, but when it does match something it causes an OutOfMemoryError, due to a horrible loop in the code.
                The regex I provided will match any of "{}[]~|?\" followed by a semicolon. So, for instance,
                "[;".matches(regex)
                will be true.

                That regex should NOT be causing OOME.
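One detail worth keeping in mind when reproducing this: `String.matches` requires the whole string to match, while the original code uses `Matcher.find`, which also succeeds on a substring. A minimal sketch of the difference, with a made-up class name and made-up sample strings:

```java
import java.util.regex.Pattern;

class MatchesVersusFind {
    // The original pattern from the first post.
    static final Pattern P =
            Pattern.compile("(\\{|\\}|\\[|\\]|\\~|\\||\\\\|\\?);");

    // True only if the entire string is one special character plus ';'.
    static boolean wholeStringMatches(String s) {
        return P.matcher(s).matches();
    }

    // True if the pattern occurs anywhere inside the string.
    static boolean containsMatch(String s) {
        return P.matcher(s).find();
    }

    public static void main(String[] args) {
        for (String s : new String[] { "[;", "x[;y", "x[y" }) {
            System.out.printf("%-5s matches=%b find=%b%n",
                    s, wholeStringMatches(s), containsMatch(s));
        }
    }
}
```

So a test message does not have to be exactly "[;". Any text containing one of the special characters immediately followed by a semicolon should make `found` non-zero in the original loop.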
                • 5. Re: Regular Expression issue
                  807588
                  No, the regex is not causing the OOME directly, but in the code
                  int found = 0;
                  String regex = "(\\{|\\}|\\[|\\]|\\~|\\||\\\\|\\?);";
                  Pattern p = Pattern.compile(regex);
                  Matcher m  = p.matcher(message);
                   
                  //found should be != 0
                  while(m.find())     { found++; }
                  when the value of the variable found is greater than 0, a loop runs indefinitely, and on each iteration it adds a string to an ArrayList, which causes the OOME :)

                  I will now test your example :)
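The application's loop is not shown in the thread, so the following is only a guess at the general shape of the failure described above: a loop whose condition depends on `found` while nothing in the body changes it. All names are hypothetical, and a cap stands in for running out of heap so the demo terminates.

```java
import java.util.ArrayList;
import java.util.List;

class RunawayLoopSketch {
    // Hypothetical buggy shape: `found` never changes inside the loop, so
    // for found > 0 the list would grow until memory runs out.
    static int buggyFill(int found, int cap) {
        List<String> parts = new ArrayList<>();
        while (found > 0) {
            parts.add("segment");
            if (parts.size() >= cap) {
                break; // safety cap for the demo only; the real code had none
            }
        }
        return parts.size();
    }

    // One possible bounded version (not necessarily the poster's fix):
    // add exactly one entry per counted match.
    static int boundedFill(int found) {
        List<String> parts = new ArrayList<>();
        for (int i = 0; i < found; i++) {
            parts.add("segment");
        }
        return parts.size();
    }

    public static void main(String[] args) {
        System.out.println(buggyFill(0, 1000));  // loop never entered
        System.out.println(buggyFill(2, 1000));  // runs until the cap
        System.out.println(boundedFill(2));      // one entry per match
    }
}
```

This would also explain why the bug stayed hidden: with the original regex `found` was almost always 0, so the loop was rarely entered.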
                  • 6. Re: Regular Expression issue
                    807588
                    You were correct,
                    "];".matches(regex) = true
                    Many thanks for your help!
                    • 7. Re: Regular Expression issue
                      796440
                      So what are you actually trying to accomplish? Determine how many "[], etc." chars are in the string, and then add something to a list that many times?

                      Is the problem that your regex code is giving you the wrong count?

                      Or is it giving you the right count but then the list building code is blowing up?

                      You'll have much better luck getting help if you clarify what problem you're trying to solve and what the exact difficulties are that you're encountering.
                      • 8. Re: Regular Expression issue
                        807588
                        jverd wrote:
                        If my eyes do not deceive me,
                        "(\\{|\\}|\\[|\\]|\\~|\\||\\\\|\\?);"
                        can be replaced by
                        "[\\]\\[{}~|\\\\?];"
                        :crossedeyes:

                        When quoting, I like to use \Q...\E, which is encapsulated in Pattern.quote:
                        import java.util.regex.*;
                        
                        public class Example {
                            public static void main(String[] args) {
                                String list = "{}[]~|?\\";
                                String regex = "["+ Pattern.quote(list) + "];";
                        
                                //hits
                                test(list, regex);
                                
                                System.out.println();
                                
                                //misses
                                test("a2.()!<>", regex);
                            }
                        
                            static void test(String list, String regex) {
                                for(char c : list.toCharArray()) {
                                    String s  = c + ";";
                                    System.out.format("\"%s\".matches(regex)=%b%n", s, s.matches(regex));
                                }
                            }
                        }
                        • 9. Re: Regular Expression issue
                          796440
                          BigDaddyLoveHandles wrote:
                          When quoting, I like to use \Q...\E, which is encapsulated in Pattern.quote:
                          True, that'd make it a bit more readable.
                          • 10. Re: Regular Expression issue
                            807588
                            jverd wrote:
                            So what are you actually trying to accomplish? Determine how many "[], etc." chars are in the string, and then add something to a list that many times?
                            Yes, it is used to count extended characters in an SMS.
                            jverd wrote:
                            Is the problem that your regex code is giving you the wrong count?
                            The wrong regex was giving the wrong count. Most of the time it didn't match anything.

                            jverd wrote:
                            Or is it giving you the right count but then the list building code is blowing up?

                            You'll have much better luck getting help if you clarify what problem you're trying to solve and what the exact difficulties are that you're encountering.
                            Thanks for your interest in the problem, but the only problem I had was understanding the regex, so that I could reproduce the OOME in the application and my boss would believe me. The fix for the OOME has already been made and tested, and it's working just fine!

                            Once again, many thanks, and God, Ala, etc. save Java and Sun's forum :)
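For completeness, here is a minimal sketch of the counting that the corrected regex from the first post is meant to do. The character set is taken as given in the thread; the class name, the method name, and the sample messages are assumptions for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ExtendedCharCounter {
    // Corrected pattern from the first post: no group, no trailing semicolon.
    // Compiled once, since the pattern never changes.
    static final Pattern EXTENDED =
            Pattern.compile("\\{|\\}|\\[|\\]|\\~|\\||\\\\|\\?");

    // Counts every occurrence of one of the listed characters.
    static int countExtended(String message) {
        Matcher m = EXTENDED.matcher(message);
        int found = 0;
        while (m.find()) {
            found++;
        }
        return found;
    }

    public static void main(String[] args) {
        // Sample messages are invented for this demo.
        System.out.println(countExtended("Price [EUR] {10|20}?")); // [ ] { | } ?
        System.out.println(countExtended("hello"));
        System.out.println(countExtended("a\\b"));                 // one backslash
    }
}
```

With this pattern a message such as "{}[]" counts 4 characters, where the original pattern with the trailing semicolon counted none.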