27 Replies. Latest reply: Sep 19, 2007 11:07 AM by 796440

    Regex Issue

    807605
      Does regex optimization have any effect on memory?
      For example,
      if I use \d+ instead of [0-9]+, does it have any effect on memory?
        • 1. Re: Regex Issue
          800282
          ...
          If I use \d+ instead of [0-9]+, does it have any effect on memory?
          No.
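For context on the "No." above: the java.util.regex.Pattern documentation lists \d as the predefined class for a digit, [0-9], under the default flags. The following is only a minimal sketch to check that the two patterns accept and reject the same inputs; the class name DigitClassCheck and the sample strings are made up for illustration, and it says nothing about memory or speed by itself.

```java
import java.util.regex.Pattern;

class DigitClassCheck {

    // True when both patterns give the same whole-string match result for the input.
    static boolean sameResult(String input) {
        boolean viaRange = Pattern.matches("[0-9]+", input);
        boolean viaShorthand = Pattern.matches("\\d+", input);
        return viaRange == viaShorthand;
    }

    public static void main(String[] args) {
        // The last sample holds two Arabic-Indic digits, which are outside [0-9].
        String[] samples = { "12345", "12a", "", "\u0663\u0664" };
        for (String s : samples) {
            System.out.println("\"" + s + "\" same result: " + sameResult(s));
        }
    }
}
```

With default flags, every sample should report the same result for both patterns. A flag such as Pattern.UNICODE_CHARACTER_CLASS (Java 7 and later) changes what \d matches, so the equivalence is an assumption that holds only for the default configuration.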
          • 2. Re: Regex Issue
            800649
            I don't think the example you mentioned has any effect on memory usage.
            It may be faster, though.

            Anyway, use a profiler to check the memory usage of your Java applications:

            http://java-source.net/open-source/profilers
            http://www.manageability.org/blog/stuff/open-source-profilers-for-java
            • 3. Re: Regex Issue
              807605
              Yes.
              You need at least 3 bytes less to store \d+ than to store [0-9]+
              • 4. Re: Regex Issue
                807605
                Michael.Nazarov@sun.com wrote:
                Yes.
                You need at least 3 bytes less to store \d+ than to store [0-9]+
                I am asking which one is heavier and which is lighter for the Matcher while it is processing.
                • 5. Re: Regex Issue
                  807605
                  TIAS

                  [Design a test harness that will tell you.]
                  • 6. Re: Regex Issue
                    800282
                    I am asking which one is heavier and which is lighter for the Matcher while it is processing.
                    import java.util.regex.*;
                    
                    class Foo {
                        
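                        // Runs n passes of find() over a fixed sample string.
                        // The matcher is created once and is never reset() between passes.
                        private static void sillyTest(String regex, int n) {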
                            Matcher m = Pattern.compile(regex).
                                    matcher("A 12 B 34 C 56 A 12 B 34 C 56 A 12 B 34 C 56");
                            while(n-- > 0) while(m.find());
                        }
                        
                        public static void main(String[] args) {
                            String regex;
                            long start, end;
                            int n = 10000000;
                            
                            sillyTest("foo", n); // warm up run
                            
                            regex = "[0-9]+";
                            start = System.currentTimeMillis();
                            sillyTest(regex, n);
                            end = System.currentTimeMillis();
                            System.out.println(regex+" took ~"+(end-start)+" ms.");
                            
                            regex = "\\d+";
                            start = System.currentTimeMillis();
                            sillyTest(regex, n);
                            end = System.currentTimeMillis();
                            System.out.println(regex+" took ~"+(end-start)+" ms."); 
                        }
                    }
                    No significant difference.
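A variation on the harness above, offered only as a rough sketch and not as a rigorous benchmark. It assumes two changes are worthwhile: calling Matcher.reset() before every pass, so that each pass explicitly starts scanning from the beginning of the input, and timing with System.nanoTime(). The class name RegexTiming and the helpers countMatches and timeNanos are hypothetical names introduced here.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexTiming {

    // Same sample input as the test above.
    static final String INPUT = "A 12 B 34 C 56 A 12 B 34 C 56 A 12 B 34 C 56";

    // Counts all matches in one full pass over the matcher's input.
    static int countMatches(Matcher m) {
        m.reset(); // explicitly restart at the beginning of the input
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    // Returns the elapsed nanoseconds for `passes` full passes over INPUT.
    static long timeNanos(String regex, int passes) {
        Matcher m = Pattern.compile(regex).matcher(INPUT);
        long total = 0; // accumulate the counts so the loop has a visible result
        long start = System.nanoTime();
        for (int i = 0; i < passes; i++) {
            total += countMatches(m);
        }
        long elapsed = System.nanoTime() - start;
        if (total < 0) {
            System.out.println(total); // never true; only keeps `total` in use
        }
        return elapsed;
    }

    public static void main(String[] args) {
        int passes = 1000000;
        String[] regexes = { "[0-9]+", "\\d+" };
        for (String r : regexes) {
            timeNanos(r, passes / 10); // warm-up run for each regex
        }
        for (String r : regexes) {
            System.out.println(r + " took ~" + timeNanos(r, passes) / 1000000 + " ms.");
        }
    }
}
```

Absolute numbers will vary with the JVM and the machine, so only the relative difference between the two lines is of interest. For anything more serious than a quick check, a dedicated benchmarking tool would be a better choice than hand-rolled timing loops.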
                    • 7. Re: Regex Issue
                      807605
                      Thanks for this brief explanation
                      • 8. Re: Regex Issue
                        800282
                        Thanks for this brief explanation
                        You're welcome.
                        • 9. Re: Regex Issue
                          807605
                          As brief as it is incorrect :)

                          C:\Work\o>java Foo
                          ([0-9]+[A-Z]+[0-9]{1,6})|([0-9]{1,2}AB*[0-9]?) took ~1859 ms.
                          \d+ took ~1859 ms.

                          Let me guess: ANY expression takes the same time.
                          Does anyone agree? Why? :)
                          • 10. Re: Regex Issue
                            791266
                            Michael.Nazarov@sun.com wrote:
                            As brief as it is incorrect :)

                            C:\Work\o>java Foo
                            ([0-9]+[A-Z]+[0-9]{1,6})|([0-9]{1,2}AB*[0-9]?) took ~1859 ms.
                            \d+ took ~1859 ms.

                            Let me guess: ANY expression takes the same time.
                            No
                            Does anyone agree?
                            I don't
                            Why? :)
                            Regexp1: A*1
                            Regexp2: \d+
                            Gives

                            A*1 took ~3594 ms.
                            \d+ took ~313 ms.


                            Kaj

                            Ps. Crappy forum software, it alters my post

                            Edited by: kajbj on Sep 19, 2007 4:56 PM
                            • 11. Re: Regex Issue
                              800282
                              As brief as it is incorrect :)
                              I'm not sure I follow you. I agree that my test is not a very precise one, but (AFAIK) it does show that [0-9] and \d are practically the same performance-wise.
                              This is what I get when running it a couple of times:
                              [0-9]+ took ~813 ms.
                              \d+ took ~796 ms.
                              
                              [0-9]+ took ~812 ms.
                              \d+ took ~797 ms.
                              
                              [0-9]+ took ~797 ms.
                              \d+ took ~781 ms.
                              • 12. Re: Regex Issue
                                807605
                                "A*1" really does take a long time, but the two expressions I provided run in the same time.
                                • 13. Re: Regex Issue
                                  807605
                                  Your test also shows the same time for two completely different expressions.
                                  • 14. Re: Regex Issue
                                    800282
                                    ... but the two expressions I provided run in the same time.
                                    That was just a coincidence then.
                                    ([0-9]+[A-Z]+[0-9]{1,6})|([0-9]{1,2}AB*[0-9]?) took ~813 ms.
                                    \d+ took ~797 ms.
                                    
                                    ([0-9]+[A-Z]+[0-9]{1,6})|([0-9]{1,2}AB*[0-9]?) took ~859 ms.
                                    \d+ took ~844 ms.
                                    
                                    ([0-9]+[A-Z]+[0-9]{1,6})|([0-9]{1,2}AB*[0-9]?) took ~797 ms.
                                    \d+ took ~781 ms.