15 Replies Latest reply: Sep 25, 2009 5:46 PM by jschellSomeoneStoleMyAlias

    RegEx String Negation

    807580
      Hello,

      I'm trying to find a regular expression that will match any file URI in the WEB-INF/lib directory and the jars within it except a couple of specific jars, let's say a.jar and b.jar.

      For instance, I'd like the expression to match WEB-INF/lib/c.jar/myClass.class and WEB-INF/lib/d.jar/myClass.class, but not WEB-INF/lib/a.jar or WEB-INF/lib/b.jar themselves, nor any file inside them.

      From what I understand so far, it would be ideal if I could somehow group the literal strings "a.jar" and "b.jar" into two distinct entities, E1 and E2, and use negation brackets to the effect of WEB-INF/lib/[^E1E2].*, which would match everything that begins with WEB-INF/lib/ except those strings directly followed by E1 or E2 (a.jar and b.jar respectively).

      Is there a way to do this?
        • 1. Re: RegEx String Negation
          jschellSomeoneStoleMyAlias
          That is a directory which you are going to have to span (iterate over) regardless.

          And you have a list of files that you don't want to match?

          So...
          1. Create a match for those files. (positive not negative)
          2. Loop (span) and for each file, if the match fails then you have a valid file.
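A rough sketch of those two steps, assuming the file names are already available as plain strings. The class and method names here are made up for illustration and are not from the thread:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

class ExcludeByPositiveMatch {
    // Step 1: a positive pattern that matches exactly the names to reject.
    private static final Pattern EXCLUDED = Pattern.compile("a\\.jar|b\\.jar");

    // Step 2: loop over the names and keep each one the pattern does NOT match.
    static List<String> keepNonExcluded(List<String> names) {
        List<String> kept = new ArrayList<>();
        for (String name : names) {
            // matches() tests the whole name, so no ^ or $ is needed.
            if (!EXCLUDED.matcher(name).matches()) {
                kept.add(name);
            }
        }
        return kept;
    }
}
```

Because the pattern is positive, the negation lives in ordinary Java (the `!`) rather than inside the regex.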
          • 2. Re: RegEx String Negation
            807580
            I don't understand. What do you mean by WEB-INF/lib/c.jar/myClass.class? Do you mean you want to look inside c.jar for the class myClass? If so, then regex is not going to help you here.
            • 3. Re: RegEx String Negation
              807580
              jschell wrote:
              So...
              1. Create a match for those files. (positive not negative)
              2. Loop (span) and for each file, if the match fails then you have a valid file.
              There is no need for this since one can use 'negative lookahead' to exclude files from the match.
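For readers new to the construct, here is a minimal sketch of what a negative lookahead does. The class and method names are hypothetical; the pattern excludes exactly the two names from the question:

```java
import java.util.regex.Pattern;

class NegativeLookaheadDemo {
    // (?!X) succeeds only when X does NOT match at the current position,
    // and it consumes no input. The trailing .* then matches the whole name.
    private static final Pattern NOT_A_OR_B =
            Pattern.compile("(?!(a\\.jar|b\\.jar)$).*");

    static boolean accepted(String name) {
        return NOT_A_OR_B.matcher(name).matches();
    }
}
```

With this pattern, `accepted("c.jar")` and `accepted("a.jar.bak")` are true, while `accepted("a.jar")` and `accepted("b.jar")` are false.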
              • 4. Re: RegEx String Negation
                jschellSomeoneStoleMyAlias
                sabre150 wrote:
                jschell wrote:
                So...
                1. Create a match for those files. (positive not negative)
                2. Loop (span) and for each file, if the match fails then you have a valid file.
                There is no need for this since one can use 'negative lookahead' to exclude files from the match.
                What Java API takes a regex and returns a list of files?
                • 5. Re: RegEx String Negation
                  807580
                  You can use FilenameFilter for that.
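The JDK interface is spelled `java.io.FilenameFilter`. A minimal sketch of using it for this exclusion, with a hypothetical class name:

```java
import java.io.File;
import java.io.FilenameFilter;

class JarNameFilter implements FilenameFilter {
    // accept() receives the parent directory and the bare file name.
    @Override
    public boolean accept(File dir, String name) {
        // Positive match on the excluded names, negated in Java.
        return !name.matches("a\\.jar|b\\.jar");
    }
}
```

It would be passed to `File.list(FilenameFilter)` or `File.listFiles(FilenameFilter)`, both of which return null when the path is not a readable directory.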
                  • 6. Re: RegEx String Negation
                    807580
                    jschell wrote:
                    sabre150 wrote:
                    jschell wrote:
                    So...
                    1. Create a match for those files. (positive not negative)
                    2. Loop (span) and for each file, if the match fails then you have a valid file.
                    There is no need for this since one can use 'negative lookahead' to exclude files from the match.
                    What Java API takes a regex and returns a list of files?
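                            import java.io.File;
                            import java.io.FileFilter;

                            // Accept every file except those named exactly a.jar or b.jar.
                            FileFilter filter = new FileFilter()
                            {
                                public boolean accept(File pathname)
                                {
                                    String name = pathname.getName();
                                    // The negative lookahead rejects the two names; .* accepts the rest.
                                    return name.matches("(?!(a\\.jar|b\\.jar)$).*");
                                }
                            };
                    
                            // listFiles() returns null if the path is not a readable directory.
                            File[] children = new File("/home/sabre").listFiles(filter);
                            if (children != null)
                            {
                                for (File child : children)
                                {
                                    System.out.println(child);
                                }
                            }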
                    • 7. Re: RegEx String Negation
                      807580
                      sabre150 wrote:
                      jschell wrote:
                      sabre150 wrote:
                      jschell wrote:
                      So...
                      1. Create a match for those files. (positive not negative)
                      2. Loop (span) and for each file, if the match fails then you have a valid file.
                      There is no need for this since one can use 'negative lookahead' to exclude files from the match.
                      What Java API takes a regex and returns a list of files?
                      FileFilter filter = new FileFilter()
                      {
                          public boolean accept(File pathname)
                          {
                              String name = pathname.getName();
                              return name.matches("(?!(a\\.jar|b\\.jar)$).*");
                          }
                      };

                      File[] children = new File("/home/sabre").listFiles(filter);
                      for (File child : children)
                      {
                          System.out.println(child);
                      }
                      Exactly this, except I'm doing some parsing and jar extraction under the covers. It seems that the negative lookahead method worked, sabre150. I used the regex:
                      myFileString.matches("WEB-INF/lib/(?!a.jar)(?!b.jar).*")
                      And this seems to cover exactly what I was trying to do. If there's a non-obvious [to a beginner] boundary case that I might have missed during testing, please let me know. And thanks for the quick, numerous, and valid responses!
                      • 8. Re: RegEx String Negation
                        807580
                        kmjansen wrote:
                        If there's a non-obvious [to a beginner] boundary case that I might have missed during testing, please let me know.
                        Two cases that really are boundary cases: you need to make sure that the negative match ends on the filename boundary, and that the '.' in the file name in the regex is not treated as 'any character', i.e.
                        myFileString.matches("WEB-INF/lib/(?!a\\.jar$)(?!b\\.jar$).*")
                        though in practice the chance of it mattering is close to zero.
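To make the two boundary cases concrete, the sketch below compares the regex as first posted with the corrected one. The third pattern is a hypothetical variant that is not from the thread: it treats the jar name as a whole path segment (followed by '/' or end of string), which matters if files inside a.jar and b.jar must also be rejected, as the original question asked. Class, constant, and method names are made up for illustration.

```java
class BoundaryCases {
    // As first posted: '.' is unescaped and nothing marks the end of the jar name.
    static final String LOOSE = "WEB-INF/lib/(?!a.jar)(?!b.jar).*";

    // Corrected version from the reply: '.' escaped, name anchored at end of string.
    static final String ANCHORED = "WEB-INF/lib/(?!a\\.jar$)(?!b\\.jar$).*";

    // Hypothetical variant (assumption): jar name must be followed by '/' or the end,
    // so entries inside the excluded jars are rejected as well.
    static final String SEGMENT = "WEB-INF/lib/(?!(?:a|b)\\.jar(?:/|$)).*";

    static boolean accepts(String regex, String path) {
        // String.matches() requires the regex to match the entire path.
        return path.matches(regex);
    }
}
```

With these patterns, WEB-INF/lib/axjar is rejected only by LOOSE, because its unescaped '.' matches the 'x'. WEB-INF/lib/a.jar/myClass.class is rejected by LOOSE and SEGMENT but accepted by ANCHORED, since the `$` only excludes the jar path itself.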

                        Edited by: sabre150 on Sep 24, 2009 9:23 PM
                        • 9. Re: RegEx String Negation
                          jschellSomeoneStoleMyAlias
                          sabre150 wrote:
                          FileFilter filter = new FileFilter()
                          {
                              public boolean accept(File pathname)
                              {
                                  String name = pathname.getName();
                                  return name.matches("(?!(a\\.jar|b\\.jar)$).*");
                              }
                          };
                          BalusC wrote: You can use FilenameFilter for that.
                          Wrong in two.

                          The fact that you can use the regex class to construct code that is processed in a loop means that what I said is exactly correct.

                          Because you are using the regex class and doing a comparison one file name at a time it means that a positive regex (which you negate) works just as well and probably better than negative look ahead.

                          Now if the Java API has a method that takes a regular expression and filters file names then a negative look ahead would probably be ideal.
                          • 10. Re: RegEx String Negation
                            807580
                            jschell wrote:
                            sabre150 wrote:
                            FileFilter filter = new FileFilter()
                            {
                                public boolean accept(File pathname)
                                {
                                    String name = pathname.getName();
                                    return name.matches("(?!(a\\.jar|b\\.jar)$).*");
                                }
                            };
                            BalusC wrote: You can use FilenameFilter for that.
                            Wrong in two.

                            The fact that you can use the regex class to construct code that is processed in a loop means that what I said is exactly correct.
                            Come off it. Why are you being so defensive and pedantic? I was expressing the view that it could be done in one stage rather than in two. If you think that your procedure outlined in reply #1 results in better code than I posted in reply #6 then post something to illustrate this.

                            >
                            Because you are using the regex class and doing a comparison one file name at a time it means that a positive regex (which you negate) works just as well and probably better than negative look ahead.
                            Again, I was just removing a processing stage.

                            >
                            Now if the Java API has a method that takes a regular expression and filters file names then a negative look ahead would probably be ideal.
                            I didn't say the API did have such a class. I don't see the need for RegexFileFilter to be in the API (though I have one in my private library) since as I illustrated it is trivial to implement with the current API.
                            • 11. Re: RegEx String Negation
                              807580
                              Part of the under-the-covers code that I mentioned will return the filenames to me as Strings, thus I don't have the luxury of File API operations. Also, I think the fact that I only want matches for files in the directory WEB-INF/lib/ (returned as Strings), with an option to negate particular jars, was either not explained well by me or lost its meaning in the thread. I.e., I might get files like /utility or /utility/myUtility.jar returned and I don't want those to match.

                              So to make sure I'm doing this correctly, I think I understand that I need the negative lookahead in the string I previously posted to filter out those files that aren't in the WEB-INF/lib directory.
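A note on which part of the regex does what: with `matches()`, it is the literal prefix WEB-INF/lib/ that rejects paths such as /utility/myUtility.jar, while the negative lookahead only handles the excluded jars. The sketch below filters a list of path strings that way. It assumes the paths arrive without a leading slash, and the class name, method names, and the `(?:/|$)` segment boundary are illustrative choices rather than anything posted in the thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

class LibPathFilter {
    // Builds e.g. WEB-INF/lib/(?!(?:\Qa.jar\E|\Qb.jar\E)(?:/|$)).*
    static Pattern buildPattern(List<String> excludedJars) {
        if (excludedJars.isEmpty()) {
            // Nothing to exclude: accept everything under WEB-INF/lib/.
            return Pattern.compile("WEB-INF/lib/.*");
        }
        StringBuilder alternatives = new StringBuilder();
        for (String jar : excludedJars) {
            if (alternatives.length() > 0) {
                alternatives.append('|');
            }
            // Pattern.quote() makes the '.' in each jar name a literal dot.
            alternatives.append(Pattern.quote(jar));
        }
        return Pattern.compile("WEB-INF/lib/(?!(?:" + alternatives + ")(?:/|$)).*");
    }

    // Keeps only the paths that the pattern matches in full.
    static List<String> filter(List<String> paths, Pattern pattern) {
        List<String> kept = new ArrayList<>();
        for (String path : paths) {
            if (pattern.matcher(path).matches()) {
                kept.add(path);
            }
        }
        return kept;
    }
}
```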
                              • 12. Re: RegEx String Negation
                                jschellSomeoneStoleMyAlias
                                sabre150 wrote:
                                The fact that you can use the regex class to construct code that is processed in a loop means that what I said is exactly correct.
                                Come off it. Why are you being so defensive and pedantic? I was expressing the view that it could be done in one stage rather than in two. If you think that your procedure outlined in reply #1 results in better code than I posted in reply #6 then post something to illustrate this.
                                Except that you said "There is no need for this ..."

                                Which reads as though you think that your solution is in fact better in some way than what I said and/or that what I said is incorrect.

                                Your solution is doing nothing but exactly what I said except that you use a negative case where negating a positive one would do exactly the same thing.

                                Modifying your code (to a semi pseudo example)...

                                return ! name.matches("^(a\\.jar|b\\.jar)$");

                                >>
                                Because you are using the regex class and doing a comparison one file name at a time it means that a positive regex (which you negate) works just as well and probably better than negative look ahead.
                                Again, I was just removing a processing stage.
                                Again which suggests that you think your solution is somehow different than what I claimed exclusive of the fact that you are using a negative case rather than negating a positive one.

                                >>
                                Now if the Java API has a method that takes a regular expression and filters file names then a negative look ahead would probably be ideal.
                                I didn't say the API did have such a class. I don't see the need for RegexFileFilter to be in the API (though I have one in my private library) since as I illustrated it is trivial to implement with the current API.
                                Just to be clear you do understand that your solution is in fact in a loop right? Each file individually is subjected to a regex comparison. The implementation of the loop itself is irrelevant and specifically not detailed for that reason in my response.

                                After that the only difference is
                                1. Your solution uses a regex crafted to look for a negative condition
                                2. Mine uses a regex crafted to look for a positive case and then negates the result.

                                In the above how do you think that your solution is better, more complete, or substantially different than what I proposed? Do you think that a negative regex is more expressive than negating the result? Or perhaps more efficient?

                                And finally you do understand that if the Java API itself did offer a service like this then your regex would in fact be correct because such an API would very likely assume a positive match? In this case there is no such API and so in crafting such an API one can code it to use a negative match. That is specifically why I asked if such a Java API exists. It doesn't.
                                • 13. Re: RegEx String Negation
                                  jschellSomeoneStoleMyAlias
                                  kmjansen wrote:
                                  Part of the under-the-covers code that I mentioned will return the filenames to me as Strings, thus I don't have the luxury of File API operations. Also, I think the fact that I only want matches for files in the directory WEB-INF/lib/ (returned as Strings), with an option to negate particular jars, was either not explained well by me or lost its meaning in the thread. I.e., I might get files like /utility or /utility/myUtility.jar returned and I don't want those to match.
                                  That example makes no sense. You can have a file with a path like "/utility" but you cannot, at the same time, have another file named "/utility/myUtility.jar". The first name as a file precludes the second usage as a directory.

                                  You might want to be sure that the exclusion list is well defined by a regex rather than a file list as well. I suspect that most of the time a file list is going to be a better solution in terms of intent and maintenance.
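As one possible reading of the "file list" suggestion, here is a sketch that keeps the excluded jars in a plain set and uses no regex at all. It assumes the same string paths as earlier in the thread; the class and member names are hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class ExclusionList {
    private static final String PREFIX = "WEB-INF/lib/";

    // The exclusion list is plain data, so adding a jar needs no regex knowledge.
    private static final Set<String> EXCLUDED_JARS =
            new HashSet<>(Arrays.asList("a.jar", "b.jar"));

    static boolean accepted(String path) {
        if (!path.startsWith(PREFIX)) {
            return false; // not under WEB-INF/lib/ at all
        }
        // Take the first path segment after the prefix, i.e. the jar name.
        String rest = path.substring(PREFIX.length());
        int slash = rest.indexOf('/');
        String firstSegment = (slash < 0) ? rest : rest.substring(0, slash);
        return !EXCLUDED_JARS.contains(firstSegment);
    }
}
```

The intent is visible at a glance, and there are no escaping or anchoring details to get wrong.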
                                  • 14. Re: RegEx String Negation
                                    807580
                                    jschell wrote:
                                    sabre150 wrote:
                                    The fact that you can use the regex class to construct code that is processed in a loop means that what I said is exactly correct.
                                    Come off it. Why are you being so defensive and pedantic? I was expressing the view that it could be done in one stage rather than in two. If you think that your procedure outlined in reply #1 results in better code than I posted in reply #6 then post something to illustrate this.
                                    Except that you said "There is no need for this ..."
                                    Meaning one does not have to use an explicit loop and accept files only when the name does not match a file to be rejected. One can use a FileFilter using a negative look ahead to do the excluding. As you point out later, one could just as well have used a ! operation and saved the negative look ahead.

                                    >
                                    Which reads as though you think that your solution is in fact better in some way than what I said and/or that what I said is incorrect.
                                    I did not say or imply that your solution is incorrect. I said "There is no need for this since one can use 'negative lookahead' to exclude files from the match." with the accent being on the word 'need'. I saw your solution in reply 1 as needing an explicit loop when one is not needed. I still read it that way.

                                    >
                                    Your solution is doing nothing but exactly what I said except that you use a negative case where negating a positive one would do exactly the same thing.
                                    Not unless I am totally misunderstanding what you were saying. I saw then and still do see your solution as requiring one to get a list of the files then explicitly go through the list to create a list of those that are not in the rejected set. Your solution needs an explicit loop. Mine does not require an explicit loop.

                                    >
                                    Modifying your code (to a semi pseudo example)...

                                    return ! name.matches("^(a\\.jar|b\\.jar)$");
                                    Yep. I agree. But that is not what I read into reply 1. You did not mention class FileFilter. You did not use the words 'file filter'. You implied an explicit loop which is not needed.

                                    >
                                    >>>
                                    Because you are using the regex class and doing a comparison one file name at a time it means that a positive regex (which you negate) works just as well and probably better than negative look ahead.
                                    Again, I was just removing a processing stage.
                                    Again which suggests that you think your solution is somehow different than what I claimed exclusive of the fact that you are using a negative case rather than negating a positive one.
                                    I just claim that your reply 1 implied an explicit loop which is not needed. If that is not what you meant then I misunderstood.

                                    >
                                    >>>
                                    Now if the Java API has a method that takes a regular expression and filters file names then a negative look ahead would probably be ideal.
                                    I didn't say the API did have such a class. I don't see the need for RegexFileFilter to be in the API (though I have one in my private library) since as I illustrated it is trivial to implement with the current API.
                                    Just to be clear you do understand that your solution is in fact in a loop right?
                                    Yes - but the loop is behind the scenes, hidden in the listFiles() method.
                                    Each file individually is subjected to a regex comparison.
                                    Yes.
                                    The implementation of the loop itself is irrelevant and specifically not detailed for that reason in my response.
                                    That is all we are arguing about. I think it is very very very relevant since reply 1 implies an explicit loop is required.

                                    >
                                    After that the only difference is
                                    1. Your solution uses a regex crafted to look for a negative condition
                                    2. Mine uses a regex crafted to look for a positive case and then negates the result.
                                    I agree that the regex bit makes little or no difference but the explicit loop does.

                                    >
                                    In the above how do you think that your solution is better, more complete, or substantially different than what I proposed? Do you think that a negative regex is more expressive than negating the result? Or perhaps more efficient?
                                    I never said that. I don't say that. As I keep saying - my solution does not require an explicit loop though behind the scenes there is a loop.

                                    >
                                    And finally you do understand that if the Java API itself did offer a service like this then your regex would in fact be correct because such an API would very likely assume a positive match?
                                    Which is probably why I thought in terms of negative look ahead, because my RegexFileFilter does look for a positive match. I also have a NotFileFilter which chains to another file filter and returns the inverse of the result of the chained FileFilter.
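The private-library classes mentioned here were never posted, so the following is only a guess at what a RegexFileFilter and a chaining NotFileFilter might look like on top of `java.io.FileFilter`. Everything beyond the two class names is an assumption.

```java
import java.io.File;
import java.io.FileFilter;
import java.util.regex.Pattern;

// Hypothetical reconstruction: accepts a file when its name matches the regex.
class RegexFileFilter implements FileFilter {
    private final Pattern pattern;

    RegexFileFilter(String regex) {
        this.pattern = Pattern.compile(regex);
    }

    @Override
    public boolean accept(File pathname) {
        // Positive match against the last path element only.
        return pattern.matcher(pathname.getName()).matches();
    }
}

// Hypothetical reconstruction: inverts the decision of the filter it wraps.
class NotFileFilter implements FileFilter {
    private final FileFilter delegate;

    NotFileFilter(FileFilter delegate) {
        this.delegate = delegate;
    }

    @Override
    public boolean accept(File pathname) {
        return !delegate.accept(pathname);
    }
}
```

Chained as `new NotFileFilter(new RegexFileFilter("a\\.jar|b\\.jar"))`, this gives the same exclusion as the negative lookahead while keeping the regex itself positive.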
                                    In this case there is no such API and so in crafting such an API one can code it to use a negative match. That is specifically why I asked if such a Java API exists. It doesn't.
                                    Either I totally misunderstood reply 1 and you were not meaning that an explicit loop is required or I misunderstand the points you have been trying to make since reply 1. Either way it's not going to get resolved without both of us spending significant time on it and I don't think the rewards are going to be worth the effort. I will let you have the last word if you care to respond to this.