14 Replies Latest reply: Sep 3, 2007 2:00 AM by 791266

    Help with replaceAll method in String

    807605
      Hi,

      I want to replace all occurrences of a substring in a String.
      But, if the substring is surrounded by double-quotes, I want to keep the substring.
      Example:
      the actual String: testing<SEP>testing2<SEP>"testing3<SEP>"<SEP>testing4

      Here I want to replace all occurrences of <SEP> except for the one occurrence found in the "testing3<SEP>" substring.

      Additionally, I want to be able to use the exact phrase (in the example - <SEP>)
      as the first argument in the replaceAll-method. Some characters in the substring might be 'special characters' - so I need to specify 'match exactly the substring' too.

      Help much appreciated! (I can't figure it out despite extensive reading on regexps)
        • 1. Re: Help with replaceAll method in String
          abillconsl
          String antiSeptic = "testing<SEP>testing2<SEP>\"testing3<SEP>\"<SEP>testing4";
          String interSept = antiSeptic.replaceAll("SEP>\"", "SEEP>\"");
          String exSept = interSept.replaceAll("<SEP>", "");
          String septic = exSept.replaceAll("SEEP", "SEP");
          System.out.println(septic);
          untested.

          Nah, this misses a <SEP> that comes right after the opening '"'. O-L
          • 2. Re: Help with replaceAll method in String
            807605
            Try this:
            import java.util.regex.*;
             
            public class Test 
            {
              static String replaceAllUnquoted(String str, String token)
              {
                String regex = "\"[^\"]*\"|(\\Q" + token + "\\E)";
            
                StringBuffer sb = new StringBuffer();
                Pattern p = Pattern.compile(regex);
                Matcher m = p.matcher(str);
                while (m.find())
                {
                  if (m.start(1) != -1)
                  {
                    m.appendReplacement(sb, "");
                  }
                }
                m.appendTail(sb);
                return sb.toString();
              }
              
              public static void main(String... args)
              {
                String str = "testing<SEP>testing2<SEP>\"testing3<SEP>\"<SEP>testing4";
                System.out.println(str);
                System.out.println(replaceAllUnquoted(str, "<SEP>"));
              }
            }
            • 3. Re: Help with replaceAll method in String
              791266
              String regex = "\"[^\"]*\"|(\\Q" + token + "\\E)";
              I only did a quick Google search, but I couldn't find any good resources that explain quotation. What does quotation do?
              • 4. Re: Help with replaceAll method in String
                807605
                Or:
                package regexTests;
                
                public class IgnoreQuotes {
                     public static void main (String args[]) {
                          String test = "\"testing<SEP>\"testing2<SEP>\"testing3<SEP>\"<SEP>testing4testing<SEP>testing2<SEP>\"testing3<SEP>\"<SEP>testing4";
                          
                          String array[] = test.split("\"");
                          
                          String result = "";
                          for (int i = 0; i<array.length; i=i+2) {
                               array[i] = array[i].replaceAll("<SEP>", "");
                          }
                          for (int i = 0; i<array.length; i++) {
                               if (i%2==0) {
                                    result=result+array[i];
                               }
                               else {
                                    result=result+"\""+array[i]+"\"";
                               }
                          }
                          System.out.println(result);
                     }
                }

                Matthew
                • 5. Re: Help with replaceAll method in String
                  807605
                  Many, many thanx!

                  Hope you have a great weekend, mate!
                  • 6. Re: Help with replaceAll method in String
                    807605
                    Uncle and mdares, both!
                    • 7. Re: Help with replaceAll method in String
                      807605
                      What does quotation do?
                      It removes the special meaning from regex metacharacters, saving you the trouble of iterating through the string and adding a backslash in front of each one.
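A minimal sketch of what that quoting changes, assuming a "(SEP)" delimiter. The class and method names (QuoteDemo, removeLiteral, removeQuoted) are made up for illustration and are not from the thread:

```java
import java.util.regex.Pattern;

public class QuoteDemo {
    // Hypothetical helper: wraps the token in \Q...\E so it is matched literally.
    static String removeLiteral(String input, String token) {
        return input.replaceAll("\\Q" + token + "\\E", "");
    }

    // Hypothetical helper: same idea, letting Pattern.quote() build the quoted form.
    static String removeQuoted(String input, String token) {
        return input.replaceAll(Pattern.quote(token), "");
    }

    public static void main(String[] args) {
        String input = "a(SEP)b(SEP)c";
        // Unquoted, the parentheses form a capturing group, so only "SEP" matches
        // and the parentheses themselves are left behind.
        System.out.println(input.replaceAll("(SEP)", ""));   // a()b()c
        // Quoted, the whole token is treated as plain text.
        System.out.println(removeLiteral(input, "(SEP)"));    // abc
        System.out.println(removeQuoted(input, "(SEP)"));     // abc
    }
}
```

Concatenating \Q and \E by hand goes wrong if the token itself contains \E; Pattern.quote() (JDK 1.5+) accounts for that case, so it is probably the safer of the two when the token comes from outside.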
                      • 8. Re: Help with replaceAll method in String
                        807605
                        Uncle_Alice's is much prettier than mine, and would outperform it - use his.
                        • 9. Re: Help with replaceAll method in String
                          abillconsl
                          Not that you need it now at this point, but here's another option just for fun:
                          import java.util.regex.*;
                           
                          public class AntiSeptic {
                            public static void main(String[] argv) {
                              String       septic     = EasyRead.readString("Enter String to search: "), //This is just a class I use to read from the console is all.
                                           antiSeptic = "";
                              String       regex      = "\"\\w*<SEP>\\w*\"";
                              StringBuffer buf        = new StringBuffer();
                              Pattern      pat        = Pattern.compile(regex);
                              Matcher      matcher    = pat.matcher(septic);
                              int          beg = 0, end = 0, prv = 0, len = septic.length();
                              while (matcher.find()) {
                                beg = matcher.start();
                                end = matcher.end();
                                if ( beg > 0  &&  beg < end )
                                  buf.append(septic.substring(prv, beg).replaceAll("<SEP>",""));
                                buf.append(septic.substring(beg, end));
                                prv = end;
                              }
                              // Process the tail even when nothing was quoted (end == 0),
                              // otherwise a string with no quoted parts is returned unchanged.
                              if ( end < len )
                                buf.append(septic.substring(end, len).replaceAll("<SEP>",""));
                              antiSeptic = buf.toString();
                              System.out.println("Beginning String: ["+septic+"]");
                              System.out.println("Ending String   : ["+antiSeptic+"]");
                              if ( septic.equalsIgnoreCase(antiSeptic) )
                                System.out.println("No replacements made");
                            }
                          }
                          • 10. Re: Help with replaceAll method in String
                            807605
                            I think we need to back up a bit. As I read it, the requirement is to take a delimited string and replace any delimiters that aren't inside quotes. My solution just removes the delimiters, but after re-reading the original question, I see that isn't what was asked for. Also, the OP said he doesn't want to have to deal with special characters, which is why I quoted the delimiter with \Q and \E. (Try the other solutions with separators "(SEP)" or "{SEP}" to see what I mean.)

                            But the replacement string has special-character issues, too. That's because replaceAll() looks for group references like "$1" in the replacement string and substitutes them with whatever was matched by the corresponding groups in the regex. Also, as with the regex, backslashes are used for escaping, so if there are any dollar signs or backslashes in the replacement string you'll get unexpected results, if not errors. Adapting my original solution to allow for a "safe" replacement delimiter is trivial, if somewhat opaque:
                            import java.util.regex.*;
                             
                            public class Test 
                            {
                              static String replaceAllUnquoted(String str, String delim, String newDelim)
                              {
                                String regex = "\"[^\"]*\"|(\\Q" + delim + "\\E)";
                            
                                StringBuffer sb = new StringBuffer();
                                Pattern p = Pattern.compile(regex);
                                Matcher m = p.matcher(str);
                                while (m.find())
                                {
                                  if (m.start(1) != -1)
                                  {
                                    m.appendReplacement(sb, "");
                                    sb.append(newDelim);
                                  }
                                }
                                m.appendTail(sb);
                                return sb.toString();
                              }
                              
                              public static void main(String... args)
                              {
                                String str = "testing(POE)testing2(POE)\"testing3(POE)\"(POE)testing4\"test(POE)5\"";
                                System.out.println(str);
                                System.out.println();
                                System.out.println(replaceAllUnquoted(str, "(POE)", "$OPE$"));
                              }
                            }
                            The trick here lies in knowing that appendReplacement() does two things: it appends everything between the last match and the current one, then it processes the replacement string as described above and appends the result. Passing it an empty string prevents it from carrying out the second task. Then we append the new delimiter to the StringBuffer ourselves, bypassing the special processing it would otherwise have received.

                            If you use replaceAll(), as the other solutions here do, that trick isn't available; you just have to pre-process the strings to disable special-character processing. Since JDK 1.5, you can use the methods Pattern.quote() and Matcher.quoteReplacement() for that. But if you are running Java 5+, you can use the new replace(CharSequence, CharSequence) method instead of replaceAll(), and not have to worry about any of this special character stuff.
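A rough sketch of those options, reusing the "(POE)" and "$OPE$" strings from the example above. The class and method names (SafeReplaceDemo, viaReplaceAll, viaReplace, replaceUnquotedSafe) are invented for illustration. The last method is a variant of the quote-aware loop that passes the replacement through Matcher.quoteReplacement() instead of appending it by hand; it is one possible alternative, not the code posted in this thread.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SafeReplaceDemo {
    // replaceAll() with both arguments pre-processed so nothing is special.
    // Note: this one does NOT skip delimiters inside double quotes.
    static String viaReplaceAll(String input, String delim, String newDelim) {
        return input.replaceAll(Pattern.quote(delim),
                                Matcher.quoteReplacement(newDelim));
    }

    // String.replace(CharSequence, CharSequence) treats both arguments literally.
    // It does not skip quoted sections either.
    static String viaReplace(String input, String delim, String newDelim) {
        return input.replace(delim, newDelim);
    }

    // Quote-aware variant: quoted sections match the first alternative and are
    // left alone; only matches of group 1 (the delimiter) are replaced.
    static String replaceUnquotedSafe(String str, String delim, String newDelim) {
        Pattern p = Pattern.compile("\"[^\"]*\"|(" + Pattern.quote(delim) + ")");
        Matcher m = p.matcher(str);
        String safe = Matcher.quoteReplacement(newDelim); // escapes $ and \
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            if (m.start(1) != -1) {
                m.appendReplacement(sb, safe);
            }
        }
        m.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        String str = "x(POE)\"y(POE)\"(POE)z";
        System.out.println(viaReplaceAll(str, "(POE)", "$OPE$"));
        System.out.println(viaReplace(str, "(POE)", "$OPE$"));
        System.out.println(replaceUnquotedSafe(str, "(POE)", "$OPE$"));
    }
}
```

The first two methods are only suitable when no delimiter needs protecting inside quotes; the third keeps the quoted text intact while still treating both strings literally.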
                            • 11. Re: Help with replaceAll method in String
                              abillconsl
                              That is quite interesting. I had the impression, though, that the <SEP> was a hard and fast thing.
                              • 12. Re: Help with replaceAll method in String
                                807605
                                Additionally, I want to be able to use the exact phrase
                                (in the example - <SEP>) as the first argument in the
                                replaceAll-method. Some characters in the substring might
                                be 'special characters' - so I need to specify 'match
                                exactly the substring' too.
                                It follows that special characters should be disabled in the replacement string, too, but most people don't realize at first that there are any.

                                I just remembered another way to disable regex metacharacters in the search string:
                                  Pattern p = Pattern.compile(delim, Pattern.LITERAL);
                                This technique also requires JDK 1.5+.
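A small sketch of the Pattern.LITERAL approach, assuming the delimiter is the entire pattern. The names LiteralFlagDemo and removeAll are hypothetical:

```java
import java.util.regex.Pattern;

public class LiteralFlagDemo {
    // Hypothetical helper: compiles delim as a literal string, so characters
    // such as '{', '(' or '.' carry no special meaning, then removes every match.
    static String removeAll(String input, String delim) {
        Pattern p = Pattern.compile(delim, Pattern.LITERAL);
        return p.matcher(input).replaceAll("");
    }

    public static void main(String[] args) {
        System.out.println(removeAll("a{SEP}b{SEP}c", "{SEP}")); // abc
        System.out.println(removeAll("a.b.c", "."));             // abc
    }
}
```

Because the flag applies to the whole pattern, it cannot be combined with the quoted-string alternation used earlier in the thread; in that regex the delimiter still has to be quoted on its own with \Q...\E or Pattern.quote().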
                                • 13. Re: Help with replaceAll method in String
                                  abillconsl
                                  You are so far ahead of me there that you lost me. I did not read all that into the post and still don't, but I am not going to argue with you, since it's clear - and has been clear for a long time - that you are the Regex man ;o)
                                  • 14. Re: Help with replaceAll method in String
                                    791266
                                    What does quotation do?
                                    It removes the special meaning from regex
                                    metacharacters, saving you the trouble of iterating
                                    through the string and adding a backslash in front of
                                    each one.
                                    Thanks. I thought that it would do something like that, but wasn't sure.