This discussion is archived
14 Replies Latest reply: Aug 31, 2007 9:29 AM by 807605

Help with replaceAll method in String

807605 Newbie
Hi,

I want to replace all occurrences of a substring in a String.
But if the substring is surrounded by double quotes, I want to keep it.
Example:
the actual String: testing<SEP>testing2<SEP>"testing3<SEP>"<SEP>testing4

Here I want to replace all occurrences of <SEP> except the one inside the "testing3<SEP>" substring.

Additionally, I want to be able to use the exact phrase (in the example, <SEP>)
as the first argument to the replaceAll method. Some characters in the substring might be 'special characters', so I also need to specify 'match exactly the substring'.

Help much appreciated! (I can't figure it out despite extensive reading on regexps)
  • 1. Re: Help with replaceAll method in String
    abillconsl Explorer
    String antiSeptic = "testing<SEP>testing2<SEP>\"testing3<SEP>\"<SEP>testing4";
    String interSept = antiSeptic.replaceAll("SEP>\"", "SEEP>\"");
    String exSept = interSept.replaceAll("<SEP>", "");
    String septic = exSept.replaceAll("SEEP", "SEP");
    System.out.println(septic);
    untested.

    Nah - it misses a <SEP> that comes right after an opening '"'. O-L
  • 2. Re: Help with replaceAll method in String
    807605 Newbie
    Try this:
    import java.util.regex.*;
     
    public class Test 
    {
      static String replaceAllUnquoted(String str, String token)
      {
        String regex = "\"[^\"]*\"|(\\Q" + token + "\\E)";
    
        StringBuffer sb = new StringBuffer();
        Pattern p = Pattern.compile(regex);
        Matcher m = p.matcher(str);
        while (m.find())
        {
          if (m.start(1) != -1)
          {
            m.appendReplacement(sb, "");
          }
        }
        m.appendTail(sb);
        return sb.toString();
      }
      
      public static void main(String... args)
      {
        String str = "testing<SEP>testing2<SEP>\"testing3<SEP>\"<SEP>testing4";
        System.out.println(str);
        System.out.println(replaceAllUnquoted(str, "<SEP>"));
      }
    }
  • 3. Re: Help with replaceAll method in String
    791266 Explorer
    String regex = "\"[^\"]*\"|(\\Q" + token + "\\E)";
    I only did a quick Google search, but I couldn't find any good resources that explain quotation. What does quotation do?
  • 4. Re: Help with replaceAll method in String
    807605 Newbie
    Or:
    package regexTests;
    
    public class IgnoreQuotes {
         public static void main (String args[]) {
              String test = "\"testing<SEP>\"testing2<SEP>\"testing3<SEP>\"<SEP>testing4testing<SEP>testing2<SEP>\"testing3<SEP>\"<SEP>testing4";
              
              String array[] = test.split("\"");
              
              String result = "";
              for (int i = 0; i<array.length; i=i+2) {
                   array[i] = array[i].replaceAll("<SEP>", "");
              }
              for (int i = 0; i<array.length; i++) {
                   if (i%2==0) {
                        result=result+array[i];
                   }
                   else {
                        result=result+"\""+array[i]+"\"";
                   }
              }
              System.out.println(result);
         }
    }

    Matthew
  • 5. Re: Help with replaceAll method in String
    807605 Newbie
    Many, many thanx!

    Hope you have a great weekend, mate!
  • 6. Re: Help with replaceAll method in String
    807605 Newbie
    Uncle and mdares, both!
  • 7. Re: Help with replaceAll method in String
    807605 Newbie
    What does quotation do?
    It removes the special meaning from regex metacharacters, saving you the trouble of iterating through the string and adding a backslash in front of each one.
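
To make the effect of quotation concrete, here is a minimal sketch. The class and method names (QuotingDemo, matchesAsRegex, matchesQuoted) are made up for illustration and are not from the thread. It compares a delimiter used as a raw regex with the same delimiter wrapped in \Q...\E.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; not part of the thread's code.
class QuotingDemo {
    // Treats delim as a regular expression: "(SEP)" is a capturing group
    // around the three letters SEP, so the parentheses are not matched.
    static boolean matchesAsRegex(String delim, String input) {
        return Pattern.matches(delim, input);
    }

    // Wraps delim in \Q...\E so every character is matched literally.
    static boolean matchesQuoted(String delim, String input) {
        return Pattern.matches("\\Q" + delim + "\\E", input);
    }

    public static void main(String[] args) {
        System.out.println(matchesAsRegex("(SEP)", "(SEP)")); // false
        System.out.println(matchesAsRegex("(SEP)", "SEP"));   // true
        System.out.println(matchesQuoted("(SEP)", "(SEP)"));  // true
        System.out.println(matchesQuoted("(SEP)", "SEP"));    // false
    }
}
```

One caveat with building the \Q...\E wrapper by hand: if the delimiter itself contains \E, the quoting ends early. Pattern.quote() (JDK 1.5+) takes care of that case.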
  • 8. Re: Help with replaceAll method in String
    807605 Newbie
    Uncle_Alice's is much prettier than mine, and would outperform it - use his.
  • 9. Re: Help with replaceAll method in String
    abillconsl Explorer
    Not that you need it now at this point, but here's another option just for fun:
    import java.util.regex.*;
     
    public class AntiSeptic {
      public static void main(String[] argv) {
        String       septic     = EasyRead.readString("Enter String to search: "), //This is just a class I use to read from the console is all.
                     antiSeptic = "";
        String       regex      = "\"\\w*<SEP>\\w*\"";
        StringBuffer buf        = new StringBuffer();
        Pattern      pat        = Pattern.compile(regex);
        Matcher      matcher    = pat.matcher(septic);
        int          beg = 0, end = 0, prv = 0, len = septic.length();
        while (matcher.find()) {
          beg = matcher.start();
          end = matcher.end();
          if ( beg > 0  &&  beg < end )
            buf.append(septic.substring(prv, beg).replaceAll("<SEP>",""));
          buf.append(septic.substring(beg, end));
          prv = end;
        }
        if ( end > 0  &&  end < len)
          buf.append(septic.substring(end, len).replaceAll("<SEP>",""));
        antiSeptic = (buf.length() > 0) ? buf.toString() : septic;
        System.out.println("Beginning String: ["+septic+"]");
        System.out.println("Ending String   : ["+antiSeptic+"]");
        if ( septic.equalsIgnoreCase(antiSeptic) )
          System.out.println("No replacements made");
      }
    }
  • 10. Re: Help with replaceAll method in String
    807605 Newbie
    I think we need to back up a bit. As I read it, the requirement is to take a delimited string and replace any delimiters that aren't inside quotes. My solution just removes the delimiters, but after re-reading the original question, I see that isn't what was asked for. Also, the OP said he doesn't want to have to deal with special characters, which is why I quoted the delimiter with \Q and \E. (Try the other solutions with separators "(SEP)" or "{SEP}" to see what I mean.)

    But the replacement string has special-character issues, too. That's because replaceAll() looks for group references like "$1" in the replacement string and substitutes them with whatever was matched by the corresponding groups in the regex. Also, as with the regex, backslashes are used for escaping, so if there are any dollar signs or backslashes in the replacement string you'll get unexpected results, if not errors. Adapting my original solution to allow for a "safe" replacement delimiter is trivial, if somewhat opaque:
    import java.util.regex.*;
     
    public class Test 
    {
      static String replaceAllUnquoted(String str, String delim, String newDelim)
      {
        String regex = "\"[^\"]*\"|(\\Q" + delim + "\\E)";
    
        StringBuffer sb = new StringBuffer();
        Pattern p = Pattern.compile(regex);
        Matcher m = p.matcher(str);
        while (m.find())
        {
          if (m.start(1) != -1)
          {
            m.appendReplacement(sb, "");
            sb.append(newDelim);
          }
        }
        m.appendTail(sb);
        return sb.toString();
      }
      
      public static void main(String... args)
      {
        String str = "testing(POE)testing2(POE)\"testing3(POE)\"(POE)testing4\"test(POE)5\"";
        System.out.println(str);
        System.out.println();
        System.out.println(replaceAllUnquoted(str, "(POE)", "$OPE$"));
      }
    }
    The trick here lies in knowing that appendReplacement() does two things: it appends everything between the last match and the current one, then it processes the replacement string as described above and appends the result. Passing it an empty string prevents it from carrying out the second task. Then we append the new delimiter to the StringBuffer ourselves, bypassing the special processing it would otherwise have received.
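
A small side-by-side sketch of that trick, for anyone who wants to see the difference. The class and method names here (AppendReplacementDemo, interpreted, literal) are hypothetical. The first method lets appendReplacement() process the replacement string, the second passes it an empty string and appends the replacement itself.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; not part of the thread's code.
class AppendReplacementDemo {
    // appendReplacement() interprets "$1" as a reference to group 1.
    static String interpreted(String input, String regex, String replacement) {
        Matcher m = Pattern.compile(regex).matcher(input);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            m.appendReplacement(sb, replacement);
        }
        m.appendTail(sb);
        return sb.toString();
    }

    // Passing "" still copies the text before the match; the replacement
    // is then appended verbatim, so "$" and "\" get no special handling.
    static String literal(String input, String regex, String replacement) {
        Matcher m = Pattern.compile(regex).matcher(input);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            m.appendReplacement(sb, "");
            sb.append(replacement);
        }
        m.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(interpreted("a-b-c", "(-)", "$1$1")); // a--b--c
        System.out.println(literal("a-b-c", "(-)", "$1$1"));     // a$1$1b$1$1c
    }
}
```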

    If you use replaceAll(), as the other solutions here do, that trick isn't available; you just have to pre-process the strings to disable special-character processing. Since JDK 1.5, you can use the methods Pattern.quote() and Matcher.quoteReplacement() for that. But if you are running Java 5+, you can use the new replace(CharSequence, CharSequence) method instead of replaceAll(), and not have to worry about any of this special character stuff.
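
As a rough illustration of those two options, assuming JDK 1.5 or later: the class and method names below (SafeReplaceDemo, withQuoting, withReplace) are invented for this sketch. Note that neither method skips quoted sections; they only show how to switch off special-character processing in both the search string and the replacement string.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; not part of the thread's code.
class SafeReplaceDemo {
    // replaceAll() with both arguments pre-processed so that neither the
    // delimiter nor the replacement is interpreted.
    static String withQuoting(String s, String delim, String newDelim) {
        return s.replaceAll(Pattern.quote(delim),
                            Matcher.quoteReplacement(newDelim));
    }

    // replace(CharSequence, CharSequence) treats both arguments literally.
    static String withReplace(String s, String delim, String newDelim) {
        return s.replace(delim, newDelim);
    }

    public static void main(String[] args) {
        String str = "testing(POE)testing2(POE)testing3";
        System.out.println(withQuoting(str, "(POE)", "$OPE$"));
        System.out.println(withReplace(str, "(POE)", "$OPE$"));
        // both print: testing$OPE$testing2$OPE$testing3
    }
}
```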
  • 11. Re: Help with replaceAll method in String
    abillconsl Explorer
    That is quite interesting. I had the impression, though, that the <SEP> was a hard and fast thing.
  • 12. Re: Help with replaceAll method in String
    807605 Newbie
    Additionally, I want to be able to use the exact phrase
    (in the example - <SEP>) as the first argument in the
    replaceAll-method. Some characters in the substring might
    be 'special characters' - so I need to specify 'match
    exactly the substring' too.
    It follows that special characters should be disabled in the replacement string, too, but most people don't realize at first that there are any.

    I just remembered another way to disable regex metacharacters in the search string:
      Pattern p = Pattern.compile(delim, Pattern.LITERAL);
    This technique also requires JDK 1.5+.
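
A quick sketch of the flag in use, with made-up names (LiteralFlagDemo, countLiteral). Keep in mind that Pattern.LITERAL applies to the whole pattern, so it suits a pattern that consists only of the delimiter. It cannot be combined with the quoted-section alternative in a single regex the way \Q...\E can.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; not part of the thread's code.
class LiteralFlagDemo {
    // Counts occurrences of delim, with every character taken literally.
    static int countLiteral(String input, String delim) {
        Matcher m = Pattern.compile(delim, Pattern.LITERAL).matcher(input);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // Only the parenthesized occurrence matches; the bare SEP does not.
        System.out.println(countLiteral("a(SEP)bSEPc", "(SEP)")); // 1
    }
}
```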
  • 13. Re: Help with replaceAll method in String
    abillconsl Explorer
    You are so far ahead of me there that you lost me. I did not read all that into the post and still don't, but I am not going to argue with you, since it's clear - and has been clear for a long time - that you are the Regex man ;o)
  • 14. Re: Help with replaceAll method in String
    791266 Explorer
    What does quotation do?
    It removes the special meaning from regex
    metacharacters, saving you the trouble of iterating
    through the string and adding a backslash in front of
    each one.
    Thanks. I thought that it would do something like that, but wasn't sure.