15 Replies. Latest reply: Dec 24, 2009 11:48 AM by 3004

    Remove all the special characters using java.util.regex

    843789
      Hi,

      How can I remove all the special characters from a String[] using regex? I have the following:

      public class RegExpTest {
           private static String removeSplCharactersForNumber(String[] number) {
                String number= null;
                Matcher m = null;
                     Pattern p = Pattern.compile("\\!\\@\\#\\$\\%\\^\\&\\*\\(\\)\\_\\+\\-\\{\\}\\|\\;\\\\\\'////\\,\\.\\?\\<\\>\\[\\]");
                     for (int i = 0; i < number.length; i++) {
                     m = p.matcher(number);
                     if (m.find()) {
                          number= m.replaceAll("");
                     }
                     }
                     System.out.println("Final Number is:::"+number);
                     return number;
                }
                     
                public static void main(String args[]){
                     String[] str = {"raghav!@#$%^&*()_+"};
                     RegExpTest regExpTest = new RegExpTest();
                     regExpTest.removeSplCharactersForNumber(str);
                }
      }

      This code is not working: m.find() returns false. I want the output to be raghav for the given string array, and in general the code should remove all the special characters from any String[] that is passed in. Is there a simple way to do this? More importantly, spaces (treated as special characters) should be removed as well. Please provide a solution.

      Thanks
        • 1. Re: Remove all the special characters using java.util.regex
          843789
          You don't need find(). Just call replaceAll() on each element of the String[], i.e.
          String[] values = ...
          for (int i = 0; i < values.length; i++)
          {
              values[i] = p.matcher(values[i]).replaceAll("");
          }
          I can't understand your regex since the forum software has mangled it, but you just need to add a space to the set of chars to remove. When you post code, surround it with CODE tags; then the forum software won't mangle it.
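For readers following along, here is a minimal sketch of the loop from reply 1. The original regex is unreadable in the post, so the character set below is only an assumption based on the sample input, and the class and method names are made up for illustration. Note the space inside the brackets.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

public class StripListedChars {
    // Assumed set of characters to remove (not the poster's original regex).
    // Written as a character class; ^ and - are escaped to keep them literal,
    // and the trailing space makes spaces removable too.
    private static final Pattern UNWANTED = Pattern.compile("[!@#$%\\^&*()_+\\- ]");

    // Replaces each element with a copy that has the unwanted characters removed.
    static void stripInPlace(String[] values) {
        for (int i = 0; i < values.length; i++) {
            values[i] = UNWANTED.matcher(values[i]).replaceAll("");
        }
    }

    public static void main(String[] args) {
        String[] values = {"raghav!@#$%^&*()_+", "a b-c"};
        stripInPlace(values);
        System.out.println(Arrays.toString(values)); // [raghav, abc]
    }
}
```

Compiling the Pattern once and reusing it avoids recompiling the regex for every element, which is why no find() call is needed.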
          • 2. Re: Remove all the special characters using java.util.regex
            699554
            AnanthJava wrote:
            remove the all the special characters in a String[] using regex, i have the following:-
            By special characters do you mean non-word characters? If so, use the elegant regex "\\W", which matches non-word characters, i.e. characters not in [a-zA-Z_0-9].
            public String removeSpecialCharacters(String s) {
              return s.replaceAll("\\W", "");
            }
            Mel
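One detail worth checking before relying on this: the underscore counts as a word character, so \W leaves it in place. The sketch below runs the sample string from the original question through both \W and a variant that also removes the underscore. The class and method names are hypothetical.

```java
public class NonWordDemo {
    // "\\W" in Java source is the regex \W: any character outside [a-zA-Z_0-9].
    static String removeNonWord(String s) {
        return s.replaceAll("\\W", "");
    }

    // [\W_] matches the same characters plus the underscore.
    static String removeNonWordAndUnderscore(String s) {
        return s.replaceAll("[\\W_]", "");
    }

    public static void main(String[] args) {
        String input = "raghav!@#$%^&*()_+";
        System.out.println(removeNonWord(input));              // raghav_
        System.out.println(removeNonWordAndUnderscore(input)); // raghav
    }
}
```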
            • 3. Re: Remove all the special characters using java.util.regex
              843789
              Thank you guys for the replies... I have messed up the regex. My problem is simple: I have a String[] which may contain any number of special characters like "@#$!%^&*()><>", etc. (it can have spaces as well). I want to parse the String[] and remove the special characters in it. If I am given an input like:

              "Raghav!@#$% 999_+<>:'./"

              the output should be:

              Raghav999 (without spaces).

              Please do provide the code for this.
              • 4. Re: Remove all the special characters using java.util.regex
                843789
                AnanthJava wrote:
                Please do provide the code for this.
                I thought I had provided 99% of it. You just need to provide the regex which is very simple. Just the set of chars you don't want.
                • 5. Re: Remove all the special characters using java.util.regex
                  3004
                  AnanthJava wrote:
                  Please do provide the code for this.
                  That's not what these forums are for.
                  • 6. Re: Remove all the special characters using java.util.regex
                    843789
                    First, I'd like to say that when you're posting actual Java code, use the code formatting; otherwise your code will be confusing to read, like it is now. Also, a simple replaceAll() statement should be used. Read about it in the String API.
                    • 7. Re: Remove all the special characters using java.util.regex
                      843789
                      Thank you guys... My problem here is that replaceAll() will replace only the special characters that you specify in the method for a particular string. In my case, I will be getting a String[] which may have 10 special characters in it, or only 3. Because the input is dynamic, I need to write the code in such a way that whatever the special characters may be, the system removes them.
                      replaceAll(Pattern.quote("!@#$"), "") will replace only the special characters !@#$. If your String[] has "!@#$%", this replaceAll() may not work. Hence my code should handle special characters that are dynamic by nature. Sorry guys if I have understood wrongly. Expecting a solution for this. Thanks once again for the responses.
                      • 8. Re: Remove all the special characters using java.util.regex
                        699554
                        AnanthJava wrote:
                        Thank you guys... My problem here is that replaceAll() will replace only the special characters that you specify in the method for a particular string. In my case, I will be getting a String[] which may have 10 special characters in it, or only 3. Because the input is dynamic, I need to write the code in such a way that whatever the special characters may be, the system removes them.
                        replaceAll(Pattern.quote("!@#$"), "") will replace only the special characters !@#$. If your String[] has "!@#$%", this replaceAll() may not work. Hence my code should handle special characters that are dynamic by nature. Sorry guys if I have understood wrongly. Expecting a solution for this. Thanks once again for the responses.
                        Either I am entirely missing what is required or you are as blind as a bat. Refer to reply 2.

                        Mel
                        • 9. Re: Remove all the special characters using java.util.regex
                          3004
                          AnanthJava wrote:
                          Thank you guys... My problem here is that replaceAll() will replace only the special characters that you specify in the method for a particular string. In my case, I will be getting a String[] which may have 10 special characters in it, or only 3. Because the input is dynamic, I need to write the code in such a way that whatever the special characters may be, the system removes them.
                          replaceAll(Pattern.quote("!@#$"), "") will replace only the special characters !@#$. If your String[] has "!@#$%", this replaceAll() may not work. Hence my code should handle special characters that are dynamic by nature. Sorry guys if I have understood wrongly. Expecting a solution for this. Thanks once again for the responses.
                          It's very difficult to understand what you're saying, but I will try to guess. It sounds like you might be saying this:

                          "I want to remove all special characters, but I don't know what all the special characters are. All I know is the normal characters that I want to keep. If I could say
                          replaceAll([special], "")
                          , I would, but I don't know all the characters that make up special."

                          If this is what you're saying, then you should note that
                          replaceAll([special], "")
                          is the same as
                          replaceAll([NOT normal], "")
                          . So if you know all the normal characters, then you just need a way to specify "NOT". For a character class as depicted here, that's the ^ symbol:
                          replaceAll([^normal], "")
                          , where normal is the known set of characters that you want to keep.

                          On the other hand, if you can't fully specify normal either--and also can't fully specify special--then your problem cannot be solved, because you have not defined the requirements precisely enough.
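To make the [^normal] idea concrete, here is a small sketch. The helper name keepOnly and the class name are hypothetical, and the "normal" sets passed in (A-Za-z0-9 and A-Za-z) are only assumptions about what one might want to keep. The argument is inserted into the character class as-is, so it has to be valid character-class syntax.

```java
public class KeepOnlyDemo {
    // Removes every character that is NOT in the given character-class body.
    // 'normal' is spliced into the regex unescaped, e.g. "A-Za-z0-9".
    static String keepOnly(String s, String normal) {
        return s.replaceAll("[^" + normal + "]", "");
    }

    public static void main(String[] args) {
        String input = "Raghav!@#$% 999_+<>:'./";
        // Keep letters and digits.
        System.out.println(keepOnly(input, "A-Za-z0-9")); // Raghav999
        // Keep letters only: the digits are discarded too.
        System.out.println(keepOnly(input, "A-Za-z"));    // Raghav
    }
}
```

The second call shows how the result depends entirely on how "normal" is defined: leave the digits out of the set and they are treated as special.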
                          • 10. Re: Remove all the special characters using java.util.regex
                            843789
                            Thanks jverd, you understood correctly. However, [^normal] strips the digits as well and keeps only the letters, while my String[] should accept alphanumeric characters.

                            Hence for a String[] str = {"raghav!@#$%^&*()_+()-;<>/ 999"};

                            the output with [^normal] is raghav; it is stripping the digits as well. The actual output should be raghav999. This is my problem.
                            Any thoughts on this are welcome. Once again, thanks for your response.
                            • 11. Re: Remove all the special characters using java.util.regex
                              843789
                              AnanthJava wrote:
                              Thanks jverd, you understood correctly. However, [^normal] strips the digits as well and keeps only the letters, while my String[] should accept alphanumeric characters.

                              Hence for a String[] str = {"raghav!@#$%^&*()_+()-;<>/ 999"};

                              the output with [^normal] is raghav; it is stripping the digits as well. The actual output should be raghav999. This is my problem.
                              Any thoughts on this are welcome. Once again, thanks for your response.
                              If I understand your problem then this is just rubbish. You seem to be mixing your regular expression and your string to be processed.
                              • 12. Re: Remove all the special characters using java.util.regex
                                699554
                                String[] str = {"raghav!@#$%^&()"};
                                here i want the output to be raghav for the entered string array
                                Reply 2
                                "Raghav!@#$% 999_+<>:'./"
                                the output should be:
                                Raghav999 (without spaces).
                                Reply 2
                                Thank you guys... My problem here is that replaceAll() will replace only the special characters that you specify in the method for a particular string. In my case, I will be getting a String[] which may have 10 special characters in it, or only 3. Because the input is dynamic, I need to write the code in such a way that whatever the special characters may be, the system removes them.
                                Reply 2, iterate over the array calling the method defined in f__king Reply 2
                                Hence for a String[] str = {"raghav!@#$%^&*()_+()-;<>/ 999"};
                                the output with [^normal] is raghav; it is stripping the digits as well. The actual output should be raghav999.
                                replaceAll(Pattern.quote("!@#$"), "") will replace only the special characters !@#$. If your String[] has "!@#$%", this replaceAll() may not work.
                                Reply 2

                                Mel
                                • 13. Re: Remove all the special characters using java.util.regex
                                  3004
                                  AnanthJava wrote:
                                  Thanks jverd, you understood correctly. However, [^normal] strips the digits as well and keeps only the letters, while my String[] should accept alphanumeric characters.
                                  No.

                                  [^normal] will discard everything that is not part of what you define as "normal."

                                  You can either specify all the characters you want to keep, or all the characters you want to discard. Which one is easier depends on your specific requirements. Have you actually even looked at a regex tutorial, the docs for the Pattern class, or the other answers you've been given here? There are a lot of character classes that are very easy to express besides just [a-zA-Z]. I don't know why you think "normal" has to mean only that.

                                  For instance, it's easy to specify "keep all alphanumerics (letters and digits)" or "keep all printable characters" or "discard all punctuation and whitespace."
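As a rough illustration of those three options, the sketch below uses the POSIX character classes documented for java.util.regex.Pattern (\p{Alnum}, \p{Print}, \p{Punct}), which by default cover US-ASCII only. The class and method names are hypothetical.

```java
public class CharClassDemo {
    // Keep all alphanumerics: remove anything that is not an ASCII letter or digit.
    static String keepAlnum(String s) {
        return s.replaceAll("[^\\p{Alnum}]", "");
    }

    // Keep all printable characters: remove anything outside ASCII printables.
    static String keepPrintable(String s) {
        return s.replaceAll("[^\\p{Print}]", "");
    }

    // Discard all ASCII punctuation and all whitespace.
    static String dropPunctAndSpace(String s) {
        return s.replaceAll("[\\p{Punct}\\s]", "");
    }

    public static void main(String[] args) {
        String input = "Raghav!@#$% 999_+<>:'./";
        System.out.println(keepAlnum(input));              // Raghav999
        System.out.println(dropPunctAndSpace(input));      // Raghav999
        System.out.println(keepPrintable("Raghav\t999"));  // Raghav999
    }
}
```

For this input the first and second approaches give the same result; they would differ on characters that are neither ASCII alphanumerics nor ASCII punctuation, such as accented letters.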

                                  The problem here is that you a) have not defined your problem precisely enough and b) have not studied regex thoroughly enough.

                                  >
                                  Hence for a String[] str = {"raghav!@#$%^&*()_+()-;<>/ 999"};

                                  the output with [^normal] is raghav; it is stripping the digits as well. The actual output should be raghav999. This is my problem.
                                  Any thoughts on this are welcome. Once again, thanks for your response.
                                  That is only the case if you take an unnecessarily restrictive definition of "normal."
                                  • 14. Re: Remove all the special characters using java.util.regex
                                    843789
                                    Thanks jverd, [^A-Za-z0-9] does the trick.