I am struggling with a regular expression problem. I am trying to split a paragraph into sentences. I tried using the regex [?!.] as the delimiter, but it also breaks sentences at decimal numbers.
e.g. I am using iphone 3.2 but I am not happy with it.
This sentence was broken into:
I am using iphone 3
2 but I am not happy with it
I tried using the regex [.!?]^[[0-9].[0-9]] to exclude every '.' that has a digit before and after it, but it still failed. I have just started learning regex, so I am not sure what the right way to do this is. Any suggestions?
As sabre150 already suggested, you could split on one of [.?!] followed directly by a whitespace character. However, tokenizing English (or any other human language) cannot be done reliably with a simple split(...): an abbreviation such as "Mr." also ends in a period followed by a space, for example.
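To make the suggestion concrete, here is a minimal sketch of that kind of split. The class name SentenceSplitter and the method splitSentences are made up for this example. It is also a small variation on the idea above: instead of splitting on the punctuation itself, it uses a lookbehind so that the split happens on the whitespace that follows [.?!], which keeps the punctuation attached to each sentence.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class, written only to illustrate the idea.
public class SentenceSplitter {

    // Splits on runs of whitespace that come directly after '.', '?' or '!'.
    // The lookbehind (?<=[.?!]) checks the preceding character without
    // consuming it, so the punctuation stays with its sentence.
    // The '.' in "3.2" is followed by a digit, not whitespace, so it is left alone.
    static List<String> splitSentences(String text) {
        return Arrays.asList(text.trim().split("(?<=[.?!])\\s+"));
    }

    public static void main(String[] args) {
        String text = "I am using iphone 3.2 but I am not happy with it. "
                + "Is it any better? No!";
        for (String sentence : splitSentences(text)) {
            System.out.println(sentence);
        }
    }
}
```

With the sample text this should keep "3.2" intact and produce three sentences. It will still split in the wrong place after abbreviations ("Mr. Smith" becomes two pieces), which is the kind of case a plain split cannot handle.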