5 Replies Latest reply: Jul 18, 2007 3:23 AM by 807605

Finding URLs using regular expression.

807605 Newbie
I have a requirement where the user types some text containing URLs, such as "Please visit this site http://www.google.com/e/qHvQcWco`~!@#$%^&*()-7747. Thank you". This text has to be modified as shown below before it is saved to the database.

"Please visit this site <a href='http://www.google.com/e/qHvQcWco`~!@#$%^&*()-7747'>http://www.google.com/e/qHvQcWco`~!@#$%^&*()-7747</a>. Thank you"

I am using the regular expression (http|https)://.+?\\s, which marks the end of the URL with a whitespace character. This pattern doesn't work if the URL is at the end of the string, since there is no whitespace after it.

For example, if the string is "Please visit this site http://www.google.com/e/qHvQcWco`~!@#$%^&*()-7747", the regex fails to match.
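The behavior described above can be reproduced with a small sketch. The class name `TrailingSpaceDemo` and the helper `findsUrl` are made up for illustration; only the pattern comes from my code.

```java
import java.util.regex.Pattern;

public class TrailingSpaceDemo {
    // The pattern from the post: the lazy .+? must be followed by a whitespace character.
    static final Pattern URL_PATTERN =
            Pattern.compile("(http|https)://.+?\\s", Pattern.CASE_INSENSITIVE);

    // Hypothetical helper: reports whether the pattern finds any URL in the text.
    public static boolean findsUrl(String text) {
        return URL_PATTERN.matcher(text).find();
    }

    public static void main(String[] args) {
        String url = "http://www.google.com/e/qHvQcWco`~!@#$%^&*()-7747";
        // Whitespace follows the URL, so the pattern matches.
        System.out.println(findsUrl("Please visit this site " + url + ". Thank you")); // true
        // The URL ends the string, so the required \s never matches.
        System.out.println(findsUrl("Please visit this site " + url));                 // false
    }
}
```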

My actual problem is to find the URL irrespective of its position within the string.
 
Pattern urlPattern = Pattern.compile("(http|https)://.+?\\s", Pattern.CASE_INSENSITIVE);
Matcher matcher = urlPattern.matcher(plainText);
Map<Integer, String> stringIndexMap = new HashMap<Integer, String>();

// Search the input string for urlPattern.
while (matcher.find()) {
    String urlString = matcher.group();
    // Store each URL in the map, keyed by its start index.
    stringIndexMap.put(Integer.valueOf(matcher.start()), urlString.trim());
}

// Assumed declaration (not shown in the original snippet): a mutable copy of the input.
StringBuffer clickableURLString = new StringBuffer(plainText);

// Iterate over the URLs collected above.
for (String urlString : stringIndexMap.values()) {
    // Replace the URL in the input text with
    // <a href="#" onclick="window.open('<urlString>')">, located by string index.
    int start = clickableURLString.indexOf(urlString);
    clickableURLString.replace(start, start + urlString.length(),
            "<a href=\"#\" onclick=\"window.open('" + urlString
            + "')\">" + urlString + "</a>");
}

return clickableURLString.toString();
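
One possible direction, as a minimal sketch rather than a definitive fix: replace the trailing \\s with a lookahead that accepts either whitespace or the end of the input, and build the result with Matcher.appendReplacement instead of index arithmetic. The class name `UrlLinker` and method `linkify` are hypothetical. The sketch also assumes that a single trailing sentence punctuation mark (. , ; : ! ?) is not part of the URL, and it emits the simple <a href='...'> form from the example at the top of the post.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class UrlLinker {
    // Assumption: a URL runs up to the next whitespace or the end of the input,
    // and one trailing punctuation mark (. , ; : ! ?) before that point is excluded.
    private static final Pattern URL_PATTERN = Pattern.compile(
            "https?://\\S+?(?=[.,;:!?]?(?:\\s|$))", Pattern.CASE_INSENSITIVE);

    // Hypothetical helper: wraps every URL found in plainText in an anchor tag.
    public static String linkify(String plainText) {
        Matcher matcher = URL_PATTERN.matcher(plainText);
        StringBuffer result = new StringBuffer();
        while (matcher.find()) {
            String url = matcher.group();
            String anchor = "<a href='" + url + "'>" + url + "</a>";
            // quoteReplacement stops '$' and '\' in the URL from being
            // interpreted as group references in the replacement string.
            matcher.appendReplacement(result, Matcher.quoteReplacement(anchor));
        }
        matcher.appendTail(result);
        return result.toString();
    }

    public static void main(String[] args) {
        System.out.println(linkify(
                "Please visit this site http://www.google.com/e/qHvQcWco`~!@#$%^&*()-7747"));
    }
}
```

Because the lookahead consumes nothing, the match no longer needs trim(), and a URL at the very end of the string is found like any other. The sketch does not HTML-escape the URL, so a URL containing a single quote would break the href attribute.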