Skip to Main Content

SQL & PL/SQL

Announcement

For appeals, questions and feedback about Oracle Forums, please email oracle-forums-moderators_us@oracle.com. Technical questions should be asked in the appropriate category. Thank you!

Interested in getting your voice heard by members of the Developer Marketing team at Oracle? Check out this post for AppDev or this post for AI focus group information.

using regexp_replace to remove html tags and  

Gor_MahiaMar 10 2014 — edited Mar 11 2014

Hi,

I am trying to use regular expression to remove any html tags/  from a string replacing them with nothing as shown below,

sample= if i enter "hello to the world of<u><p><br>  apex whats coming up" i should get this==> "hello to the world of apex whats coming up"

consider query as, select regexp_replace(string, any html tags/ , 'i') from dual,


so how do i write the expression?

oracle 11g rel2,

thank you.

This post has been answered by KarK on Mar 11 2014
Jump to Answer

Comments

Alessandro Rossi

A replace executes with same order of complexity as the full string scann. So there is no way to do this job with a simpler method than the replace.

This one should do it anyway.

WITH T AS (
	SELECT '< HTML >< P >FIRST ONE< /P >SECOND ONE< /HTML >' AS PAGE FROM DUAL
)
SELECT REGEXP_REPLACE(PAGE,'<[^>]*>',' ') FROM T

Bye Alessandro

Message was edited by:
Alessandro Rossi

The html tags give some problems I don't know how to edit the message.

433062
Hi,

Thanks for your response. I have hundreds of syntax in that HTML text. it would be hard to write syntax since they are dynamic also..

I was thinking their might be some package in oracle which will help me to remove all tags and just return the text between that HTML tags.

Thanks for your reponse though..

Anuj Sharma
Kamal Kishore

if it is a well formed HTML (XML) :

SQL> select xmltype('<HTML><BODY bgColor="#ffffff"> This is the text i want to parse.</BODY></HTML>').extract('/HTML/BODY/text()').getStringVal() from dual ;

XMLTYPE('<HTML><BODYBGCOLOR="#FFFFFF">THISISTHETEXTIWANTTOPARSE.</BODY></HTML>')
--------------------------------------------------------------------------------
 This is the text i want to parse.

1 row selected.

SQL>
muttleychess

When the html is saved in Clob column , I tried to use xmltype like below

select   xmltype(x.ds_corpo).extract('/HTML/BODY/text()').getStringVal() 
  from hist_wkftb_mensagem x
  where x.ds_assunto like '%17016%'
  and x.nu_mensagem =14251148
  order by x.dh_geracao desc

return error

ORA-31011: XML parsing failed
ORA-19202: Error occurred in XML processing
LPX-00243: element attribute value must be enclosed in quotes
Error at line 1
ORA-06512: at "SYS.XMLTYPE", line 0
ORA-06512: at line 1
SomeoneElse
(dude, you responded to a 6 year old thread)
BluShadow
Regardless of the fact you've dragged up a 6 year old thread... the issue is that the XML in your CLOB is not valid XML. You have attributes in an element that have values that are not surrounded by double quotes e.g.

You have:
<myelement amount=32>
rather than
<myelement amount="32">
Solution: Correct your XML.
muttleychess

thank you all, but my code is only html no xml , and is other application that save in table

<html><head><title>Aprovação de ARC</title></head><body><font face=arial size=2><b>974-17016/ugadiego-2013</b></font><br><br><table border=0><tr><td><b><font face=arial size=1>Data da Abertura</font></b></td>    <td><font face=arial size=1>8/3/2013</font></td><tr><td><b><font face=arial size=1>Quebra Produtividade</font></b></td>    <td><font face=arial size=1>Sim</font></td><tr><td><b><font face=arial size=1>Quantidade</font></b></td>    <td><font face=arial size=1>17,5</font></td><tr><td><b><font face=arial size=1>Valor</font></b></td>    <td><font face=arial size=1>R$ 17496</font></td><tr><td><b><font face=arial size=1>Forma de Indenização</font></b></td>    <td><font face=arial size=1>Nota de Crédito</font></td><tr><td><b><font face=arial size=1>Observação</font></b></td>    <td><font face=arial size=1>Evidenciado a não conformidade do produto em visita a cliente pela assessoria agronômica e qualidade.
Produto apresenta-se empedrado com desuniformidade de grânulos e por consequência geração de finos e falha de óleo.
Produto expedido com GDAP.

Bonificar o cliente em 10% do valor da compra = R$ 17.496,00 ou em toneladas e fertilizantes  que podem ficar em forma de crédito para o cliente retirar em fertilizante para o plantio  da soja. Conforme relatório do Sr. Ademilson Palharin em anexo.</font></td><tr><td><b><font face=arial size=1>Centro de Custo</font></b></td>    <td><font face=arial size=1>CAS1I4671 - MISTURA E ENSAQUE I                     </font></td></table><hr><font face=arial size=2><b>Favor incluir uma Observação (Se necessário) e selecionar o botão desejado para aprovar ou reprovar essa Indenização.</b></font><FORM ACTION='http://10.176.10.123/pgAprovaARCServidor.asp' METHOD='GET' ><font face=arial size=2><div>Observações:</div><textarea name='txtObs' rows='4' cols='60' maxlength='4000'></textarea><br><br><div><input type='submit' value='Aprovar'  name='acao'> <input type='submit' value='Reprovar' name='acao'></div></font><br><hr><font face=arial size=2 >Essa é uma mensagem automática.<br>Favor não responder esse email</font><hr><input type='hidden' name='cdARC' value='17016' ><input type='hidden' name='cdSeq' value='1' ><input type='hidden' name='cdFase' value='Indenizacao' ><input type='hidden' name='dsResp' value='ustrenat' ><input type='hidden' name='dsCargo' value='Vice Presidência' ><input type='hidden' name='dsSolic' value='LESIANE CIESLAK' ><input type='hidden' name='index' value='3' ><input type='hidden' name='rowatu' value='3' ></FORM></body></html>

using oracle 9.2.08

Edited by: muttleychess on Mar 19, 2013 11:36 AM

BluShadow
muttleychess wrote:
thank you all, but my code is only html no xml , and is other application that save in table
HTML is not XML. It may look similar but it's not the same standard. You are trying to turn HTML into XML and clearly, as you've found out, you cannot do that in all cases.
1 - 8
Locked Post
New comments cannot be posted to this locked post.

Post Details

Locked on Apr 8 2014
Added on Mar 10 2014
5 comments
36,717 views