3 Replies Latest reply: Jan 6, 2013 4:02 AM by Bill Shannon-Oracle

    Get content from e-mail with text/html as content type

    981862
      Hey there,

      I am working on an e-mail application in Java. The application checks a mailbox every few minutes and loops through the unread e-mails looking for certain subjects. If a subject is found, I want to retrieve the content of that e-mail. Retrieving the content works fine with e-mails sent from Gmail, Outlook (desktop client), and Hotmail.

      However, when I try to get the content of an e-mail sent from the Office 365 web client, the part has content type text/html. I printed the content and found that it consists of HTML code, but the HTML is not in a usable format:


      <html dir=3D"ltr">
      <head>
      <meta http-equiv=3D"Content-Type" content=3D"text/html; charset=3Diso-8859-=
      1">
      <style type=3D"text/css" id=3D"owaParaStyle"></style>
      </head>
      <body fpstyle=3D"1" ocsi=3D"0">
      <div style=3D"direction: ltr;font-family: Tahoma;color: #000000;font-size: =
      10pt;">
      <div><span style=3D"font-family: 'Segoe UI', Helvetica, Arial, sans-serif;"=
      >Geachte heer/mevrouw,</span><br style=3D"font-family: 'Segoe UI', Helvetic=
      a, Arial, sans-serif;">
      <br style=3D"font-family: 'Segoe UI', Helvetica, Arial, sans-serif;">
      <span style=3D"font-family: 'Segoe UI', Helvetica, Arial, sans-serif;">Wij =
      hebben uw inzending ontvangen en gecontroleerd. Hierbij het verslag van</sp=
      an><br style=3D"font-family: 'Segoe UI', Helvetica, Arial, sans-serif;">
      <span style=3D"font-family: 'Segoe UI', Helvetica, Arial, sans-serif;">de c=
      ontrole.</span><br style=3D"font-family: 'Segoe UI', Helvetica, Arial, sans=
      -serif;">

      Note the = symbols.

      When I get the content of e-mails sent from Gmail, Outlook, or Hotmail, I get text/plain as the content type: just text, with no HTML or stray symbols.

      How can I solve this? I tried parsing the content with Jsoup, but the stray = symbols cause problems.

      Any help is appreciated,

      Thanks!
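A note on the = symbols: the `=3D` sequences and the `=` at the end of lines are quoted-printable transfer encoding (RFC 2045). `=3D` is an encoded `=`, and a line ending in `=` is a soft line break that must be joined with the next line. JavaMail normally undoes this encoding when content is read through `getContent()` or `getInputStream()`, and with JavaMail on the classpath `MimeUtility.decode(stream, "quoted-printable")` does the same for a raw stream. The sketch below is only a standard-library illustration of what that decoding involves. The class name `QuotedPrintableSketch` and its `decode` method are hypothetical and not part of JavaMail, and the ISO-8859-1 charset is taken from the `<meta>` tag in the dump above.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not part of JavaMail.
final class QuotedPrintableSketch {

    // Decodes quoted-printable text into a String using the given charset.
    static String decode(String encoded, Charset charset) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int n = encoded.length();
        int i = 0;
        while (i < n) {
            char c = encoded.charAt(i);
            if (c != '=') {
                out.write(c);                      // literal character
                i++;
            } else if (i + 1 < n && encoded.charAt(i + 1) == '\n') {
                i += 2;                            // soft line break "=\n"
            } else if (i + 2 < n && encoded.charAt(i + 1) == '\r'
                    && encoded.charAt(i + 2) == '\n') {
                i += 3;                            // soft line break "=\r\n"
            } else if (i + 2 < n
                    && Character.digit(encoded.charAt(i + 1), 16) >= 0
                    && Character.digit(encoded.charAt(i + 2), 16) >= 0) {
                int hi = Character.digit(encoded.charAt(i + 1), 16);
                int lo = Character.digit(encoded.charAt(i + 2), 16);
                out.write((hi << 4) | lo);         // "=XX" is one encoded byte
                i += 3;
            } else {
                out.write(c);                      // malformed escape: keep "="
                i++;
            }
        }
        return new String(out.toByteArray(), charset);
    }

    public static void main(String[] args) {
        // Two lines shaped like the dump in the post, shortened for the example.
        String sample = "<html dir=3D\"ltr\">\r\n"
                + "<br style=3D\"font-family: Arial, sans=\r\n-serif;\">";
        System.out.println(decode(sample, StandardCharsets.ISO_8859_1));
    }
}
```

After decoding, the text is ordinary HTML that a parser such as Jsoup can handle.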
        • 1. Re: Get content from e-mail with text/html as content type
          Bill Shannon-Oracle
          This was answered in your stackoverflow post:
          http://stackoverflow.com/questions/14066910/parse-text-html-data-with-javamail
          • 2. Re: Get content from e-mail with text/html as content type
            981862
            bshannon wrote:
            This was answered in your stackoverflow post:
            http://stackoverflow.com/questions/14066910/parse-text-html-data-with-javamail
            I know, but isn't there another way to get the content of a part whose content type is "text/html"?

            part.writeTo(OutputStream) returns the whole raw message, including headers, and it also prints line-break characters. For other parts I know you can just call part.getContent() to get the content.

            part.getInputStream() doesn't seem to work for me: I get an empty line when I print the stream.
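One thing that may be worth ruling out here, as an assumption only since the thread does not show how the stream was printed: if only the first line of a stream is read and the content happens to start with a blank line, the output is a single empty line even though data follows. The sketch below drains an `InputStream` completely before converting it to text. `StreamDrainSketch` and `readAll` are hypothetical names, and the in-memory stream in `main` merely stands in for whatever `part.getInputStream()` returns.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not part of JavaMail.
final class StreamDrainSketch {

    // Reads the stream until end-of-stream and decodes the bytes with the given charset.
    static String readAll(InputStream in, Charset charset) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        int n;
        while ((n = in.read(chunk)) != -1) {
            buf.write(chunk, 0, n);
        }
        return new String(buf.toByteArray(), charset);
    }

    public static void main(String[] args) throws IOException {
        // Stand-in data that begins with a blank line, followed by real content.
        byte[] data = "\r\n<html>\r\n</html>\r\n".getBytes(StandardCharsets.ISO_8859_1);
        try (InputStream in = new ByteArrayInputStream(data)) {
            System.out.println(readAll(in, StandardCharsets.ISO_8859_1));
        }
    }
}
```

If the fully drained stream is still empty, the cause lies elsewhere and a protocol trace is the more useful next step.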

            I only need the HTML part of the message. I tried to remove all headers by doing the following:

            Enumeration headers = part.getAllHeaders();
            while (headers.hasMoreElements()) {
                Header h = (Header) headers.nextElement();
                System.out.println(h.getName() + ": " + h.getValue());
                part.removeHeader(h.getName());
            }

            But I get the following exception: javax.mail.IllegalWriteException: IMAPMessage is read-only, even though I opened the folder in read-write mode. I really don't know how to continue with this.
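The headers and content of a message fetched over IMAP cannot be changed in place, whatever mode the folder was opened in. When a modifiable message is really needed, the usual JavaMail route is to copy it with `new MimeMessage(message)`. For simply dropping the headers from `writeTo` output, no modification is needed: in the raw format the header block ends at the first empty line. The sketch below splits on that line using only the standard library. `RawPartSplitSketch` and `bodyOf` are hypothetical names, and the input is assumed to be the `writeTo` output already converted to a `String`.

```java
// Hypothetical helper for illustration only; not part of JavaMail.
final class RawPartSplitSketch {

    // Returns everything after the first empty line, or "" if there is none.
    static String bodyOf(String raw) {
        int crlf = raw.indexOf("\r\n\r\n");   // separator with CRLF line endings
        int lf = raw.indexOf("\n\n");         // separator with bare LF line endings
        if (crlf >= 0 && (lf < 0 || crlf < lf)) {
            return raw.substring(crlf + 4);
        }
        if (lf >= 0) {
            return raw.substring(lf + 2);
        }
        return "";
    }

    public static void main(String[] args) {
        String raw = "Content-Type: text/html; charset=\"iso-8859-1\"\r\n"
                + "Content-Transfer-Encoding: quoted-printable\r\n"
                + "\r\n"
                + "<html dir=3D\"ltr\">\r\n";
        System.out.print(bodyOf(raw));
    }
}
```

The body obtained this way is still in its transfer encoding (quoted-printable in the dump from the first post), so it needs decoding before it is handed to an HTML parser.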

            Help is very much appreciated!

            Edited by: 978859 on 4-jan-2013 6:59

            Edited by: 978859 on 4-jan-2013 7:04
            • 3. Re: Get content from e-mail with text/html as content type
              Bill Shannon-Oracle
              If you're sure the message has data and the getInputStream method isn't returning the data, we'll need to do some debugging.
              Find the JavaMail FAQ and read the debugging section. Then post the protocol trace showing what happens when you use
              getInputStream. Also, try using the msgshow.java demo program to read the message. That will help determine whether there's
              a bug in your code, a bug in the server, or something else is wrong.