9 Replies Latest reply on May 19, 2010 11:33 AM by PhHein

    Flying Saucer encoding problem

    807580
      Hi, I have a problem with encoding when creating a PDF using Flying Saucer. I have an XHTML document in UTF-8, and because I'm from the Czech Republic, I need to use several specific extended characters. The problem is that Flying Saucer's default encoding is Latin1 (ISO-8859-1), which doesn't support these characters. In the FS documentation I read that you can add specific fonts to support a specific encoding by adding the font to the font resolver of the ITextRenderer class:

      ITextRenderer renderer = new ITextRenderer();
      // getFontResolver() returns an ITextFontResolver, the type that declares addFont()
      ITextFontResolver resolver = renderer.getFontResolver();
      resolver.addFont(
          "C:\\WINNT\\Fonts\\ARIALUNI.TTF",
          BaseFont.IDENTITY_H,
          BaseFont.NOT_EMBEDDED
      );

      where "C:\\WINNT\\Fonts\\ARIALUNI.TTF" is the font file, "BaseFont.IDENTITY_H" specifies the encoding (in this case Identity-H, a Unicode encoding rather than Latin1), and "BaseFont.NOT_EMBEDDED" is another attribute that is not important for now.

      But even if I add the font this way, FS still uses the Latin1 encoding and these characters are displayed incorrectly. Does anyone know how to tell FS to use UTF-8 instead of the default encoding? Is there anything more I have to set? I searched the internet and the whole documentation for ages but wasn't able to find anything :( Any help is appreciated. Thanks.
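For context, here is a minimal sketch (plain JDK, no Flying Saucer involved) of why ISO-8859-1 is a problem for Czech text: it asks each charset whether it can encode a few Czech letters. The class name `CharsetCheck` and the helper `canEncodeAll` are made up for illustration and are not part of any library.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class CharsetCheck {
    // Returns true if every character of the text can be encoded in the given charset.
    static boolean canEncodeAll(String text, Charset charset) {
        return charset.newEncoder().canEncode(text);
    }

    public static void main(String[] args) {
        // Czech letters r-caron, e-caron, s-caron, c-caron, z-caron, written as
        // Unicode escapes so the encoding of this source file does not matter.
        String czech = "\u0159\u011B\u0161\u010D\u017E";
        System.out.println("ISO-8859-1: " + canEncodeAll(czech, StandardCharsets.ISO_8859_1));
        System.out.println("UTF-8:      " + canEncodeAll(czech, StandardCharsets.UTF_8));
    }
}
```

This only illustrates the charset limitation described above; it says nothing about how Flying Saucer itself selects fonts or encodings.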
        • 1. Re: Flying Saucer encoding problem
          807580
          The name made me curious, so I googled the project web site. If you use their search facility, there appear to be workarounds:
          https://xhtmlrenderer.dev.java.net/servlets/Search?scope=project&resultsPerPage=40&query=encoding&Button=Go
          • 2. Re: Flying Saucer encoding problem
            807580
            Yes, thanks pm_kirkham, but I had already searched the project website before. The only useful info I managed to find was this (quoting from the site):

            -----
            ...
            Additional comments from mabiss Thu Nov 16 15:03:26 +0000 2006

            Also managed to use UTF-8. To achieve that, based on the given fix, i only
            changed my JSP to output UTF-8 XHTML and changed the code in the filter to:

            [[[
            ITextRenderer renderer = new ITextRenderer();
            renderer.getFontResolver().addFont("C:\\WINNT\\Fonts\\ARIALUNI.TTF",
                "UTF-8", BaseFont.NOT_EMBEDDED);
            ]]]

            Looks like I had to use a Unicode font with UTF-8 (well, that may be inaccurate
            since I have poor knowledge of how fonts, PDF and Unicode/UTF-8 work).

            Now I can render arbitrary XHTML to PDF with any language in it as long as it's
            UTF-8. Great work guys, xhtmlrenderer rocks. Thanks a million!
            Additional comments from peterbrant Sat Apr 14 19:31:25 +0000 2007
            -----

            I'm using identical code, I have correctly specified the Unicode font, and I'm sure the input file is in UTF-8 encoding, but I'm still getting wrong results (FS still uses Latin1). I have no idea what the problem could be...
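One way to double-check the "input file is in UTF-8" assumption is to decode the file's bytes strictly and see whether the decoder reports an error. This is only a sketch using the plain JDK; the class name `Utf8Check` and the method `isValidUtf8` are hypothetical names chosen for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

class Utf8Check {
    // Decodes the bytes strictly as UTF-8; any malformed sequence makes this return false.
    static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            System.out.println("usage: java Utf8Check <path-to-xhtml-file>");
            return;
        }
        byte[] bytes = Files.readAllBytes(Paths.get(args[0]));
        System.out.println(isValidUtf8(bytes) ? "well-formed UTF-8" : "not well-formed UTF-8");
    }
}
```

A positive result only means the bytes are well-formed UTF-8 (a pure ASCII file passes too); a negative result suggests the file was saved in another encoding, such as ISO-8859-2 or windows-1250.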
            • 3. Re: Flying Saucer encoding problem
              807580
              solved
              • 4. Re: Flying Saucer encoding problem
                807580
                solved
                If you thought it appropriate to post the problem here, then you might consider posting the solution as well.
                • 5. Re: Flying Saucer encoding problem
                  807580
                  I've got problems with non-Latin characters (I get empty spaces in their place). Can you show me your sample code, and the XHTML too?
                  Here is my XHTML:
                  <?xml version="1.0" encoding="UTF-8"?>
                  <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
                     "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
                  <html xmlns="http://www.w3.org/1999/xhtml">
                  <head>
                    <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
                  </head>
                  <body style="font-size:15px">
                  <br />
                  
                  <center  style="font-size:18px">
                  <b><i>&#1057;&#1077;&#1074;&#1077;&#1088;&#1086;-&#1047;&#1072;&#1087;&#1072;&#1076;&#1085;&#1086;&#1077; &#1059;&#1087;&#1088;&#1072;&#1074;&#1083;&#1077;&#1085;&#1080;&#1077;<br /></i></b></center>
                  </body>
                  </html>
                  Here is my Java code:
                  String inputFile = "reports/1.xhtml";
                  String url = new File(inputFile).toURI().toURL().toString();

                  String outputFile = "1.pdf";
                  OutputStream os = new FileOutputStream(outputFile);

                  ITextRenderer renderer = new ITextRenderer();
                  renderer.getFontResolver().addFont("c:\\windows\\Fonts\\ARIALUNI.TTF", "UTF-8", BaseFont.NOT_EMBEDDED);

                  renderer.setDocument(url);

                  renderer.layout();
                  renderer.createPDF(os);

                  os.close();
                  Edited by: Gellert on Mar 15, 2008 4:33 AM

                  Edited by: Gellert on Mar 15, 2008 5:21 AM
                  • 6. Re: Flying Saucer encoding problem
                    807580
                    Hi!
                    Insert a style definition in the XHTML:
                    <style type="text/css">
                               name
                               {
                                   font-family: "Arial Unicode MS";
                               }
                    </style>
                    and use this element inside the <body> tag. You must also change this line of Java code:
                    renderer.getFontResolver().addFont(
                             "c:\\windows\\Fonts\\ARIALUNI.TTF",
                             "UTF-8",
                             BaseFont.NOT_EMBEDDED);
                    to
                    renderer.getFontResolver().addFont(
                             "c:\\windows\\Fonts\\ARIALUNI.TTF", 
                             BaseFont.IDENTITY_H, 
                             BaseFont.EMBEDDED);
                    The XHTML may look like this:
                    <?xml version="1.0" encoding="UTF-8"?>
                    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
                       "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
                    <html xmlns="http://www.w3.org/1999/xhtml">
                    <head>
                      <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
                      <style type="text/css">
                               name
                               {
                                   font-family: "Arial Unicode MS";
                               }
                      </style>
                    </head>
                    <body style="font-size:15px">
                      <name>
                        <br /> 
                        <center  style="font-size:18px">
                        <b><i>Северо-Западное Управление<br /></i></b></center>
                      </name>
                    </body>
                    </html>
                    and Java code may look like this:
                    String inputFile = "reports/1.xhtml";
                    String url = new File(inputFile).toURI().toURL().toString();
                     
                    String outputFile = "1.pdf";
                    OutputStream os = new FileOutputStream(outputFile);
                     
                    ITextRenderer renderer = new ITextRenderer();
                    renderer.getFontResolver().addFont("c:\\windows\\Fonts\\ARIALUNI.TTF", BaseFont.IDENTITY_H, BaseFont.EMBEDDED); 
                     
                    renderer.setDocument(url);
                     
                    renderer.layout();
                    renderer.createPDF(os);
                     
                    os.close();
                    Detailed information can be found here: [http://lepetitmonster.blogspot.com/2008/10/google-jmesaflying-sauceritext.html] (you can use Google translate, if you need)
                    • 7. Re: Flying Saucer encoding problem
                      807580
                      I've got a different problem and I wonder if someone could help me.

                      I am parsing XHTML documents and creating PDFs using Flying Saucer. The problem is that the ITextRenderer seems to completely ignore &#8804 (the less-than-or-equal-to symbol). The same goes for some other symbols, such as &#8805.

                      Are these tags supported? Is there something I should do before parsing in order to get the correct result? Or are these tags unsupported, so that the only solution is to parse the XHTML manually and then feed the result to the ITextRenderer?

                      Thanks in advance
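For reference, in XML a decimal numeric character reference has the form `&#NNNN;`, including the trailing semicolon, and 8804 and 8805 are the code points U+2264 and U+2265. The sketch below (plain JDK; the class `NumericRefs` and the method `decode` are hypothetical names for illustration) shows which characters such references denote. Whether a missing semicolon or a font without these glyphs is the actual cause here is not confirmed anywhere in this thread.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NumericRefs {
    // Matches well-formed decimal references only: "&#", digits, then ";".
    private static final Pattern DECIMAL_REF = Pattern.compile("&#(\\d{1,7});");

    // Replaces each decimal reference with the character it denotes.
    // References to invalid code points are left unchanged.
    static String decode(String text) {
        Matcher m = DECIMAL_REF.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            int codePoint = Integer.parseInt(m.group(1));
            String replacement = Character.isValidCodePoint(codePoint)
                    ? new String(Character.toChars(codePoint))
                    : m.group();
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String decoded = decode("a &#8804; b, c &#8805; d");
        // Print code points rather than glyphs so the output does not depend on the console encoding.
        decoded.codePoints().forEach(cp -> System.out.printf("U+%04X ", cp));
        System.out.println();
    }
}
```

Note that the sketch deliberately leaves a reference without the closing semicolon untouched, since that is not a well-formed reference.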
                      • 8. Re: Flying Saucer encoding problem
                        807580
                        Has anyone had a problem with bold text when the font is different from the default?

                        For example
                        renderer.getFontResolver().addFont("fonts\\TAHOMA.TTF", BaseFont.IDENTITY_H, BaseFont.NOT_EMBEDDED);
                        // or even
                        renderer.getFontResolver().addFont("fonts\\TAHOMA.TTF", true);

                        and in the document

                        <font style=\"letter-spacing: 0px; color: #000000; font-size: 10px; font-family: Tahoma; font-weight:bold; \"><strong>bold test</strong></font>
                        
                        It just won't get bold. It works fine with the default font...
                        • 9. Re: Flying Saucer encoding problem
                          PhHein
                          Welcome to the forum. Please don't post in threads that are long dead and don't hijack other threads. When you have a question, start your own topic. Feel free to provide a link to an old post that may be relevant to your problem.

                          I'm locking this thread now.