iText 1.5 HTML to PDF

Hello all, and welcome to my first blog post.

I recently ran into a big problem: how to use iText 1.5 to turn HTML held in a String into a PDF file. The project I am working on is quite old, and we are migrating it from Java 1.4 to Java 6. As part of that, we are also trying to move to the latest versions of all our jars.

The old iText also had an itextXml library that used a SAXmyHtmlHandler class to handle the content, parsing it with a SAXParser. So I began researching ways to rewrite SAXmyHtmlHandler with the new iText library, since itextXml.jar wasn't available for the new version. The documentation is very weak and short on useful examples. The examples on the lowagie site did not help either, since they write their own parsers for very particular cases rather than the general case that SAXmyHtmlHandler handled.
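For readers who have not seen the SAX style that the old handler was built on, here is a minimal sketch of the callback pattern using only the JDK. It is not the iText class: SaxHandlerSketch and RecordingHandler are hypothetical names, and the handler just records events where the real one would build PDF elements.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class SaxHandlerSketch {

    // Hypothetical handler: records each SAX event as a string
    // instead of adding elements to a PDF document.
    static class RecordingHandler extends DefaultHandler {
        final List<String> events = new ArrayList<String>();

        @Override
        public void startElement(String uri, String localName, String qName, Attributes attrs) {
            events.add("start:" + qName);
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            events.add("end:" + qName);
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // The parser may deliver text in several chunks; skip whitespace-only ones.
            String text = new String(ch, start, length).trim();
            if (text.length() > 0) {
                events.add("text:" + text);
            }
        }
    }

    // Parses well-formed markup held in a String and returns the recorded events.
    static List<String> parse(String xhtml) throws Exception {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        RecordingHandler handler = new RecordingHandler();
        parser.parse(new InputSource(new StringReader(xhtml)), handler);
        return handler.events;
    }

    public static void main(String[] args) throws Exception {
        for (String event : parse("<html><body><p>Cher Monsieur,</p></body></html>")) {
            System.out.println(event);
        }
    }
}
```

Note that the JDK parser used here expects well-formed XML, so markup with an unclosed <br> would make it throw an exception.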

After days of research I discovered HTMLWorker. I had seen it before in the library, but there were no examples explaining how to use it properly. The code is below. It should be put in a void main(String args[]) method. Hope this helps!

// Imports needed at the top of the class file:
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.StringReader;
import java.io.UnsupportedEncodingException;
import com.itextpdf.text.DocumentException;
import com.itextpdf.text.PageSize;
import com.itextpdf.text.html.simpleparser.HTMLWorker;
import com.itextpdf.text.pdf.PdfWriter;

// Body of main(String args[]):
try {

    // Create an A4 document and attach a writer that saves it to disk
    com.itextpdf.text.Document document = new com.itextpdf.text.Document(PageSize.A4);
    PdfWriter pdfWriter = PdfWriter.getInstance(document, new FileOutputStream("D://testpdf.pdf"));
    document.open();
    document.addAuthor("Author of the Doc");
    document.addCreator("Creator of the Doc");
    document.addSubject("Subject of the Doc");
    document.addCreationDate();
    document.addTitle("This is the title");

    // Old approach, kept for reference (itextXml.jar is not available any more):
    //SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
    //SAXmyHtmlHandler shh = new SAXmyHtmlHandler(document);

    // HTMLWorker parses the HTML and adds the resulting elements to the document
    HTMLWorker htmlWorker = new HTMLWorker(document);
    String str = "<html><head><title>titlu</title></head><body><table><tr><td><p style='font-size: 10pt; font-family: Times'>" +
            "Cher Monsieur,</p><br><p align='justify' style='text-indent: 2em; font-size: 10pt; font-family: Times'>" +
            "asdasdasdsadas<br></p><p align='justify' style='text-indent: 2em; font-size: 10pt; font-family: Times'>" +
            "En vous remerciant &agrave; nouveau de la confiance que vous nous t&eacute;moignez,</p>" +
            "<br><p style='font-size: 10pt; font-family: Times'>Bien Cordialement,<br>" +
            "<br>ADMINISTRATEUR ADMINISTRATEUR<br>Ligne directe : 04 42 91 52 10<br>Acadomia&reg; – " +
            "37 BD Aristide Briand  – 13100 Aix en Provence  </p></td></tr></table></body></html>";
    htmlWorker.parse(new StringReader(str));

    // Closing the document flushes the PDF to the file
    document.close();

} catch (DocumentException e) {
    e.printStackTrace();
} catch (FileNotFoundException e) {
    e.printStackTrace();
} catch (UnsupportedEncodingException e) {
    e.printStackTrace();
} catch (IOException e) {
    e.printStackTrace();
}