iText In Action

In the session “iText in Action”, Bruno Lowagie talks about the new edition of his book.

Bruno started by pointing out that he knows PDF better than he knows Java.
In the first part, Bruno showed in a hands-on session how he created the Devoxx programme guide, using the Devoxx REST interface to retrieve the data and the iText library to create the PDF version.

During the creation of the guide, the viewer learned how to keep track of the page count in order to insert page numbers next to the speakers’ bios, and how to use AcroForms as a kind of template engine. Along the way, Bruno demonstrated that it’s best to use factories for your redundant data and styling.

Throughout the session Bruno used a tool he created to view the PDF document structure (iText RUPS), in which you can really see how a PDF is built up internally. It shows that a PDF is made up of objects, and that redundant objects are best avoided to reduce the file size. When copying content from one PDF to another you can use the fast PdfCopy class or PdfSmartCopy. PdfSmartCopy keeps track of the objects’ hashes, and when equal objects are found it inserts a reference rather than inserting the object twice.
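To make the idea behind that deduplication a bit more concrete, here is a minimal sketch in plain Java. It is not iText code and not how PdfSmartCopy is actually implemented; the class name ObjectDeduplicator, its methods and the choice of SHA-256 are all assumptions made for illustration. It only shows the general technique described above: hash each object, and when the same hash shows up again, hand back the existing object number instead of storing a second copy.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.Base64;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; this is NOT iText's PdfSmartCopy.
public class ObjectDeduplicator {
    // Maps the hash of an object's bytes to the number it was stored under.
    private final Map<String, Integer> numberByHash = new HashMap<>();
    // The objects that were actually stored, in insertion order.
    private final List<byte[]> stored = new ArrayList<>();

    /**
     * Returns the object number to reference. The bytes are stored only
     * the first time a given hash is seen. Equal hashes are treated as
     * equal objects here, which is a simplification.
     */
    public int add(byte[] objectBytes) {
        String key = hash(objectBytes);
        Integer existing = numberByHash.get(key);
        if (existing != null) {
            return existing; // duplicate: reuse the earlier object
        }
        stored.add(objectBytes.clone());
        int number = stored.size(); // numbering starts at 1
        numberByHash.put(key, number);
        return number;
    }

    /** Number of distinct objects that were stored. */
    public int storedCount() {
        return stored.size();
    }

    // SHA-256 is an assumption; any reasonable digest would do for the sketch.
    private static String hash(byte[] data) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            return Base64.getEncoder().encodeToString(digest.digest(data));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 not available", e);
        }
    }

    public static void main(String[] args) {
        ObjectDeduplicator copy = new ObjectDeduplicator();
        byte[] font = "<< /Type /Font /BaseFont /Helvetica >>"
                .getBytes(StandardCharsets.US_ASCII);
        byte[] sameFont = "<< /Type /Font /BaseFont /Helvetica >>"
                .getBytes(StandardCharsets.US_ASCII);
        byte[] otherFont = "<< /Type /Font /BaseFont /Courier >>"
                .getBytes(StandardCharsets.US_ASCII);

        int first = copy.add(font);
        int second = copy.add(sameFont);  // same bytes, same object number
        int third = copy.add(otherFont);  // different bytes, new object

        // prints: 1 1 2 stored=2
        System.out.println(first + " " + second + " " + third
                + " stored=" + copy.storedCount());
    }
}
```

The trade-off is the one hinted at in the session: hashing every object costs extra work per copy, in exchange for a smaller output file when the same fonts or images show up again and again.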

A note about RUPS: it’s not yet officially released, since the RU stands for read and update, and the update functionality is not there yet.

In the second part we got an in-depth overview of the book’s second edition, with Bruno explaining the purpose of each chapter. One focus was the chapter about XFA forms: Bruno showed an example of how these forms allow PDFs to be populated from XML documents that contain your data. Another focus was chapter 12, which deals with protecting your data and adding signatures to your documents, such as the timestamped signatures that are used to certify and validate documents.

For the second edition Bruno wrote the book from scratch. Manning demands that its books can be read from cover to cover. Bruno chose to follow this rule throughout most of the book, but he made an exception for chapter 14. That chapter consists of a series of tables mapping all the graphical and text operators and operands available in PDF to the corresponding methods in iText. Such a chapter is unusual for a Manning book, but Bruno convinced his publisher that it would be very useful for developers.

A really nice chapter is chapter 13, which goes through the PDF spec itself and was reviewed by PDF guru Jim King.

We got an example of how to read and create PDFs with iText (chapter 15), allowing you to recreate layout and data from e.g. XML documents or other PDFs by using PDF tags. You can parse an XML file to convert its content to ‘ordinary’ PDF, but if you also parse the XML for its structure to create a ‘Tagged PDF’, you make the document more accessible, e.g. for blind readers. iText can also convert Tagged PDFs to XML. That feature wasn’t initially planned for the iText 5 release, but it made it in because Bruno wrote the code while creating a demo.

Of course a demonstration of adding Flash components to a PDF for non-static data could not be left out. This enhances a PDF with variable data, allowing a PDF reader to fetch data from e.g. the internet. Naturally this is secured: the reader will ask the user whether a connection may be made. By embedding Flash (which can call JavaScript inside the PDF and vice versa), the documents become really interactive.

To close, Joachim Van der Auwera gave a demo about using iText in Geomajas, a web mapping framework built with Java and GWT that uses iText to output high-resolution maps as PDFs. (On the funny side, Bruno himself also showed off a PDF like that, but his layers could be enabled and disabled. Geomajas can learn from that.)

What do we remember? Euh... this session is worth watching on parleys.com once it’s there, if you want to know what iText is about. And more: BUY the book if you have any plans to use iText. It’s a good reference!!

Thanks, Bruno, for this interesting session.
