I wanted to share what I recently learned
about creating PDFs with the PDFBox Java API. The technical details are
documented below.
The Apache PDFBox library is an open source
Java tool for working with PDF documents. This project allows creation of new
PDF documents, manipulation of existing documents and the ability to extract
content from documents.
Features:
Create PDFs:
Create a PDF from scratch, with embedded fonts
and images.
Signing Digitally:
Sign PDF files.
Print:
Print a PDF file using the
standard Java printing API.
Preflight:
Validate PDF files against
the PDF/A-1b standard.
Fill Forms:
Extract data from PDF forms
or fill a PDF form.
Split & Merge:
Split a single PDF into many
files or merge multiple PDF files.
Extract Text:
Extract Unicode text from PDF files.
Save as Image:
Save PDFs as image files, such as PNG or JPEG.
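As a rough illustration of the "Create PDFs" and "Extract Text" features listed above, here is a minimal sketch. It assumes the PDFBox 2.0.x API (in 3.x, loading moved to the Loader class and the standard fonts are constructed differently). The class name PdfBoxSketch and its two helper methods are made up for this example and are not part of PDFBox.

```java
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.PDPageContentStream;
import org.apache.pdfbox.pdmodel.font.PDType1Font;
import org.apache.pdfbox.text.PDFTextStripper;

import java.io.ByteArrayOutputStream;
import java.io.IOException;

// Hypothetical helper class, written against PDFBox 2.0.x (an assumption).
final class PdfBoxSketch {

    // Create a one-page PDF containing a single line of text, held in memory.
    static byte[] createPdf(String text) throws IOException {
        try (PDDocument doc = new PDDocument()) {
            PDPage page = new PDPage();
            doc.addPage(page);

            // The content stream must be closed before the document is saved.
            try (PDPageContentStream cs = new PDPageContentStream(doc, page)) {
                cs.beginText();
                cs.setFont(PDType1Font.HELVETICA, 12);
                cs.newLineAtOffset(72, 700); // position in points from the bottom-left corner
                cs.showText(text);
                cs.endText();
            }

            ByteArrayOutputStream out = new ByteArrayOutputStream();
            doc.save(out);
            return out.toByteArray();
        }
    }

    // Load a PDF from memory and return the text of all its pages.
    static String extractText(byte[] pdf) throws IOException {
        try (PDDocument doc = PDDocument.load(pdf)) {
            return new PDFTextStripper().getText(doc);
        }
    }
}
```

Calling PdfBoxSketch.extractText(PdfBoxSketch.createPdf("Hello PDFBox")) should give back the same sentence, followed by a line separator. To write to disk instead of memory, doc.save also accepts a file name.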
Alternative frameworks/tools for generating
PDFs in Java:
• iText: newer versions are dual-licensed (AGPL or a paid commercial
license), so they are no longer free to use in closed-source projects.
• FOP: I worked a lot with FOP. It's fairly resource
intensive (Java > XML > XSLT > PDF), and complex PDFs become a
nightmare (they may result in XSLTs with 20k+ LoC).
• PDFBox: it seems to be the best alternative, although I
have not used it in a large project.
More details about this can
be found in my previous blog post at http://nanjundanonlinedictionary.blogspot.co.uk/2012/06/wanted-to-generate-pdf-and-ms-excel.html
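To make the FOP pipeline mentioned above (Java > XML > XSLT > PDF) a little more concrete, here is a sketch of the XML > XSLT stage using only the JDK's javax.xml.transform API. The stylesheet, the invoice/customer element names, and the class name XmlToFoSketch are invented for illustration. The last step, handing the resulting XSL-FO to FOP to render the PDF, is left out because it needs the FOP library.

```java
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import java.io.StringReader;
import java.io.StringWriter;

// Hypothetical example class: turns application XML into XSL-FO markup.
final class XmlToFoSketch {

    // Made-up stylesheet: maps <invoice><customer>..</customer></invoice>
    // to a single-page XSL-FO document with one text block.
    static final String XSLT =
        "<xsl:stylesheet version='1.0'"
      + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
      + " xmlns:fo='http://www.w3.org/1999/XSL/Format'>"
      + "<xsl:template match='/invoice'>"
      + "<fo:root>"
      + "<fo:layout-master-set>"
      + "<fo:simple-page-master master-name='A4'"
      + " page-height='29.7cm' page-width='21cm'>"
      + "<fo:region-body margin='2cm'/>"
      + "</fo:simple-page-master>"
      + "</fo:layout-master-set>"
      + "<fo:page-sequence master-reference='A4'>"
      + "<fo:flow flow-name='xsl-region-body'>"
      + "<fo:block>Invoice for <xsl:value-of select='customer'/></fo:block>"
      + "</fo:flow>"
      + "</fo:page-sequence>"
      + "</fo:root>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Apply the stylesheet to the given XML and return the XSL-FO as a string.
    static String toXslFo(String xml) throws TransformerException {
        Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSLT)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)),
                              new StreamResult(out));
        return out.toString();
    }
}
```

Even this toy stylesheet needs more than a dozen lines of markup for one line of text, which hints at how real-world layouts can grow into very large XSLT files.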
FOP is a highly performance-optimized solution for creating PDFs, but it comes with a few limitations: for example, it does not cover digital signing or split and merge. iText and PDFBox offer almost the same
set of features. Please be aware that iText is more performance-optimized than PDFBox because of the parsing techniques it uses.
Happy Learning,
Nanjundan Chinnasamy