
Thursday, October 15, 2015

Apache PDFBox - A Java PDF Library

I wanted to share what I recently learned about creating PDFs with the PDFBox Java API. The technical details are documented below.

The Apache PDFBox library is an open source Java tool for working with PDF documents. This project allows creation of new PDF documents, manipulation of existing documents and the ability to extract content from documents.
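To make the creation and extraction use cases concrete, here is a minimal sketch that writes one line of text into a new PDF held in memory and then reads it back. It assumes PDFBox 2.0.x is on the classpath (other major versions name some of these calls differently), and the class and method names (PdfRoundTrip, createPdf, extractText) are made up for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;

import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.PDPageContentStream;
import org.apache.pdfbox.pdmodel.font.PDType1Font;
import org.apache.pdfbox.text.PDFTextStripper;

// Hypothetical example class; assumes the PDFBox 2.0.x API.
public class PdfRoundTrip {

    // Builds a one-page PDF in memory containing a single line of text.
    static byte[] createPdf(String message) throws IOException {
        try (PDDocument document = new PDDocument()) {
            PDPage page = new PDPage(); // default page size is US Letter
            document.addPage(page);

            // The content stream must be closed before the document is saved.
            try (PDPageContentStream content = new PDPageContentStream(document, page)) {
                content.beginText();
                content.setFont(PDType1Font.HELVETICA, 12);
                content.newLineAtOffset(72, 720); // one inch from the left, near the top
                content.showText(message);
                content.endText();
            }

            ByteArrayOutputStream out = new ByteArrayOutputStream();
            document.save(out);
            return out.toByteArray();
        }
    }

    // Loads a PDF from bytes and returns its text content.
    static String extractText(byte[] pdfBytes) throws IOException {
        try (PDDocument document = PDDocument.load(pdfBytes)) {
            return new PDFTextStripper().getText(document);
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] pdf = createPdf("Hello from PDFBox");
        System.out.println(extractText(pdf).trim());
    }
}
```

Keeping the document in a byte array is only a convenience for the sketch; document.save also accepts a file or a file name if you want the PDF on disk.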

Features:

Create PDFs:
 Create a PDF from scratch, with embedded fonts and images.

Signing Digitally:
 Sign PDF files.

Print:
 Print a PDF file using the standard Java printing API.

Preflight:
 Validate PDF files against the PDF/A-1b standard.

Fill Forms:
 Extract data from PDF forms or fill a PDF form.

Split & Merge:
 Split a single PDF into many files or merge multiple PDF files.

Extract Text:
 Extract Unicode text from PDF files.

Save as Image:
 Save PDFs as image files, such as PNG or JPEG.
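As an illustration of the split and merge feature listed above, the sketch below splits a small document into single-page PDFs and then merges them back together. It again assumes PDFBox 2.0.x; the class PdfSplitMerge and its helper methods are hypothetical names used only for this example, and blank pages stand in for real content.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.pdfbox.io.MemoryUsageSetting;
import org.apache.pdfbox.multipdf.PDFMergerUtility;
import org.apache.pdfbox.multipdf.Splitter;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;

// Hypothetical example class; assumes the PDFBox 2.0.x API.
public class PdfSplitMerge {

    // Creates an in-memory PDF with the given number of blank pages.
    static byte[] blankPdf(int pages) throws IOException {
        try (PDDocument document = new PDDocument()) {
            for (int i = 0; i < pages; i++) {
                document.addPage(new PDPage());
            }
            return toBytes(document);
        }
    }

    // Splits a PDF into one document per page (the Splitter default).
    static List<byte[]> split(byte[] pdfBytes) throws IOException {
        List<byte[]> parts = new ArrayList<>();
        try (PDDocument source = PDDocument.load(pdfBytes)) {
            // Save each part while the source is still open,
            // because the split documents reuse its pages.
            for (PDDocument part : new Splitter().split(source)) {
                try (PDDocument p = part) {
                    parts.add(toBytes(p));
                }
            }
        }
        return parts;
    }

    // Merges several PDFs, in order, into a single document.
    static byte[] merge(List<byte[]> parts) throws IOException {
        PDFMergerUtility merger = new PDFMergerUtility();
        for (byte[] part : parts) {
            merger.addSource(new ByteArrayInputStream(part));
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        merger.setDestinationStream(out);
        merger.mergeDocuments(MemoryUsageSetting.setupMainMemoryOnly());
        return out.toByteArray();
    }

    static int pageCount(byte[] pdfBytes) throws IOException {
        try (PDDocument document = PDDocument.load(pdfBytes)) {
            return document.getNumberOfPages();
        }
    }

    private static byte[] toBytes(PDDocument document) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        document.save(out);
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        List<byte[]> parts = split(blankPdf(3));
        System.out.println("parts: " + parts.size());
        System.out.println("merged pages: " + pageCount(merge(parts)));
    }
}
```

For large files, working with files on disk rather than byte arrays would likely be the better choice; the in-memory approach just keeps the example self-contained.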

Alternative frameworks/tools for generating PDFs in Java:
 iText: nowadays iText is dual-licensed (AGPL or a commercial license), so the latest versions are no longer free for closed-source use.
 FOP: I have worked a lot with FOP. It is fairly resource intensive (Java > XML > XSLT > PDF), and complex PDFs become a nightmare (they may result in XSLTs with 20k+ LoC).
 PDFBox: it seems to be the best alternative, although I have not used it in a large project.


FOP is a performance-optimized way to generate PDFs, but it has a few limitations: it does not cover features such as digital signing or split and merge. iText and PDFBox offer almost the same set of features. Be aware, though, that iText appears to perform better than PDFBox, owing to the parsing techniques it uses.


Happy Learning,
Nanjundan Chinnasamy
