Introduction

This tutorial shows how to convert a PDF file to image files using Java and the Apache PDFBox API. The example walks through the conversion step by step.

In the 2.0.x line of the pdfbox library (2.0.20 is used here), many methods from the 1.8.x API were removed, including the getAllPages() and convertToImage() methods.

In this example I will show you how to convert a PDF file into image files using both version 1.8.3 and version 2.0.20 of the pdfbox library.

Prerequisites

Knowledge of Java, at least JDK 1.8, Maven 3.6.3 or Gradle 6.4.1, PDFBox 1.8.3 and 2.0.20

Setup Project

Create a Maven or Gradle based standalone project in Eclipse. The name of the project is java-pdf-to-image.

If you are using a Maven based project, you can use the pom.xml file below:

<project xmlns="http://maven.apache.org/POM/4.0.0"
	xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
	xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
	<modelVersion>4.0.0</modelVersion>
	<groupId>com.roytuts</groupId>
	<artifactId>java-pdf-to-image</artifactId>
	<version>0.0.1-SNAPSHOT</version>
	<packaging>jar</packaging>
	
	<properties>
		<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
	</properties>
	
	<dependencies>
		<dependency>
			<groupId>org.apache.pdfbox</groupId>
			<artifactId>pdfbox</artifactId>
			<!-- use either 1.8.3 or 2.0.20 -->
			<version>2.0.20</version>
		</dependency>
	</dependencies>
	
	<build>
		<plugins>
			<plugin>
				<groupId>org.apache.maven.plugins</groupId>
				<artifactId>maven-compiler-plugin</artifactId>
				<version>3.8.1</version>
				<configuration>
					<!-- at least Java 1.8 -->
					<source>1.8</source>
					<target>1.8</target>
				</configuration>
			</plugin>
		</plugins>
	</build>
</project>

If you are using a Gradle based project, you can use the build.gradle script below. You can change the version of pdfbox according to your requirement.

plugins {
    id 'java-library'
}

repositories {
    mavenCentral()
}

dependencies {
    implementation 'org.apache.pdfbox:pdfbox:2.0.20'
}

Java Class

The Java class below converts each page of a PDF file into an image file. The output image files are of PNG type.

If you are using pdfbox 1.8.3, you can use the code below.

package com.roytuts.java.pdf.to.image;

import java.awt.image.BufferedImage;
import java.io.File;
import java.util.List;

import javax.imageio.ImageIO;

import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;

public class ConvertPdfToImage {

	public static void main(String[] args) {
		try {
			String sourceDir = "C:/Desktop/sample.pdf";
			String destinationDir = "C:/Desktop/";

			File sourceFile = new File(sourceDir);
			File destinationFile = new File(destinationDir);

			// Create the destination directory, including any missing parent directories
			if (!destinationFile.exists()) {
				destinationFile.mkdirs();
				System.out.println("Folder Created -> " + destinationFile.getAbsolutePath());
			}

			if (sourceFile.exists()) {
				PDDocument document = PDDocument.load(sourceDir);
				try {
					// getAllPages() returns a raw List in pdfbox 1.8.x
					@SuppressWarnings("unchecked")
					List<PDPage> list = document.getDocumentCatalog().getAllPages();

					String fileName = sourceFile.getName().replace(".pdf", "");

					// Output files are numbered from 1: sample_1.png, sample_2.png, ...
					int pageNumber = 1;
					for (PDPage page : list) {
						BufferedImage image = page.convertToImage();
						File outputFile = new File(destinationFile, fileName + "_" + pageNumber + ".png");
						ImageIO.write(image, "png", outputFile);
						pageNumber++;
					}
				} finally {
					// Close the document even if a page fails to convert
					document.close();
				}

				System.out.println("Image saved at -> " + destinationFile.getAbsolutePath());
			} else {
				System.err.println(sourceFile.getName() + " File does not exist");
			}
		} catch (Exception e) {
			e.printStackTrace();
		}
	}

}

First, we define the source path of the PDF file to read and the destination directory where the converted image files are written.

Next, we create the destination directory if it does not exist.
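The directory step can also be written with java.nio.file, which creates any missing parent directories in one call. The following is only a minimal sketch, not part of the original listings; the class name DestinationDirs and the method ensureDirectory are hypothetical names chosen for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of pdfbox or the listings in this tutorial.
class DestinationDirs {

    // Creates the directory and any missing parents.
    // Does nothing if the directory already exists.
    static Path ensureDirectory(String dir) {
        try {
            return Files.createDirectories(Paths.get(dir));
        } catch (IOException e) {
            // Rethrown unchecked to keep the calling code short in this sketch
            throw new UncheckedIOException(e);
        }
    }
}
```

Calling DestinationDirs.ensureDirectory(destinationDir) before the conversion would take the place of the exists() and mkdir() checks.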

Then we load the PDF file, retrieve all of its pages and, for each page, write an image file to the destination directory.
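Both listings pass each page image to ImageIO.write(), which returns false when no suitable writer is found instead of throwing an exception. The sketch below shows one way to check that return value; the class name PngWriter and the method writePng are hypothetical, and the sketch uses only the JDK so it can be tried without pdfbox.

```java
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;

import javax.imageio.ImageIO;

// Hypothetical helper, not part of pdfbox or the listings in this tutorial.
class PngWriter {

    // Writes the image as a PNG file and fails loudly if nothing was written.
    static void writePng(BufferedImage image, File target) {
        try {
            // ImageIO.write returns false when no writer supports the image
            if (!ImageIO.write(image, "png", target)) {
                throw new IllegalStateException("No PNG writer available for " + target);
            }
        } catch (IOException e) {
            // Rethrown unchecked to keep the calling code short in this sketch
            throw new UncheckedIOException(e);
        }
    }
}
```

Inside the page loop, PngWriter.writePng(image, outputFile) could be used wherever ImageIO.write() is called directly.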

If you are using pdfbox version 2.0.20, you can use the code below:

package com.roytuts.java.pdf.to.image;

import java.awt.image.BufferedImage;
import java.io.File;

import javax.imageio.ImageIO;

import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.rendering.PDFRenderer;

public class PdfToImageConverter {

	public static void main(String[] args) {
		try {
			String destinationDir = "C:\\Users\\Desktop\\pdf-to-image\\";

			File sourceFile = new File("C:\\Users\\Desktop\\sample.pdf");
			File destinationFile = new File(destinationDir);

			// Create the destination directory, including any missing parent directories
			if (!destinationFile.exists()) {
				destinationFile.mkdirs();
				System.out.println("Folder Created -> " + destinationFile.getAbsolutePath());
			}

			if (sourceFile.exists()) {
				// try-with-resources closes the document even if rendering fails
				try (PDDocument document = PDDocument.load(sourceFile)) {
					PDFRenderer pdfRenderer = new PDFRenderer(document);

					String fileName = sourceFile.getName().replace(".pdf", "");

					// renderImage() takes a zero-based page index
					for (int pageIndex = 0; pageIndex < document.getNumberOfPages(); ++pageIndex) {
						BufferedImage bim = pdfRenderer.renderImage(pageIndex);

						// Number the output files from 1, matching the 1.8.3 example
						File outputFile = new File(destinationFile, fileName + "_" + (pageIndex + 1) + ".png");

						ImageIO.write(bim, "png", outputFile);
					}
				}

				System.out.println("Image saved at -> " + destinationFile.getAbsolutePath());
			} else {
				System.err.println(sourceFile.getName() + " File does not exist");
			}
		} catch (Exception e) {
			e.printStackTrace();
		}
	}

}
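PDFRenderer.renderImage() takes a zero-based page index, while readers usually expect file names that start at page 1. Also, String.replace(".pdf", "") removes every occurrence of ".pdf" in the name and is case-sensitive. The sketch below shows one possible way to build the output file name that handles both points; PageFileNames and forPage are hypothetical names, not part of pdfbox.

```java
import java.util.Locale;

// Hypothetical helper, not part of pdfbox or the listings in this tutorial.
class PageFileNames {

    // Builds "<base>_<n>.png" from a PDF file name and a zero-based page index.
    // Only a trailing ".pdf" (in any letter case) is stripped from the name.
    static String forPage(String pdfFileName, int zeroBasedIndex) {
        String base = pdfFileName;
        if (pdfFileName.toLowerCase(Locale.ROOT).endsWith(".pdf")) {
            base = pdfFileName.substring(0, pdfFileName.length() - ".pdf".length());
        }
        // Page numbers shown to the reader start at 1
        return base + "_" + (zeroBasedIndex + 1) + ".png";
    }
}
```

For example, PageFileNames.forPage("sample.pdf", 0) returns "sample_1.png", and PageFileNames.forPage("Report.PDF", 2) returns "Report_3.png".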

Testing the Application

Input PDF file

Output image files

Page 1: [image: pdf to image]

Page 2: [image: pdf to image]

Source Code

download for version 1.8.3

download for version 2.0.20

Thanks for reading.
