How to detect and delete empty or blank rows from excel file using Apache POI in Java

Introduction

In this guide I will show you how to detect and delete empty rows from excel file using Apache POI in Java. I will see also how to read and write excel file using Apache POI. If you have few cells or columns in a row then you can deal with those empty cells but if you have entire row empty or blank then you don’t need to do any activity on the empty row. If you are generating output excel file then also you don’t need to repeat those empty rows in output excel file. You simple skip those empty rows.

Apache POI is the Java based library that is used to create and maintain for manipulating various file formats based upon the Office Open XML standards (OOXML) and Microsoft’s OLE 2 Compound Document format (OLE2). You can read and write MS Excel files using Java. In addition, you can read and write MS Word and MS PowerPoint files using Java.

Here I am going to show you how would you detect empty rows and skip those empty or blank rows and write only the rows which have data into an output excel file.

Prerequisites

Java at least 8, Apache POI 4.1.0 – 5.0.0, Gradle 5.6 – 6.7.1, Maven 3.6.3

Project Setup

Create a gradle or maven based project in your favorite IDE or tool. The name of the project is excel-poi-detect-and-delete-empty-rows.

The default generated build.gradle script has to be modified to include the required Apache POI library.

plugins {
    id 'java-library'
}

sourceCompatibility = 12
targetCompatibility = 12

repositories {
    jcenter()
}

dependencies {
    implementation 'org.apache.poi:poi-ooxml:4.1.0' to 5.0.0
}

If you are creating maven based project then you can use the following pom.xml file:

<?xml version="1.0" encoding="UTF-8"?>

<project xmlns="http://maven.apache.org/POM/4.0.0"
	xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
	xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
	<modelVersion>4.0.0</modelVersion>

	<groupId>com.roytuts</groupId>
	<artifactId>excel-poi-detect-and-delete-empty-rows</artifactId>
	<version>0.0.1-SNAPSHOT</version>

	<properties>
		<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
		<maven.compiler.source>12</maven.compiler.source>
		<maven.compiler.target>12</maven.compiler.target>
		<poi.version>5.0.0</poi.version>
	</properties>

	<dependencies>
		<!-- apache poi for xlsx, docx etc reading/writing -->
		<dependency>
			<groupId>org.apache.poi</groupId>
			<artifactId>poi-ooxml</artifactId>
			<version>${poi.version}</version>
		</dependency>
	</dependencies>

	<build>
		<plugins>
			<plugin>
				<groupId>org.apache.maven.plugins</groupId>
				<artifactId>maven-compiler-plugin</artifactId>
				<version>3.8.1</version>
			</plugin>
		</plugins>
	</build>
</project>

Excel File

I will create an excel file which has few empty or blank rows. You may also need to deal with blank or empty cells or columns using Apache POI library.

The following image shows the empty rows as highlighted with red box in excel file:

detect and delete empty or blank rows from excel file using apache poi in java

You can download the sample input excel file from the download section at the end of the tutorial.

Delete Empty Rows from Excel File

Now I will see how to read the excel file, how to detect empty or blank rows, how to deal with different cell types and how to write to output excel file.

package com.roytuts.excel.poi.detect.and.delete.empty.rows;

import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.ArrayList;
import java.util.List;

import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.DataFormatter;
import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.ss.usermodel.Sheet;
import org.apache.poi.ss.usermodel.Workbook;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;

public final class ExcelHandler {

	private ExcelHandler() {
	}

	public static List<String[]> extractInfo(File file) {
		Workbook wb = null;

		List<String[]> infoList = new ArrayList<String[]>();

		try {
			wb = new XSSFWorkbook(new FileInputStream(file));

			Sheet sheet = wb.getSheetAt(0);

			for (Row row : sheet) {
				if (isRowEmpty(row)) {
					continue;
				}

				List<Cell> cells = new ArrayList<Cell>();

				int lastColumn = Math.max(row.getLastCellNum(), 5);

				for (int cn = 0; cn < lastColumn; cn++) {
					Cell c = row.getCell(cn, Row.MissingCellPolicy.RETURN_BLANK_AS_NULL);
					cells.add(c);
				}

				String[] cellvalues = extractInfoFromCell(cells);
				infoList.add(cellvalues);
			}
		} catch (IOException e) {
			e.printStackTrace();
		} finally {
			if (wb != null) {
				try {
					wb.close();
				} catch (IOException e) {
					e.printStackTrace();
				}
			}
		}

		return infoList;
	}

	private static String[] extractInfoFromCell(List<Cell> cells) {
		String[] cellValues = new String[5];

		cellValues[0] = getCellValue(cells.get(0));

		cellValues[1] = getCellValue(cells.get(1));

		cellValues[2] = getCellValue(cells.get(2));

		cellValues[3] = getCellValue(cells.get(3));

		cellValues[4] = getCellValue(cells.get(4));

		return cellValues;
	}

	private static String getCellValue(Cell cell) {
		String val = "";

		switch (cell.getCellType()) {
		case NUMERIC:
			val = String.valueOf(cell.getNumericCellValue());
			break;
		case STRING:
			val = cell.getStringCellValue();
			break;
		case BLANK:
			break;
		case BOOLEAN:
			val = String.valueOf(cell.getBooleanCellValue());
			break;
		case ERROR:
			break;
		case FORMULA:
			break;
		case _NONE:
			break;
		default:
			break;
		}

		return val;
	}

	private static boolean isRowEmpty(Row row) {
		boolean isEmpty = true;
		DataFormatter dataFormatter = new DataFormatter();

		if (row != null) {
			for (Cell cell : row) {
				if (dataFormatter.formatCellValue(cell).trim().length() > 0) {
					isEmpty = false;
					break;
				}
			}
		}

		return isEmpty;
	}

	public static void writeToExcel(List<String[]> cellValues, File outputFile) throws IOException {
		Workbook wb = new XSSFWorkbook();

		OutputStream outputStream = new FileOutputStream(outputFile);

		Sheet sheet = wb.createSheet();

		int rows = cellValues.size();
		int cells = cellValues.get(0).length;

		for (int i = 0; i < rows; i++) {
			Row row = sheet.createRow(i);

			for (int j = 0; j < cells; j++) {
				Cell cell = row.createCell(j);
				cell.setCellValue(cellValues.get(i)[j]);
			}
		}

		wb.write(outputStream);
		wb.close();
	}
}

In the above class we read excel, handle cell value and extract the cell value finally. The most important method is isRowEmpty(Row row), which takes a single excel row object as an argument and detects whether the row is empty or not. If row is empty we simply skip it rather than extracting the cell value.

Finally I write the cell values to a different excel sheet in a file using the method writeToExcel().

Main Class

I need to create a main class to test our application whether it gives us correct output or not.

package com.roytuts.excel.poi.detect.and.delete.empty.rows;

import java.io.File;
import java.io.IOException;
import java.util.List;

import org.apache.poi.openxml4j.exceptions.InvalidFormatException;

public class ExcelCellInfo {

	public static void main(String[] args) throws InvalidFormatException, IOException {
		List<String[]> cellValues = ExcelHandler.extractInfo(new File("C:/jee_workspace/info.xlsx"));

		cellValues.forEach(c -> System.out.println(c[0] + ", " + c[1] + ", " + c[2] + ", " + c[3] + ", " + c[4]));

		ExcelHandler.writeToExcel(cellValues, new File("C:/jee_workspace/deleted_empty_rows_info.xlsx"));
	}

}

Inside the main method I first get the cell values from the input excel file. In the next line I iterate through those cell values and print on console.

Finally I write to the output excel file from the cell values.

Testing the Application

When you run the above class you will get output on console as shown below:

Name, Mobile, Phone, Permanent Address, Communication Address
Loku, 9.433578459E9, 3.3456708921E10, Birati, Kolkata
Sumit, 9.836547485E9, 7.5436789213E10, Dankuni, Kolkata
Souvik, 9.051478954E9, 3.3983456789E10, Kalighat, Kolkata
Liton, 8.754902345E9, 3.3457849651E10, Siliguri, Kolkata

In the above output you will see wired output for mobile and phone number. Actually numeric values are represented in exponential format in excel. You have to tweak a bit to make those cells as string in excel to get the exact output.

The generated output file you will with the following data:

detect and delete empty or blank rows from excel file using apache poi in java

So your finally generated excel file has no blank rows.

Hope you got idea how to detect and delete empty or blank rows from excel file using Java.

Source Code

Download

Leave a Reply

Your email address will not be published. Required fields are marked *