Spring Data JPA Batch Insertion

Introduction

The example, Spring Data JPA Batch Insertion, will show you how you can insert a large dataset into a database at once using Spring Data JPA. For this tutorial I will create a Spring Boot project in Eclipse. I will also see how Spring @Transactional annotation works. Spring transaction required in order to rollback the inserted data at any point if your application fails for any reason.

Related Posts:

Sometimes you need to insert or update a large number of records in the database. It’s not a good idea to insert multiple records into database one by one in a traditional approach. It will hit the application’s performance.

Spring provides batch operations with the help of JpaRepository or CrudRepository, which inserts or updates records into database in one shot.

You can also use JDBC API to insert multiple records or batch insertion into database but here I will use Spring JPA’s built-in functionality to get benefits of Spring API.

By default Spring does not save your data into database when you call saveAll() method with multiple entities (a list of objects) passed as argument, hence you have to save entities one by one, which is time consuming and performance gets affected. For this there are few properties that need to be configured to let Spring Data JPA work on batch insertion into database. I will see how these properties set during creation of database configuration class below.

Prerequisites

At least JDK 1.8, Maven 3.6.3, Gradle 6.1.1 – 6.7.1, Spring Boot 2.2.6 – 2.4.3, MySQL 8.0.17 – 8.0.22

MySQL Table

I will create a table called employee under roytuts database in MySQL server.

CREATE TABLE IF NOT EXISTS `employee` (
  `id` int unsigned COLLATE utf8mb4_unicode_ci NOT NULL AUTO_INCREMENT,
  `name` varchar(40) COLLATE utf8mb4_unicode_ci NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=4 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;

Project Setup

Now I will see the below steps how to create a gradle based spring project in your favorite tool or IDE to work on example Spring Data JPA Batch Insertion. You can also create maven based project in Eclipse. The name of the project is springdata-jpa-batch-insertion.

In this file notice I have applied required plugins and added required dependencies, such as spring-boot-starter-web, spring-boot-starter-data-jpa and oracle jdbc driver to interact with Java and database API.

If you are creating gradle based project then use below build.gradle script:

buildscript {
	ext {
		springBootVersion = '2.2.6.RELEASE' to 2.4.3
	}
	
    repositories {
    	mavenLocal()
    	mavenCentral()
    }
    
    dependencies {
    	classpath("org.springframework.boot:spring-boot-gradle-plugin:${springBootVersion}")
    }
}

plugins {
    id 'java-library'
    id 'org.springframework.boot' version "${springBootVersion}"
}

sourceCompatibility = 12
targetCompatibility = 12

repositories {
	mavenLocal()
    mavenCentral()
}

dependencies {
	implementation("org.springframework.boot:spring-boot-starter-web:${springBootVersion}")
	implementation("org.springframework.boot:spring-boot-starter-data-jpa:${springBootVersion}")
	implementation('mysql:mysql-connector-java:8.0.17')
	
	//required for jdk 9 or above
	runtimeOnly('javax.xml.bind:jaxb-api:2.4.0-b180830.0359')
}

If you are creating maven based project, then you can use below pom.xml file:

<?xml version="1.0" encoding="UTF-8"?>

<project xmlns="http://maven.apache.org/POM/4.0.0"
	xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
	xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
	<modelVersion>4.0.0</modelVersion>

	<groupId>com.roytuts</groupId>
	<artifactId>spring-data-jpa-batch-insertion</artifactId>
	<version>0.0.1-SNAPSHOT</version>

	<properties>
		<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
		<maven.compiler.source>12</maven.compiler.source>
		<maven.compiler.target>12</maven.compiler.target>
	</properties>

	<parent>
		<groupId>org.springframework.boot</groupId>
		<artifactId>spring-boot-starter-parent</artifactId>
		<version>2.2.6 to 2.4.3</version>
	</parent>

	<dependencies>
		<dependency>
			<groupId>org.springframework.boot</groupId>
			<artifactId>spring-boot-starter-web</artifactId>
		</dependency>

		<dependency>
			<groupId>org.springframework.boot</groupId>
			<artifactId>spring-boot-starter-data-jpa</artifactId>
		</dependency>

		<dependency>
			<groupId>mysql</groupId>
			<artifactId>mysql-connector-java</artifactId>
		</dependency>

		<dependency>
			<groupId>javax.xml.bind</groupId>
			<artifactId>jaxb-api</artifactId>
			<scope>runtime</scope>
		</dependency>
	</dependencies>

	<build>
		<plugins>
			<plugin>
				<groupId>org.springframework.boot</groupId>
				<artifactId>spring-boot-maven-plugin</artifactId>
			</plugin>
		</plugins>
	</build>
</project>

Create application.properties

Below is the application.properties file under classpath directory src/main/resources and you need to define database credentials to establish connection with database.

Also if you do not want to run server on default port then you may want to specify the server port using server.port key.

Here in Spring Data JPA Batch Insertion example, I am going to use MySQL database but you may use any database as per your requirements.

#datasource
spring.datasource.driverClassName=com.mysql.cj.jdbc.Driver
spring.datasource.url=jdbc:mysql://localhost/roytuts
spring.datasource.username=root
spring.datasource.password=root

#server.port=9999

#disable schema generation from Hibernate
spring.jpa.hibernate.ddl-auto=none

Configuration Class

Below is the configuration class that will be used to define various database related beans such as DataSource, EntityManagerFactory etc.

As you know JPA is a specification or Interface and someone has to provide its implementation, so here I am using Hibernate as an implementation of JPA API.

As I have application.properties file in classpath (under src/main/resources), so I don’t need to load the properties file.

I have let Spring know where our Spring Data JPA Repository interfaces using the annotation @EnableJpaRepositories and I have also let Spring know where to look for Entity classes using the setter method factory.setPackagesToScan("<package name>").

By default Spring does not work when you want to insert multiple records or entities using saveAll() method of JpaRepository or CrudRepository and that’s why you need to set few properties into LocalContainerEntityManagerFactoryBean as shown below in entityManagerFactory() method. You can change the batch size, here I have put 500.

By default Spring transaction works out of the box so you may not need to annotate the configuration class with @EnableTransactionManagement.

package com.roytuts.spring.data.jpa.batch.insertion.config;

import java.util.Properties;

import javax.persistence.EntityManagerFactory;
import javax.sql.DataSource;

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.core.env.Environment;
import org.springframework.data.jpa.repository.config.EnableJpaRepositories;
import org.springframework.jdbc.datasource.DriverManagerDataSource;
import org.springframework.orm.jpa.LocalContainerEntityManagerFactoryBean;
import org.springframework.orm.jpa.vendor.Database;
import org.springframework.orm.jpa.vendor.HibernateJpaVendorAdapter;
import org.springframework.transaction.annotation.EnableTransactionManagement;

@Configuration
@EnableTransactionManagement
@EnableJpaRepositories(basePackages = "com.roytuts.spring.data.jpa.batch.insertion.repository")
public class DbConfig {

	@Autowired
	private Environment environment;

	@Bean
	public DataSource dataSource() {
		DriverManagerDataSource ds = new DriverManagerDataSource();
		ds.setDriverClassName(environment.getRequiredProperty("spring.datasource.driverClassName"));
		ds.setUrl(environment.getRequiredProperty("spring.datasource.url"));
		ds.setUsername(environment.getRequiredProperty("spring.datasource.username"));
		ds.setPassword(environment.getRequiredProperty("spring.datasource.password"));
		return ds;
	}

	@Bean
	public EntityManagerFactory entityManagerFactory(DataSource dataSource) {
		HibernateJpaVendorAdapter vendorAdapter = new HibernateJpaVendorAdapter();
		vendorAdapter.setDatabase(Database.MYSQL);
		// vendorAdapter.setDatabasePlatform("org.hibernate.dialect.MySQL8Dialect");

		// Use these properties to let spring work on batch insertion
		Properties jpaProperties = new Properties();
		jpaProperties.put("hibernate.jdbc.batch_size", 500);
		jpaProperties.put("hibernate.order_inserts", true);
		jpaProperties.put("hibernate.order_updates", true);

		LocalContainerEntityManagerFactoryBean factory = new LocalContainerEntityManagerFactoryBean();
		factory.setJpaVendorAdapter(vendorAdapter);
		factory.setJpaProperties(jpaProperties);
		factory.setPackagesToScan("com.roytuts.spring.data.jpa.batch.insertion.entity");
		factory.setDataSource(dataSource);
		factory.afterPropertiesSet();
		return factory.getObject();
	}

}

Entity Class

This is the entity class that maps Java object to database table. This entity class represents a single row in database table.

When you want to save multiple rows in database table then you pass a list of entity objects to JpaRepository or CrudRepository’s saveAll() method in order to save multiple entities or objects and this basically happens through Spring Data JPA Batch Insertion configuration. I also save single object using the save() method.

package com.roytuts.entity;
import java.io.Serializable;
import javax.persistence.Column;
import javax.persistence.Entity;
import javax.persistence.Id;
import javax.persistence.Table;
@Entity
@Table(name = "EMPLOYEE")
public class Employee implements Serializable {
	private static final long serialVersionUID = 1L;
	@Id
	@Column(name = "EMPLOYEE_ID")
	private Integer empId;
	@Column(name = "EMPLOYEE_NAME")
	private String empName;
	//getters and setters
}

Spring Data JPA Repository

Here is the Spring Data JPA Repository interface. Here JpaRepository takes two parameters Employee object, i.e., entity object and primary key, i.e., Integer. You may have Long, String or any other class object as a primary key as well.

package com.roytuts.spring.data.jpa.batch.insertion.repository;

import org.springframework.data.jpa.repository.JpaRepository;

import com.roytuts.spring.data.jpa.batch.insertion.entity.Employee;

public interface EmployeeRepository extends JpaRepository<Employee, Long> {

}

Service Class

This is the service class that interacts with data layer as well as controller layer and acts as a mediator between them. This class generally handles all business logic.

In this service I will show you the example on Spring Data JPA Batch Insertion. Here notice how I am determining when to insert into database.

Here I iterate through list of employee objects and add to temporary Employee array list. Once I find counter equals to batch size(500) then I save those entity objects and at the same time I also clear the temp list because I don’t need those records in temp list any more.

Notice I have used @Transactional annotation in order to support Spring’s transaction management to rollback database insertion at any point of failures.

package com.roytuts.spring.data.jpa.batch.insertion.service;

import java.util.ArrayList;
import java.util.List;

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Transactional;

import com.roytuts.spring.data.jpa.batch.insertion.entity.Employee;
import com.roytuts.spring.data.jpa.batch.insertion.repository.EmployeeRepository;

@Service
public class EmployeeService {

	@Autowired
	private EmployeeRepository employeeRepository;

	@Transactional
	public void saveEmployees(List<Employee> employees) {
		int size = employees.size();
		int counter = 0;

		List<Employee> temp = new ArrayList<>();

		for (Employee emp : employees) {
			temp.add(emp);

			if ((counter + 1) % 500 == 0 || (counter + 1) == size) {
				employeeRepository.saveAll(temp);
				temp.clear();
			}

			counter++;
		}
	}

}

Spring REST Controller

The Spring REST Controller class is resposible for handling requests and responses from clients. This holds all the REST services end-points. Using these end-points I would be able to get the JSON response.

Here I have only one end-point called /employees/save that saves a list of employees into database when you hit the URL http://localhost:8080/employees/save from REST client or Postman with a list of employee objects in JSON format as a body parameter.

package com.roytuts.spring.data.jpa.batch.insertion.controller;

import java.util.List;

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.http.HttpStatus;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RestController;

import com.roytuts.spring.data.jpa.batch.insertion.entity.Employee;
import com.roytuts.spring.data.jpa.batch.insertion.service.EmployeeService;

@RestController
public class EmployeeRestController {

	@Autowired
	private EmployeeService employeeService;

	@PostMapping("/employees/save")
	public ResponseEntity<Void> saveEmployees(@RequestBody List<Employee> employees) {
		employeeService.saveEmployees(employees);
		return new ResponseEntity<Void>(HttpStatus.OK);
	}

}

Spring Boot Main Class

Here is the application main class that is enough to start up the application in Spring Boot.

package com.roytuts.spring.data.jpa.batch.insertion;

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;

@SpringBootApplication
public class SpringDataJpabatchInsertionApp {

	public static void main(String[] args) {
		SpringApplication.run(SpringDataJpabatchInsertionApp.class, args);
	}

}

Testing the Application

Once you run the above main class and application gets start up, hit the URL http://localhost:8080/employees/save from REST client or Postman with list of employee objects as JSON and you should get the JSON response with OK.

The below youtube video shows the testing result.

That’s all. Hope you got an idea on Spring Data JPA Batch Insertion.

Source Code

Download

2 thoughts on “Spring Data JPA Batch Insertion

  1. Great Tutorial,
    Can you please tell me how to achieve the same functionality if i receive messages one by one (say from Azure service bus like around 500k) instead of a List?

Leave a Reply

Your email address will not be published. Required fields are marked *