Spring Batch


  • Spring Batch is a lightweight, comprehensive batch framework
  • It is designed to enable the development of robust batch applications
  • It builds on the productive, POJO-based development approach of the Spring Framework
  • Spring Batch is not a scheduling framework
  • It is intended to work in conjunction with a scheduler, not to replace one

Usages of Spring Batch

  • used to perform business operations in mission-critical environments
  • used to automate complex processing of large volumes of data without user interaction
  • processes time-based events and periodic, repetitive, complex work on large data sets
  • used to integrate internal and external information that requires formatting, validation and processing in a transactional manner
  • used to run jobs in parallel or concurrently
  • provides manual or scheduled restart after a failure

Guidelines for using Spring Batch

  • avoid building complex logical structures in a single batch application
  • keep your data close to where the batch processing occurs
  • minimize the use of system resources, especially I/O, by performing operations in internal memory wherever possible
  • cache data after the first read from the database in each transaction, and read from the cache from then on
  • avoid unnecessary table or index scans in the database
  • be specific when retrieving data from the database: select only the required fields, specify a WHERE clause in the SQL statement, and so on
  • avoid doing the same work more than once within a batch run
  • allocate enough memory before the batch process starts, because reallocating memory during the run is time-consuming
  • check and validate data consistently to maintain data integrity
  • implement checksums for internal validation wherever possible
  • run stress tests at an early stage in a production-like environment
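
The checksum guideline above can be sketched in a few lines of Java. This is only an illustration, assuming the records are plain strings. The names BatchChecksum and checksum are made up for this example, and CRC32 from java.util.zip is just one possible choice of checksum.

```java
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.zip.CRC32;

// Hypothetical helper, not part of Spring Batch.
class BatchChecksum {

    /** Returns a CRC32 checksum computed over all records, in order. */
    public static long checksum(List<String> records) {
        CRC32 crc = new CRC32();
        for (String record : records) {
            crc.update(record.getBytes(StandardCharsets.UTF_8));
            // Add a separator so that ["ab", "c"] and ["a", "bc"] differ.
            crc.update('\n');
        }
        return crc.getValue();
    }
}
```

One way to use it is to compute the checksum when the records are written and again after they are read back. If the two values differ, some record was changed, lost or reordered along the way.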

For more information on the theory, see http://docs.spring.io/spring-batch/trunk/reference/html/spring-batch-intro.html and http://spring.io/guides/gs/batch-processing/

Now let's see how it works with an example.

What we will do

We’ll build a service that imports data from a CSV spreadsheet, transforms it with custom code, and stores the final results in another CSV spreadsheet. You can also store the data in a database or any other persistent storage.
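
Before bringing in Spring Batch, it may help to picture the same read, transform, write flow in plain Java. The sketch below is not how Spring Batch does it: it loads the whole file into memory, whereas Spring Batch reads and writes in chunks. The class name CsvPipelineSketch and the method run are invented for this illustration, and the code assumes Java 8 or later.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

// Hypothetical plain-Java illustration, not Spring Batch code.
class CsvPipelineSketch {

    /**
     * Reads every line of the input file, applies the transform to it,
     * and writes the results to the output file.
     * Returns the number of lines written.
     */
    static int run(Path input, Path output, UnaryOperator<String> transform)
            throws IOException {
        List<String> lines = Files.readAllLines(input, StandardCharsets.UTF_8);
        List<String> results = new ArrayList<>();
        for (String line : lines) {
            results.add(transform.apply(line));
        }
        Files.write(output, results, StandardCharsets.UTF_8);
        return results.size();
    }
}
```

Spring Batch gives you the same three roles as separate, reusable pieces (a reader, a processor and a writer), and adds features such as restart after failure on top.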


Prerequisites

Any Java-based IDE
JDK 1.6+
Maven 3.0+

Step 1. Create a Maven project (standalone or quickstart) in the Eclipse IDE. The necessary project structure is created for you.

Step 2. Modify the pom.xml file to add the Spring Batch dependencies. Maven then downloads the required jars from the Maven repository.


Step 3. Create a business class, User.java, which represents a row of data for inputs and outputs. You can instantiate the User class either with a name and email through a constructor, or by setting the properties.
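
A minimal sketch of such a class could look like the following. The field names name and email come from the description above. The toString method is an extra added here only to make debugging output readable.

```java
/** One row of the CSV data: a user's name and email address. */
class User {

    private String name;
    private String email;

    /** No-argument constructor, for setting the properties afterwards. */
    public User() {
    }

    /** Convenience constructor that sets both properties at once. */
    public User(String name, String email) {
        this.name = name;
        this.email = email;
    }

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    public String getEmail() {
        return email;
    }

    public void setEmail(String email) {
        this.email = email;
    }

    @Override
    public String toString() {
        return "User [name=" + name + ", email=" + email + "]";
    }
}
```

The no-argument constructor matters if a framework or mapper creates the object first and fills in the properties through the setters.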

Step 4. Create an intermediate processor. A common paradigm in batch processing is to ingest data, transform it, and then pipe it out somewhere else. Here we write a simple transformer that converts the names to uppercase and changes the email domain.

UserItemProcessor implements Spring Batch’s ItemProcessor interface. This makes it easy to wire the code into a batch job that we define further down in this guide. As the interface requires, the processor receives an incoming User object, converts its name to uppercase, and replaces the email domain with roytuts.com.
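
The processor described above might be sketched as follows. To keep the example compilable without any dependencies, it declares a local stand-in interface with the same shape as org.springframework.batch.item.ItemProcessor, plus a cut-down User class. In the real project you would drop both and import the Spring Batch interface and your own User.java instead. Returning a new User rather than changing the incoming one, and leaving an email without an @ sign untouched, are choices made for this sketch.

```java
import java.util.Locale;

// Local stand-in for org.springframework.batch.item.ItemProcessor,
// declared here only so the sketch compiles on its own.
interface ItemProcessor<I, O> {
    O process(I item) throws Exception;
}

// Cut-down version of the User class from Step 3.
class User {
    private final String name;
    private final String email;

    User(String name, String email) {
        this.name = name;
        this.email = email;
    }

    String getName() { return name; }
    String getEmail() { return email; }
}

class UserItemProcessor implements ItemProcessor<User, User> {

    private static final String NEW_DOMAIN = "roytuts.com";

    @Override
    public User process(User user) {
        // Convert the name to uppercase, independent of the default locale.
        String name = user.getName().toUpperCase(Locale.ROOT);

        // Replace everything after the last '@' with the new domain.
        String email = user.getEmail();
        int at = email.lastIndexOf('@');
        if (at >= 0) {
            email = email.substring(0, at + 1) + NEW_DOMAIN;
        }
        return new User(name, email);
    }
}
```

For example, a User with the name Alice and the email alice@example.com comes out with the name ALICE and the email alice@roytuts.com.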

Step 5. Now we will write a batch job. We use the @EnableBatchProcessing annotation, which sets up the batch infrastructure for us. In this example the batch data is kept in memory, meaning that when processing is done, the data is gone.

Step 6. This batch processing can also be embedded in web apps, but here we will create a main method to run the application. You can also build an executable jar from it.

Step 7. Create CSVUtils.java and UserFieldSetMapper.java



That’s all. Thanks for reading.
