This tutorial shows an example of Spring Batch with Spring's TaskScheduler. It schedules a task that repeatedly reads data from a CSV file, modifies it, and writes the result to an XML file. For background on what Spring Batch is and where it is used, see the Spring Batch tutorial.

Prerequisites

Eclipse, Gradle 4.10.2, JDK 1.8

Example with Source Code

We’ll build a service that imports data from a CSV file, transforms it with custom code, and stores the final results in an XML file. The same job is then scheduled to run repeatedly using Spring's TaskScheduler.

Creating Project

Create a Gradle-based project in the Eclipse IDE; the required project structure is created for you.

Updating Build Script

Modify the build.gradle file to include all required dependencies so that it looks like the script below. Gradle downloads the required jars from the configured Maven repositories (mavenLocal and mavenCentral).

buildscript {
	ext {
		springBootVersion = '2.1.4.RELEASE'
	}
    repositories {
    	mavenLocal()
    	mavenCentral()
    }
    dependencies {
    	classpath("org.springframework.boot:spring-boot-gradle-plugin:${springBootVersion}")
    }
}
apply plugin: 'java'
apply plugin: 'org.springframework.boot'
sourceCompatibility = 1.8
targetCompatibility = 1.8
repositories {
	mavenLocal()
    mavenCentral()
}
dependencies {
	compile("org.springframework.boot:spring-boot-starter-batch:${springBootVersion}")
	compile("org.springframework:spring-oxm:5.1.7.RELEASE")
	compile('mysql:mysql-connector-java:5.1.13')
}

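In the above build script we have added spring-oxm, which provides Spring's JAXB marshalling support for generating an XML file from a Java POJO class. The mysql-connector-java dependency provides the JDBC driver used by the DataSource bean in the configuration class.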

Creating VO Class

Create a model class Person.java which represents a row of data for inputs and outputs. The class carries JAXB annotations so that a Person object can be converted to XML directly.

package com.roytuts.spring.batch.vo;
import javax.xml.bind.annotation.XmlAttribute;
import javax.xml.bind.annotation.XmlElement;
import javax.xml.bind.annotation.XmlRootElement;
@XmlRootElement(name = "person")
public class Person {
	private int id;
	private String firstName;
	private String lastName;
	@XmlAttribute(name = "id")
	public int getId() {
		return id;
	}
	public void setId(int id) {
		this.id = id;
	}
	@XmlElement(name = "firstName")
	public String getFirstName() {
		return firstName;
	}
	public void setFirstName(String firstName) {
		this.firstName = firstName;
	}
	@XmlElement(name = "lastName")
	public String getLastName() {
		return lastName;
	}
	public void setLastName(String lastName) {
		this.lastName = lastName;
	}
	@Override
	public String toString() {
		return "Person [id=" + id + ", firstName=" + firstName + ", lastName=" + lastName + "]";
	}
}

Creating FieldSetMapper Class

Create the mapper class below, which maps each row of the CSV file to a Person object.

package com.roytuts.spring.batch.fieldset.mapper;
import org.springframework.batch.item.file.mapping.FieldSetMapper;
import org.springframework.batch.item.file.transform.FieldSet;
import com.roytuts.spring.batch.vo.Person;
public class PersonFieldSetMapper implements FieldSetMapper<Person> {
	@Override
	public Person mapFieldSet(FieldSet fieldSet) {
		Person person = new Person();
		person.setId(fieldSet.readInt(0));
		person.setFirstName(fieldSet.readString(1));
		person.setLastName(fieldSet.readString(2));
		return person;
	}
}
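
To make the mapping step concrete without any Spring classes, here is a minimal plain-Java sketch of what tokenizing and mapping one CSV line amounts to. The class CsvLineSketch and its nested Row type are hypothetical stand-ins (not part of the tutorial's source), and the simple split on commas is an assumption that ignores quoted fields, which Spring Batch's DelimitedLineTokenizer does handle.

```java
public class CsvLineSketch {

    // Hypothetical stand-in for the tutorial's Person class, kept minimal on purpose.
    static final class Row {
        final int id;
        final String firstName;
        final String lastName;

        Row(int id, String firstName, String lastName) {
            this.id = id;
            this.firstName = firstName;
            this.lastName = lastName;
        }
    }

    // Splits one line on commas and maps the three fields by position,
    // mirroring readInt(0), readString(1) and readString(2) in the mapper above.
    static Row parse(String line) {
        String[] fields = line.split(",", -1);
        if (fields.length != 3) {
            throw new IllegalArgumentException("Expected 3 fields but got " + fields.length + ": " + line);
        }
        return new Row(Integer.parseInt(fields[0].trim()), fields[1].trim(), fields[2].trim());
    }

    public static void main(String[] args) {
        Row row = parse("1000,soumitra,roy");
        System.out.println(row.id + " " + row.firstName + " " + row.lastName);
    }
}
```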

Creating ItemProcessor Class

Create an intermediate processor. A common paradigm in batch processing is to ingest data, transform it, and then pipe it out somewhere else. Here we write a simple transformer that converts the initial characters of the names to uppercase.

package com.roytuts.spring.batch.itemprocessor;
import org.springframework.batch.item.ItemProcessor;
import com.roytuts.spring.batch.vo.Person;
public class PersonItemProcessor implements ItemProcessor<Person, Person> {
	@Override
	public Person process(Person person) throws Exception {
		System.out.println("Processing: " + person);
		final String initCapFirstName = person.getFirstName().substring(0, 1).toUpperCase()
				+ person.getFirstName().substring(1);
		final String initCapLastName = person.getLastName().substring(0, 1).toUpperCase()
				+ person.getLastName().substring(1);
		Person transformedPerson = new Person();
		transformedPerson.setId(person.getId());
		transformedPerson.setFirstName(initCapFirstName);
		transformedPerson.setLastName(initCapLastName);
		return transformedPerson;
	}
}
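
The processor above calls substring(0, 1) directly, so a null or empty name would make it throw. A minimal sketch of a more defensive helper follows; the class name NameCapitalizer and the method initCap are assumptions introduced here for illustration, not part of the original source.

```java
import java.util.Locale;

public class NameCapitalizer {

    // Returns the value with its first character upper-cased.
    // Null and empty inputs are returned unchanged instead of throwing.
    static String initCap(String value) {
        if (value == null || value.isEmpty()) {
            return value;
        }
        // Locale.ROOT keeps the result independent of the JVM's default locale.
        return value.substring(0, 1).toUpperCase(Locale.ROOT) + value.substring(1);
    }

    public static void main(String[] args) {
        System.out.println(initCap("soumitra")); // prints "Soumitra"
        System.out.println(initCap("roy"));      // prints "Roy"
    }
}
```

The processor could call such a helper for both the first name and the last name instead of repeating the substring logic.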

Creating CSV File

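Create the CSV file below as person.csv under the src/main/resources directory (the reader in the configuration class loads it from the classpath under that name).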

1000,soumitra,roy
1001,souvik,sanyal
1002,arup,chatterjee
1003,suman,mukherjee
1004,debina,guha
1005,liton,sarkar
1006,debabrata,poddar

Creating Configuration Class

We create a Spring configuration class to define the beans needed for Spring Batch processing: ItemProcessor, TransactionManager, JobRepository, DataSource, JobLauncher, the reader and the writer, Step and Job. Note that the writer defined in this listing is a JdbcBatchItemWriter, which inserts each Person into a person table in MySQL.

package com.roytuts.spring.batch.config;
import javax.sql.DataSource;
import org.springframework.batch.core.Job;
import org.springframework.batch.core.Step;
import org.springframework.batch.core.configuration.annotation.EnableBatchProcessing;
import org.springframework.batch.core.configuration.annotation.JobBuilderFactory;
import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
import org.springframework.batch.core.launch.JobLauncher;
import org.springframework.batch.core.launch.support.RunIdIncrementer;
import org.springframework.batch.core.launch.support.SimpleJobLauncher;
import org.springframework.batch.core.repository.JobRepository;
import org.springframework.batch.core.repository.support.JobRepositoryFactoryBean;
import org.springframework.batch.item.ItemProcessor;
import org.springframework.batch.item.ItemReader;
import org.springframework.batch.item.ItemWriter;
import org.springframework.batch.item.database.BeanPropertyItemSqlParameterSourceProvider;
import org.springframework.batch.item.database.JdbcBatchItemWriter;
import org.springframework.batch.item.file.FlatFileItemReader;
import org.springframework.batch.item.file.mapping.BeanWrapperFieldSetMapper;
import org.springframework.batch.item.file.mapping.DefaultLineMapper;
import org.springframework.batch.item.file.transform.DelimitedLineTokenizer;
import org.springframework.batch.support.DatabaseType;
import org.springframework.batch.support.transaction.ResourcelessTransactionManager;
import org.springframework.beans.factory.config.BeanDefinition;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.context.annotation.Scope;
import org.springframework.core.io.ClassPathResource;
import org.springframework.jdbc.datasource.DriverManagerDataSource;
import org.springframework.jdbc.datasource.init.DatabasePopulatorUtils;
import org.springframework.jdbc.datasource.init.ResourceDatabasePopulator;
import com.roytuts.spring.batch.itemprocessor.PersonItemProcessor;
import com.roytuts.spring.batch.vo.Person;
@Configuration
@EnableBatchProcessing
public class SpringBatchConfig {
	@Bean
	@Scope(value = BeanDefinition.SCOPE_PROTOTYPE)
	public Person person() {
		return new Person();
	}
	@Bean
	@Scope(value = BeanDefinition.SCOPE_PROTOTYPE)
	public ItemProcessor<Person, Person> itemProcessor() {
		return new PersonItemProcessor();
	}
	@Bean
	public DataSource dataSource() {
		DriverManagerDataSource dataSource = new DriverManagerDataSource();
		dataSource.setDriverClassName("com.mysql.jdbc.Driver");
		dataSource.setUrl("jdbc:mysql://localhost:3306/roytuts");
		dataSource.setUsername("root");
		dataSource.setPassword("");
		ResourceDatabasePopulator databasePopulator = new ResourceDatabasePopulator();
		databasePopulator.addScript(new ClassPathResource("org/springframework/batch/core/schema-drop-mysql.sql"));
		databasePopulator.addScript(new ClassPathResource("org/springframework/batch/core/schema-mysql.sql"));
		DatabasePopulatorUtils.execute(databasePopulator, dataSource);
		return dataSource;
	}
	@Bean
	public ResourcelessTransactionManager txManager() {
		return new ResourcelessTransactionManager();
	}
	@Bean
	public JobRepository jbRepository(DataSource dataSource, ResourcelessTransactionManager transactionManager)
			throws Exception {
		JobRepositoryFactoryBean factory = new JobRepositoryFactoryBean();
		factory.setDatabaseType(DatabaseType.MYSQL.getProductName());
		factory.setDataSource(dataSource);
		factory.setTransactionManager(transactionManager);
		return factory.getObject();
	}
	@Bean
	public JobLauncher jbLauncher(JobRepository jobRepository) {
		SimpleJobLauncher jobLauncher = new SimpleJobLauncher();
		jobLauncher.setJobRepository(jobRepository);
		return jobLauncher;
	}
	@Bean
	public BeanWrapperFieldSetMapper<Person> beanWrapperFieldSetMapper() {
		BeanWrapperFieldSetMapper<Person> fieldSetMapper = new BeanWrapperFieldSetMapper<>();
		fieldSetMapper.setPrototypeBeanName("person");
		return fieldSetMapper;
	}
	@Bean
	public FlatFileItemReader<Person> fileItemReader(BeanWrapperFieldSetMapper<Person> beanWrapperFieldSetMapper) {
		FlatFileItemReader<Person> fileItemReader = new FlatFileItemReader<>();
		fileItemReader.setResource(new ClassPathResource("person.csv"));
		DelimitedLineTokenizer delimitedLineTokenizer = new DelimitedLineTokenizer();
		delimitedLineTokenizer.setNames("id", "firstName", "lastName");
		DefaultLineMapper<Person> defaultLineMapper = new DefaultLineMapper<>();
		defaultLineMapper.setLineTokenizer(delimitedLineTokenizer);
		defaultLineMapper.setFieldSetMapper(beanWrapperFieldSetMapper);
		fileItemReader.setLineMapper(defaultLineMapper);
		return fileItemReader;
	}
	@Bean
	public JdbcBatchItemWriter<Person> jdbcBatchItemWriter(DataSource dataSource,
			BeanPropertyItemSqlParameterSourceProvider<Person> sqlParameterSourceProvider) {
		JdbcBatchItemWriter<Person> jdbcBatchItemWriter = new JdbcBatchItemWriter<>();
		jdbcBatchItemWriter.setDataSource(dataSource);
		jdbcBatchItemWriter.setItemSqlParameterSourceProvider(sqlParameterSourceProvider);
		jdbcBatchItemWriter.setSql("insert into person(id,firstName,lastName) values (:id, :firstName, :lastName)");
		return jdbcBatchItemWriter;
	}
	@Bean
	public BeanPropertyItemSqlParameterSourceProvider<Person> beanPropertyItemSqlParameterSourceProvider() {
		return new BeanPropertyItemSqlParameterSourceProvider<>();
	}
	@Bean
	public Job jobCsvMysql(JobBuilderFactory jobBuilderFactory, Step step) {
		return jobBuilderFactory.get("jobCsvMysql").incrementer(new RunIdIncrementer()).flow(step).end().build();
	}
	@Bean
	public Step step1(StepBuilderFactory stepBuilderFactory, ResourcelessTransactionManager transactionManager,
			ItemReader<Person> reader, ItemWriter<Person> writer, ItemProcessor<Person, Person> processor) {
		return stepBuilderFactory.get("step1").transactionManager(transactionManager).<Person, Person>chunk(2)
				.reader(reader).processor(processor).writer(writer).build();
	}
}
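
The listing above writes through a JdbcBatchItemWriter, while the goal stated at the start is an XML file. As a hedged sketch only, an XML writer could be defined with Spring Batch's StaxEventItemWriter and spring-oxm's Jaxb2Marshaller. The class name XmlWriterConfigSketch, the bean names, the output file person.xml and the root tag persons are all assumptions introduced here, not taken from the original source. Because step1 injects a single ItemWriter<Person> by type, such a writer would need to stand in for the JDBC writer rather than sit beside it.

```java
package com.roytuts.spring.batch.config;

import org.springframework.batch.item.xml.StaxEventItemWriter;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.core.io.FileSystemResource;
import org.springframework.oxm.jaxb.Jaxb2Marshaller;

import com.roytuts.spring.batch.vo.Person;

// Hypothetical configuration class; a sketch, not the tutorial's actual code.
@Configuration
public class XmlWriterConfigSketch {

    // Marshaller that turns the JAXB-annotated Person class into XML fragments.
    @Bean
    public Jaxb2Marshaller personMarshaller() {
        Jaxb2Marshaller marshaller = new Jaxb2Marshaller();
        marshaller.setClassesToBeBound(Person.class);
        return marshaller;
    }

    // Writer that wraps every marshalled Person in a single root element.
    @Bean
    public StaxEventItemWriter<Person> xmlItemWriter(Jaxb2Marshaller personMarshaller) {
        StaxEventItemWriter<Person> writer = new StaxEventItemWriter<>();
        writer.setResource(new FileSystemResource("person.xml")); // assumed output location
        writer.setRootTagName("persons");                         // assumed root tag name
        writer.setMarshaller(personMarshaller);
        return writer;
    }
}
```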

Job: Spring Batch provides a default implementation of the Job interface in the SimpleJob class, which adds some standard functionality on top of Job. Here the JobBuilderFactory builds the job for us (a FlowJob, as the log output further below shows, because the job is defined with flow(step)), so there is no need to instantiate an implementation directly.

Step is a domain object that encapsulates an independent, sequential phase of a batch job. Therefore, every Job is composed entirely of one or more steps. A Step contains all of the information necessary to define and control the actual batch processing.

ItemReader is an abstraction that represents the retrieval of input for a Step, one item at a time.

ItemWriter is an abstraction that represents the output of a Step, one batch or chunk of items at a time. Generally, an item writer has no knowledge of the input it will receive next, only the item that was passed in its current invocation.

ItemProcessor is an abstraction that represents the business processing of an item. While the ItemReader reads one item, and the ItemWriter writes them, the ItemProcessor provides access to transform or apply other business processing. If, while processing the item, it is determined that the item is not valid, returning null indicates that the item should not be written out.

TransactionManager – Spring's PlatformTransactionManager that will be used to begin and commit transactions during processing.

Chunk – The number of items that will be processed before the transaction is committed.
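
To illustrate the chunk size, here is a simplified plain-Java sketch of a chunk-oriented loop: items are read one at a time until the chunk is full, each item is processed, and the whole chunk is then handed to the writer in one go. The class ChunkLoopSketch and its methods are hypothetical and leave out transactions, restartability and error handling, which the real Spring Batch step provides.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Locale;
import java.util.function.Function;

public class ChunkLoopSketch {

    // Reads up to chunkSize items, processes them, and records each chunk as one "write".
    static <I, O> List<List<O>> run(Iterator<I> reader, Function<I, O> processor, int chunkSize) {
        List<List<O>> writtenChunks = new ArrayList<>();
        while (reader.hasNext()) {
            // Read phase: one item at a time until the chunk is full or input ends.
            List<I> inputs = new ArrayList<>();
            while (inputs.size() < chunkSize && reader.hasNext()) {
                inputs.add(reader.next());
            }
            // Process phase: transform every item of the chunk.
            List<O> outputs = new ArrayList<>();
            for (I input : inputs) {
                outputs.add(processor.apply(input));
            }
            // Write phase: the whole chunk is written (and would be committed) together.
            writtenChunks.add(outputs);
        }
        return writtenChunks;
    }

    // Runs the loop over the seven first names from the CSV file with a chunk size of 2.
    static List<List<String>> demo() {
        List<String> names = Arrays.asList(
                "soumitra", "souvik", "arup", "suman", "debina", "liton", "debabrata");
        return run(names.iterator(), s -> s.toUpperCase(Locale.ROOT), 2);
    }

    public static void main(String[] args) {
        List<List<String>> chunks = demo();
        System.out.println(chunks.size() + " chunks: " + chunks);
    }
}
```

With seven rows and chunk(2), as in step1, this yields three full chunks and a final chunk holding the one remaining item.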

JobRepository is the persistence mechanism. It provides CRUD operations for JobLauncher, Job and Step implementations. When a Job is first launched, a JobExecution is obtained from the repository, and during the course of execution StepExecution and JobExecution implementations are persisted by passing them to the repository.

JobLauncher represents a simple interface for launching a Job with a given set of JobParameters.

Creating Spring Task Scheduler

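We need a Spring task scheduler to run the job repeatedly.

We schedule the task with a cron expression. Spring cron expressions have six fields (second, minute, hour, day of month, month, day of week), so */10 * * * * * in the class below runs the job every 10 seconds. A fresh timestamp job parameter is passed on every run, which makes Spring Batch treat each launch as a new job instance.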

package com.roytuts.spring.batch.task.scheduler;
import org.springframework.batch.core.Job;
import org.springframework.batch.core.JobExecution;
import org.springframework.batch.core.JobParametersBuilder;
import org.springframework.batch.core.launch.JobLauncher;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.scheduling.annotation.EnableScheduling;
import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Component;
@Component
@EnableScheduling
public class SpringBatchTaskScheduler {
	@Autowired
	private Job job;
	@Autowired
	private JobLauncher jobLauncher;
	@Scheduled(cron = "*/10 * * * * *")
	public void run() {
		try {
			JobExecution execution = jobLauncher.run(job,
					new JobParametersBuilder().addLong("timestamp", System.currentTimeMillis()).toJobParameters());
			System.out.println("Job Status : " + execution.getStatus());
		} catch (Exception ex) {
			ex.printStackTrace();
		}
		System.out.println("Done");
	}
}
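
To show what "every 10 seconds" means for the expression */10 * * * * *, the following plain-Java sketch computes the next firing time as the next second-of-minute that is a multiple of 10 (0, 10, 20 and so on). The class EveryTenSeconds is a hypothetical illustration of the arithmetic only; it does not use or reproduce Spring's cron parser.

```java
import java.time.LocalDateTime;
import java.time.temporal.ChronoUnit;

public class EveryTenSeconds {

    // Returns the first instant strictly after 'now' whose second is a multiple of 10.
    static LocalDateTime nextFire(LocalDateTime now) {
        LocalDateTime base = now.truncatedTo(ChronoUnit.SECONDS);
        int secondsToAdd = 10 - (base.getSecond() % 10);
        return base.plusSeconds(secondsToAdd);
    }

    public static void main(String[] args) {
        LocalDateTime started = LocalDateTime.of(2019, 5, 25, 11, 37, 43);
        LocalDateTime first = nextFire(started);
        LocalDateTime second = nextFire(first);
        System.out.println(first);  // prints 2019-05-25T11:37:50
        System.out.println(second); // prints 2019-05-25T11:38
    }
}
```

This is why the runs align to wall-clock seconds ending in 0 rather than to 10 seconds after the application starts.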

Creating Main Class

Create the class below to launch the Spring Boot application, which in turn starts the scheduler.

package com.roytuts.spring.batch;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
@SpringBootApplication(scanBasePackages = "com.roytuts.spring.batch")
public class SpringBatch {
	public static void main(String[] args) {
		SpringApplication.run(SpringBatch.class, args);
	}
}

Testing the Application

Run the above class and you will see output like the following.

 11:37:41.197  INFO 16644 --- [           main] o.s.j.d.e.EmbeddedDatabaseFactory        : Starting embedded database:
 11:37:43.540  INFO 16644 --- [           main] o.s.b.c.l.support.SimpleJobLauncher      : Job: [FlowJob: [name=jobCsvXml]] launched with the following parameters: [{run.id=1}]
 11:37:43.580  INFO 16644 --- [           main] o.s.batch.core.job.SimpleStepHandler     : Executing step: [step1]
Processing: Person [id=1000, firstName=soumitra, lastName=roy]
Processing: Person [id=1001, firstName=souvik, lastName=sanyal]
Processing: Person [id=1002, firstName=arup, lastName=chatterjee]
Processing: Person [id=1003, firstName=suman, lastName=mukherjee]
Processing: Person [id=1004, firstName=debina, lastName=guha]
Processing: Person [id=1005, firstName=liton, lastName=sarkar]
Processing: Person [id=1006, firstName=debabrata, lastName=poddar]
 11:37:43.769  INFO 16644 --- [           main] o.s.b.c.l.support.SimpleJobLauncher      : Job: [FlowJob: [name=jobCsvXml]] completed with the following parameters: [{run.id=1}] and the following status: [COMPLETED]
 11:37:50.011  INFO 16644 --- [   scheduling-1] o.s.b.c.l.support.SimpleJobLauncher      : Job: [FlowJob: [name=jobCsvXml]] launched with the following parameters: [{timestamp=1558764470001}]
 11:37:50.026  INFO 16644 --- [   scheduling-1] o.s.batch.core.job.SimpleStepHandler     : Executing step: [step1]
Processing: Person [id=1000, firstName=soumitra, lastName=roy]
Processing: Person [id=1001, firstName=souvik, lastName=sanyal]
Processing: Person [id=1002, firstName=arup, lastName=chatterjee]
Processing: Person [id=1003, firstName=suman, lastName=mukherjee]
Processing: Person [id=1004, firstName=debina, lastName=guha]
Processing: Person [id=1005, firstName=liton, lastName=sarkar]
Processing: Person [id=1006, firstName=debabrata, lastName=poddar]
 11:37:50.105  INFO 16644 --- [   scheduling-1] o.s.b.c.l.support.SimpleJobLauncher      : Job: [FlowJob: [name=jobCsvXml]] completed with the following parameters: [{timestamp=1558764470001}] and the following status: [COMPLETED]
Job Status : COMPLETED
Done
 11:38:00.021  INFO 16644 --- [   scheduling-1] o.s.b.c.l.support.SimpleJobLauncher      : Job: [FlowJob: [name=jobCsvXml]] launched with the following parameters: [{timestamp=1558764480011}]
 11:38:00.039  INFO 16644 --- [   scheduling-1] o.s.batch.core.job.SimpleStepHandler     : Executing step: [step1]
Processing: Person [id=1000, firstName=soumitra, lastName=roy]
Processing: Person [id=1001, firstName=souvik, lastName=sanyal]
Processing: Person [id=1002, firstName=arup, lastName=chatterjee]
Processing: Person [id=1003, firstName=suman, lastName=mukherjee]
Processing: Person [id=1004, firstName=debina, lastName=guha]
Processing: Person [id=1005, firstName=liton, lastName=sarkar]
Processing: Person [id=1006, firstName=debabrata, lastName=poddar]
 11:38:00.096  INFO 16644 --- [   scheduling-1] o.s.b.c.l.support.SimpleJobLauncher      : Job: [FlowJob: [name=jobCsvXml]] completed with the following parameters: [{timestamp=1558764480011}] and the following status: [COMPLETED]
Job Status : COMPLETED
Done
 11:38:10.009  INFO 16644 --- [   scheduling-1] o.s.b.c.l.support.SimpleJobLauncher      : Job: [FlowJob: [name=jobCsvXml]] launched with the following parameters: [{timestamp=1558764490001}]
 11:38:10.021  INFO 16644 --- [   scheduling-1] o.s.batch.core.job.SimpleStepHandler     : Executing step: [step1]
Processing: Person [id=1000, firstName=soumitra, lastName=roy]
Processing: Person [id=1001, firstName=souvik, lastName=sanyal]
Processing: Person [id=1002, firstName=arup, lastName=chatterjee]
Processing: Person [id=1003, firstName=suman, lastName=mukherjee]
Processing: Person [id=1004, firstName=debina, lastName=guha]
Processing: Person [id=1005, firstName=liton, lastName=sarkar]
Processing: Person [id=1006, firstName=debabrata, lastName=poddar]
 11:38:10.068  INFO 16644 --- [   scheduling-1] o.s.b.c.l.support.SimpleJobLauncher      : Job: [FlowJob: [name=jobCsvXml]] completed with the following parameters: [{timestamp=1558764490001}] and the following status: [COMPLETED]
Job Status : COMPLETED
Done
...
...

In the above output you can see the job name, the step name and each row of the CSV file as it is processed.

The first run, on the main thread with run.id=1, is Spring Boot launching the job at startup. The later runs, on the scheduling-1 thread with a timestamp parameter, come from the scheduler: step1 is executed every 10 seconds until you stop the application.

If you use a database other than an in-memory one, such as MySQL or Oracle, you can also see that the SQL scripts have been executed and that the Spring Batch metadata tables (for example batch_job_instance, batch_job_execution and batch_step_execution) have been created in the database with the job details.

You will also see that the batch_job_execution table has been populated with the execution status and timestamps of each run.

Source Code

You can download source code.

Thanks for reading.
