Distributed Scheduling
In Spring Boot, we can use the @Scheduled annotation to execute tasks at a scheduled time. However, when multiple instances of the application are running and a scheduled task must run only once across all of them, we need distributed scheduling.
Several libraries support distributed scheduling, such as ShedLock, Quartz, and Akka Scheduler.
ShedLock
ShedLock is a library that provides distributed locking for scheduled tasks. It ensures that a scheduled task (such as a cron job) is executed at most once at a time in a distributed environment, even if the application is running on multiple nodes. It uses an external storage mechanism (such as a database or cache) to keep track of acquired locks. ShedLock supports multiple lock providers, including JDBC, Redis, and ZooKeeper.
How ShedLock Works
Lock Provider: ShedLock relies on a lock provider, which is a bean configured in the Spring Boot application. This bean depends on the type of storage chosen (e.g., database, Redis).
Acquiring the Lock: When a scheduled task is due for execution, each Spring Boot instance tries to acquire a lock for that task through the lock provider.
Shared Storage: ShedLock uses the lock provider to interact with the chosen storage system (e.g., a database table or a Redis key). If no other instance currently holds the lock, the current instance acquires it and holds it while the task runs, up to a configured maximum duration (lockAtMostFor).
Task Execution: Once a lock is acquired, only that specific instance proceeds to execute the scheduled task.
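Releasing the Lock: When the task finishes, the lock is released through the lock provider, allowing any instance to acquire it for the next scheduled run. If the instance holding the lock dies before releasing it, the lock expires once the configured maximum duration has passed.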
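The acquire, execute, and release steps above can be illustrated with a small simulation in plain Java. This is only a sketch of the idea, not ShedLock's implementation: InMemoryLockStore, LockDemo, the task name reportTask, and the node names are all hypothetical, and a ConcurrentHashMap stands in for the shared database table or Redis key.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical in-memory stand-in for the shared lock storage (not part of ShedLock). */
final class InMemoryLockStore {
    // Maps a task name to the instant until which its lock is held.
    private final ConcurrentHashMap<String, Instant> lockedUntil = new ConcurrentHashMap<>();

    /** Atomically takes the lock if it is free or expired; returns true on success. */
    boolean tryLock(String taskName, Instant now, Duration lockAtMostFor) {
        boolean[] acquired = {false};
        lockedUntil.compute(taskName, (name, until) -> {
            if (until == null || !until.isAfter(now)) {
                acquired[0] = true;
                return now.plus(lockAtMostFor); // hold the lock up to the maximum duration
            }
            return until; // still held by another instance
        });
        return acquired[0];
    }

    /** Releases the lock by making it expire at the given instant. */
    void unlock(String taskName, Instant now) {
        lockedUntil.put(taskName, now);
    }
}

final class LockDemo {
    /** Lets every instance try to run the task; returns the instances that actually ran it. */
    static List<String> runOnce(InMemoryLockStore store, List<String> instances, Instant now) {
        List<String> executed = new ArrayList<>();
        for (String instance : instances) {
            if (store.tryLock("reportTask", now, Duration.ofSeconds(30))) {
                executed.add(instance); // only the lock holder executes the task
            }
        }
        return executed;
    }

    public static void main(String[] args) {
        InMemoryLockStore store = new InMemoryLockStore();
        Instant start = Instant.parse("2024-01-01T00:00:00Z");

        // Three instances fire at the same scheduled time; only the first gets the lock.
        System.out.println(runOnce(store, List.of("node-1", "node-2", "node-3"), start)); // prints [node-1]

        // node-1 finishes after 5 seconds and releases the lock.
        store.unlock("reportTask", start.plusSeconds(5));

        // On the next scheduled run, another instance can acquire it.
        System.out.println(runOnce(store, List.of("node-2", "node-3"), start.plusSeconds(60))); // prints [node-2]
    }
}
```

In a real deployment the atomic check-and-set happens in the shared store (for example, an insert or conditional update on a database row), which is what makes the lock visible to all instances.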
For more details, see the ShedLock repository: https://github.com/lukas-krecan/ShedLock
Use cases
Handling Recurring Jobs with External Dependencies:
Generating and Sending Reports: If there is a task that involves fetching data from external APIs, generating reports, and sending them via email, scheduler locks ensure only one instance sends those emails, preventing duplicates and potential overload.
External Data Updates: Tasks that pull data from external sources and update your database can benefit from scheduler locks to avoid conflicting updates and ensure consistency.
Long-Running Tasks with Progress Updates:
Data Processing and Transformation: For tasks that involve processing large datasets or lengthy file operations, scheduler locks prevent multiple instances from starting the same work, saving resources and ensuring correct completion.
Batch Updates: Tasks that perform batch updates or data migrations can leverage scheduler locks to prevent conflicts and ensure data integrity.
Example: Using JDBC (MySQL) as a lock provider.
Add the required dependency in pom.xml file.
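A minimal sketch of the ShedLock dependencies for a JDBC lock provider might look like the fragment below. The shedlock.version property is a hypothetical placeholder to be set to the release you use; the fragment also assumes the project already includes a Spring JDBC starter and a MySQL driver.

```xml
<!-- Spring integration for ShedLock (provides @SchedulerLock and @EnableSchedulerLock) -->
<dependency>
    <groupId>net.javacrumbs.shedlock</groupId>
    <artifactId>shedlock-spring</artifactId>
    <!-- shedlock.version is an assumed property defined elsewhere in the pom -->
    <version>${shedlock.version}</version>
</dependency>

<!-- JdbcTemplate-based lock provider -->
<dependency>
    <groupId>net.javacrumbs.shedlock</groupId>
    <artifactId>shedlock-provider-jdbc-template</artifactId>
    <version>${shedlock.version}</version>
</dependency>
```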
Connect to the MySQL database instance (for example, via MySQL Workbench) and manually create the table that ShedLock will use to store locks.
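The table definition below follows the MySQL schema suggested in the ShedLock documentation; shedlock is the default table name the JDBC provider expects. Treat it as a starting point and compare it against the documentation for the ShedLock version you use.

```sql
-- One row per lock name; lock_until decides whether the lock is currently held.
CREATE TABLE shedlock (
    name       VARCHAR(64)  NOT NULL,
    lock_until TIMESTAMP(3) NOT NULL,
    locked_at  TIMESTAMP(3) NOT NULL DEFAULT CURRENT_TIMESTAMP(3),
    locked_by  VARCHAR(255) NOT NULL,
    PRIMARY KEY (name)
);
```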
Create the scheduler lock configuration class, ShedlockConfig.java.
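One possible shape for this class is sketched below, assuming Spring Boot auto-configures a DataSource from the application properties. The package layout is omitted, and usingDbTime() is an optional choice that makes the lock rely on the database clock rather than each node's clock.

```java
import javax.sql.DataSource;

import net.javacrumbs.shedlock.core.LockProvider;
import net.javacrumbs.shedlock.provider.jdbctemplate.JdbcTemplateLockProvider;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.jdbc.core.JdbcTemplate;

@Configuration
public class ShedlockConfig {

    // Registers the lock provider bean that ShedLock uses to read and write the shedlock table.
    @Bean
    public LockProvider lockProvider(DataSource dataSource) {
        return new JdbcTemplateLockProvider(
                JdbcTemplateLockProvider.Configuration.builder()
                        .withJdbcTemplate(new JdbcTemplate(dataSource))
                        .usingDbTime() // use the database server's time to avoid clock drift between nodes
                        .build());
    }
}
```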
Create ScheduledTasks.java file which has the logic to be performed at the scheduled time.
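A minimal sketch of such a class follows. The method name, the lock name reportTask, the cron expression, and the lock durations are illustrative assumptions; the task body is a placeholder log statement.

```java
import net.javacrumbs.shedlock.spring.annotation.SchedulerLock;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Component;

@Component
public class ScheduledTasks {

    private static final Logger log = LoggerFactory.getLogger(ScheduledTasks.class);

    // Fires every minute on every instance; only the instance that acquires the
    // "reportTask" lock actually runs the method body.
    @Scheduled(cron = "0 * * * * *")
    @SchedulerLock(
            name = "reportTask",      // hypothetical lock name; must be unique per task
            lockAtMostFor = "PT50S",  // safety limit in case the node dies while holding the lock
            lockAtLeastFor = "PT10S") // keeps the lock briefly so fast tasks are not re-run by other nodes
    public void generateReport() {
        log.info("Running scheduled task on this instance");
    }
}
```

The lockAtMostFor value should be longer than the task normally takes, since another instance may start the task once that limit has passed.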
Enable scheduling and the scheduler lock in the main application class.
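The main class could look roughly like this. The class name Application and the default lock duration are assumptions chosen for illustration.

```java
import net.javacrumbs.shedlock.spring.annotation.EnableSchedulerLock;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.scheduling.annotation.EnableScheduling;

@SpringBootApplication
@EnableScheduling // turns on processing of @Scheduled methods
@EnableSchedulerLock(defaultLockAtMostFor = "PT10M") // default used when a task does not set lockAtMostFor
public class Application {

    public static void main(String[] args) {
        SpringApplication.run(Application.class, args);
    }
}
```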
Add the database-related properties to the application.yml file.
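A sketch of the datasource properties is shown below. The host, port, database name scheduler_demo, and the DB_USERNAME and DB_PASSWORD environment variables are placeholders, not values from the original example.

```yaml
spring:
  datasource:
    # Placeholder connection details; point this at the database that contains the shedlock table.
    url: jdbc:mysql://localhost:3306/scheduler_demo
    username: ${DB_USERNAME}
    password: ${DB_PASSWORD}
    driver-class-name: com.mysql.cj.jdbc.Driver
```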
Run the application and verify the logs and shedlock table content.