Cloud-Native Batch and Streaming Applications

Amitpal Singh Dhillon
Aug 19, 2021

Batch applications run without end-user interaction, in headless mode, often as scheduled tasks. They handle high-volume, repetitive work such as end-of-day payment reports, offline end-of-day processing, bank statement reconciliation checks, and risk calculations: non-interactive tasks that need tight control over when processing starts and that must complete reliably within business deadlines.


Maintaining such applications comes with several challenges, including repeated program restarts and high CPU and memory utilization costs. At the same time, there is pressure to speed up banking operations toward end-to-end, real-time transactions.

Optimize and Accelerate Batch Processing

Using the combination of GraalVM, Spring Batch, Micronaut, and Oracle Cloud Infrastructure, it's possible to optimize and accelerate these applications.

Here is a sample Spring Batch application, used for this test, compiled to a GraalVM Native Image and deployed on Oracle Cloud Infrastructure.
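The sample boils down to a Spring Boot application that registers a job with a single tasklet step. Here is a minimal sketch of that kind of configuration, assuming Spring Batch 4.x and Spring Boot 2.5; class, step, and job names are illustrative, not the exact sample code:

// Minimal Spring Batch job with a single tasklet step (illustrative sketch).
import org.springframework.batch.core.Job;
import org.springframework.batch.core.Step;
import org.springframework.batch.core.configuration.annotation.EnableBatchProcessing;
import org.springframework.batch.core.configuration.annotation.JobBuilderFactory;
import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
import org.springframework.batch.repeat.RepeatStatus;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.context.annotation.Bean;

@SpringBootApplication
@EnableBatchProcessing
public class BatchApplication {

    @Bean
    public Step step1(StepBuilderFactory steps) {
        // A tasklet step: one unit of non-interactive work, e.g. an end-of-day report.
        return steps.get("step1")
                .tasklet((contribution, chunkContext) -> {
                    System.out.println("Batch step executed");
                    return RepeatStatus.FINISHED;
                })
                .build();
    }

    @Bean
    public Job job(JobBuilderFactory jobs, Step step1) {
        // Spring Boot launches this job automatically at startup.
        return jobs.get("job").start(step1).build();
    }

    public static void main(String[] args) {
        // Exit once the job has completed, as a batch process should.
        System.exit(SpringApplication.exit(SpringApplication.run(BatchApplication.class, args)));
    }
}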

Performance Outcome

This is the native-image build process for the batch application; some highlights of the steps are shown below.

[opc@instance-20201214-2208 batch]$ mvn spring-boot:build-image
[INFO] Building batch 0.0.1-SNAPSHOT
[INFO] [creator] Executing native-image -H:+StaticExecutableWithDynamicLibC -H:Name=/layers/paketo-buildpacks_native-image/native-image/com.example.batch.BatchApplication -cp /workspace:/workspace/BOOT-INF/classes:/workspace/BOOT-INF/lib/spring-native-0.10.2-SNAPSHOT.jar [output:11731]
...
[INFO] [creator] [/layers/paketo-buildpacks_native-image/native-image/com.example.batch.BatchApplication:132] classlist: 7,824.35 ms, 1.73 GB
[INFO] [creator] [/layers/paketo-buildpacks_native-image/native-image/com.example.batch.BatchApplication:132] (cap): 834.54 ms, 2.37 GB
[INFO] [creator] [/layers/paketo-buildpacks_native-image/native-image/com.example.batch.BatchApplication:132] setup: 4,362.58 ms, 2.37 GB
...
[INFO] [creator] # Printing build artifacts to: com.example.batch.BatchApplication.build_artifacts.txt
[INFO] [creator] [/layers/paketo-buildpacks_native-image/native-image/com.example.batch.BatchApplication:132] [total]: 404,477.86 ms, 6.55 GB
# Printing build artifacts to: output.build_artifacts.txt
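For context, what turns mvn spring-boot:build-image into a native-image build is the buildpack configuration on the Spring Boot Maven plugin. A sketch of the relevant piece, assuming the Paketo tiny builder used by the spring-native samples (builder name and versions may differ in your project):

<!-- Illustrative snippet from pom.xml: asks the buildpack to run GraalVM native-image
     instead of packaging a JVM application. -->
<plugin>
  <groupId>org.springframework.boot</groupId>
  <artifactId>spring-boot-maven-plugin</artifactId>
  <configuration>
    <image>
      <builder>paketobuildpacks/builder:tiny</builder>
      <env>
        <BP_NATIVE_IMAGE>true</BP_NATIVE_IMAGE>
      </env>
    </image>
  </configuration>
</plugin>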

The cloud-native batch application produced by GraalVM Enterprise Native Image starts up almost instantly (0.056 seconds) and completes the batch job in just 2 ms.

[opc@instance-20201214-2208 batch]$ docker-compose up
batch_1 | 2021-08-18 07:15:45.874  INFO 1 [main] com.example.batch.BatchApplication : Starting BatchApplication using Java 1.8.0_292 on e069750ffacb with PID 1 (/workspace/com.example.batch.BatchApplication started by cnb in /workspace)
batch_1 | 2021-08-18 07:15:45.916  INFO 1 [main] com.example.batch.BatchApplication : Started BatchApplication in 0.056 seconds (JVM running for 0.058)
batch_1 | 2021-08-18 07:15:45.919  INFO 1 [main] o.s.b.c.l.support.SimpleJobLauncher : Job: [SimpleJob: [name=job]] completed with the following parameters: [{}] and the following status: [COMPLETED] in 2ms

[opc@instance-20201214-2208 ~]$ ps -o rss 1 | tail -n1
[opc@instance-20201214-2208 ~]$ bc <<< "scale=1; 10324/1024"
RSS (memory): ~10 MB

[opc@instance-20201214-2208 batch]$ docker-compose images
Container   Repository   Tag              Image Id       Size
--------------------------------------------------------------
batch_1     batch        0.0.1-SNAPSHOT   069d3320b6f0   89.04 MB

With GraalVM Enterprise Native Image, the startup and execution of the batch job improve drastically: roughly 30x faster job execution (2 ms vs. 68 ms) and approximately 20x less memory (about 10 MB vs. about 218 MB RSS) compared to the default JDK. The native-image container comes in under 90 MB, about 50% smaller than the default JDK deployment of 16 MB (for the jar file) plus roughly 150 MB (for the full JDK). So three wins in a row for the cloud-native batch application.

[opc@instance-20201214-2208 target]$ java -jar batch-0.0.1-SNAPSHOT.jar
2021-08-18 07:19:46.947  INFO 25425 [main] com.example.batch.BatchApplication : Starting BatchApplication v0.0.1-SNAPSHOT using Java 11.0.12 on instance-20201214-2208 with PID 25425 (/home/opc/spring-native/samples/batch/target/batch-0.0.1-SNAPSHOT.jar started by opc in /home/opc/spring-native/samples/batch/target)
2021-08-18 07:19:48.918  INFO 25425 [main] com.example.batch.BatchApplication : Started BatchApplication in 2.652 seconds (JVM running for 3.392)
2021-08-18 07:19:49.091  INFO 25425 [main] o.s.b.c.l.support.SimpleJobLauncher : Job: [SimpleJob: [name=job]] completed with the following parameters: [{}] and the following status: [COMPLETED] in 68ms

[opc@instance-20201214-2208 ~]$ ps -o rss 25425 | tail -n1
[opc@instance-20201214-2208 ~]$ bc <<< "scale=1; 223780/1024"
RSS (memory): ~218.5 MB

[opc@instance-20201214-2208 target]$ ls -lh
batch-0.0.1-SNAPSHOT.jar (size: 16 MB)

Command-Line (CLI) Applications

In certain cases, you may wish to create standalone command-line (CLI) applications that interact with your microservice infrastructure. Examples include scheduled tasks, batch applications, and general-purpose command-line tools.

Figure: Micronaut CLI Application via VSCode

You can create a Micronaut CLI application via Visual Studio Code and add support for Picocli, a command-line parser that provides usage help, autocompletion, nested subcommands, and an annotations API for building command-line applications. Micronaut Data 3.0 also recently received some significant updates: repositories now support batch insert, update, and delete, even with a custom query.
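As a rough illustration (the command and option names below are made up, not taken from a specific project), a Micronaut CLI application with Picocli comes down to an annotated command class run through PicocliRunner:

// Illustrative Micronaut + Picocli command; names are hypothetical.
import io.micronaut.configuration.picocli.PicocliRunner;
import picocli.CommandLine.Command;
import picocli.CommandLine.Option;

@Command(name = "greet", description = "Example CLI command", mixinStandardHelpOptions = true)
public class GreetCommand implements Runnable {

    @Option(names = {"-n", "--name"}, description = "Who to greet", defaultValue = "world")
    String name;

    public static void main(String[] args) {
        // PicocliRunner starts a Micronaut ApplicationContext, so services can be @Inject-ed into the command.
        PicocliRunner.run(GreetCommand.class, args);
    }

    @Override
    public void run() {
        System.out.println("Hello, " + name + "!");
    }
}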

Here is a real-world weather application built with GraalVM Native Image, Micronaut, and Picocli. On a related note, Helidon and JBatch (the JSR-352/Jakarta Batch reference implementation) can be used together to execute standards-based batch jobs.
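For the Helidon/JBatch route, a job is declared in a Job XML under META-INF/batch-jobs and its steps are implemented as batchlets. A minimal sketch using the standard JSR-352 API (job and class names here are illustrative) looks like this:

// Illustrative JSR-352 (Jakarta Batch) batchlet, as executed by JBatch.
// The job definition lives in a separate Job XML, e.g. META-INF/batch-jobs/reportJob.xml,
// which references this class as a batchlet step.
import javax.batch.api.AbstractBatchlet;
import javax.batch.runtime.BatchRuntime;

public class ReportBatchlet extends AbstractBatchlet {

    @Override
    public String process() {
        // Non-interactive work, e.g. producing an end-of-day report.
        System.out.println("Running batch step");
        return "COMPLETED"; // becomes the step's exit status
    }

    public static void main(String[] args) {
        // Start the job programmatically through the batch runtime's JobOperator.
        long executionId = BatchRuntime.getJobOperator().start("reportJob", null);
        System.out.println("Started job execution " + executionId);
    }
}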


Summary

Looking to the future, banks and telcos are transforming their core systems to bridge the two worlds of batch and stream processing, for example so that fraudulent transactions can be detected and stopped before they even complete. For such event-driven applications, given the huge amount of data being generated and moved, it's important to manage memory (footprint) costs and keep data flowing through the system at the highest possible rate (throughput) by using a cloud-native architecture: Micronaut with Kafka and GraalVM Enterprise.
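As a sketch of that last combination (the topic, group, and method names below are invented for illustration), a Micronaut Kafka consumer that inspects each transaction as it streams in could look like this:

// Illustrative Micronaut Kafka consumer; topic and names are hypothetical.
import io.micronaut.configuration.kafka.annotation.KafkaKey;
import io.micronaut.configuration.kafka.annotation.KafkaListener;
import io.micronaut.configuration.kafka.annotation.Topic;

@KafkaListener(groupId = "fraud-check")
public class TransactionListener {

    @Topic("transactions")
    public void onTransaction(@KafkaKey String accountId, String payload) {
        // Score the event as it arrives instead of waiting for an end-of-day batch run.
        if (looksFraudulent(payload)) {
            System.out.println("Flagging suspicious transaction for account " + accountId);
        }
    }

    private boolean looksFraudulent(String payload) {
        // Placeholder rule; a real system would call a risk/scoring service here.
        return payload.contains("\"amount\":999999");
    }
}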

References

There are several customers who could benefit from this:

A manufacturing and wholesale distribution company doing manual extract/batch file processing with Lotus Notes, looking for a fast and simple cloud-native integration capability.

A manufacturer and distributor of electricity and gas that needs to integrate batch HR information coming from Fusion HCM with its central SAP HR system using a cloud-native framework.

Audit records are available through an authenticated, filterable query API or can be retrieved as batched files from Oracle Cloud Infrastructure Object Storage.

An electric company moving its integrations to the cloud, supporting real-time and batch data transformations with full visibility and automation, reducing cost.

A large IT services company managing hundreds of applications that use LDAP and synchronize batch processes in offline and online modes.

A large financial services company that must maintain batch processes causing high CPU usage. A possible solution substantially reduces a long-running batch workload (from 30 minutes to under 15 minutes, roughly 2x faster) combined with 25% less CPU usage for their risk-calculation framework deployed in a cluster, resulting in faster, more efficient applications.


Amitpal Singh Dhillon

vCISO; previously at Oracle, Sourcefire, Cisco Systems, and Applied Materials.