spring-data-jpa-r2dbc-mysql-stream-million-records
In this project, we will implement two Spring Boot Java Web applications, streamer-data-jpa and streamer-data-r2dbc. Both fetch 1 million customer records from MySQL and stream them to Kafka. The main goal is to compare the applications' performance and resource utilization.
Applications
- streamer-data-jpa
  Spring Boot Web Java application that connects to MySQL using Spring Data JPA and to Kafka.
  It provides some endpoints such as:
  - PATCH api/customers/stream-naive[?limit=x]: to stream customer records using a naive implementation with Spring Data JPA;
  - PATCH api/customers/stream[?limit=x]: to stream customer records using a better implementation with Java 8 Streams and Spring Data JPA, as explained in this article;
  - PATCH api/customers/load?amount=x: to create a specific amount of random customer records.
- streamer-data-r2dbc
  Spring Boot Web Java application that connects to MySQL using Spring Data R2DBC and to Kafka.
  It provides some endpoints such as:
  - PATCH api/customers/stream[?limit=x]: to stream customer records;
  - PATCH api/customers/load?amount=x: to create a specific amount of random customer records.
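The difference between the two streamer-data-jpa streaming endpoints can be illustrated without any framework. The sketch below is not the project's code: it assumes the naive variant materializes every record in a list before publishing, while the stream variant handles one record at a time. `Customer`, `fetchLazily`, `streamNaive`, `streamLazily` and the `publish` callback are hypothetical names, and an in-memory generator stands in for MySQL and Kafka.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Consumer;
import java.util.stream.Collectors;
import java.util.stream.LongStream;
import java.util.stream.Stream;

class StreamingSketch {

    // Hypothetical stand-in for the customer entity; not the project's actual class.
    record Customer(long id, String name) {}

    // Stand-in for a lazily evaluated query result: rows are produced on demand.
    static Stream<Customer> fetchLazily(long limit) {
        return LongStream.rangeClosed(1, limit)
                .mapToObj(i -> new Customer(i, "customer-" + i));
    }

    // Naive approach (assumed): collect every row into a List, then publish.
    // Memory use grows with the number of rows.
    static long streamNaive(long limit, Consumer<Customer> publish) {
        List<Customer> all = fetchLazily(limit).collect(Collectors.toList());
        all.forEach(publish);
        return all.size();
    }

    // Stream approach (assumed): publish each row as it is produced,
    // so rows already published can be garbage collected.
    static long streamLazily(long limit, Consumer<Customer> publish) {
        AtomicLong count = new AtomicLong();
        try (Stream<Customer> rows = fetchLazily(limit)) {
            rows.forEach(c -> {
                publish.accept(c);
                count.incrementAndGet();
            });
        }
        return count.get();
    }

    public static void main(String[] args) {
        AtomicLong sent = new AtomicLong();
        Consumer<Customer> publish = c -> sent.incrementAndGet(); // stand-in for a Kafka send
        System.out.println("naive:  " + streamNaive(1_000, publish));
        System.out.println("stream: " + streamLazily(1_000, publish));
        System.out.println("published in total: " + sent.get());
    }
}
```

With a real database, the lazy variant also depends on the driver and the persistence layer actually fetching rows incrementally, which this sketch does not model.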
Prerequisites
Start Environment
- Open a terminal and, inside the spring-data-jpa-r2dbc-mysql-stream-million-records root folder, run
  docker-compose up -d
- Wait for the Docker containers to be up and running. To check, run
  docker-compose ps
- Once MySQL, Kafka and Zookeeper are up and running, run the following scripts
  - To create two Kafka topics
    ./init-kafka-topics.sh
  - To initialize the MySQL database
    ./init-mysql-db.sh 1M
    Note: we can provide the following load amount values: 0, 100k, 200k, 500k or 1M
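As a complement to docker-compose ps, a quick TCP probe can confirm that the MySQL and Kafka ports are reachable from the host before running the scripts or the applications. This is a minimal sketch, assuming the default host and ports listed under Environment Variables (localhost:3306 for MySQL and localhost:29092 for Kafka); `PortProbe` and `isOpen` are hypothetical names.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

class PortProbe {

    // Returns true if a TCP connection to host:port succeeds within timeoutMillis.
    static boolean isOpen(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Connection refused, timed out, or host unreachable.
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumed defaults; adjust if the containers are mapped to other ports.
        System.out.println("MySQL (localhost:3306):  " + isOpen("localhost", 3306, 1000));
        System.out.println("Kafka (localhost:29092): " + isOpen("localhost", 29092, 1000));
    }
}
```

An open port only shows that something is listening; it does not guarantee that the service has finished starting.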
Run applications with Maven
Inside spring-data-jpa-r2dbc-mysql-stream-million-records, run the following Maven commands in different terminals
- streamer-data-jpa
  ./mvnw clean spring-boot:run --projects streamer-data-jpa
- streamer-data-r2dbc
  ./mvnw clean spring-boot:run --projects streamer-data-r2dbc
Run applications as Docker containers
- Build Docker Images
  - In a terminal, make sure you are in the spring-data-jpa-r2dbc-mysql-stream-million-records root folder
  - Run the following script to build the Docker images
    ./docker-build.sh
- Environment Variables
  - streamer-data-jpa

    | Environment Variable | Description                                                        |
    | -------------------- | ------------------------------------------------------------------ |
    | MYSQL_HOST           | Specify host of the MySQL database to use (default localhost)      |
    | MYSQL_PORT           | Specify port of the MySQL database to use (default 3306)           |
    | KAFKA_HOST           | Specify host of the Kafka message broker to use (default localhost) |
    | KAFKA_PORT           | Specify port of the Kafka message broker to use (default 29092)    |

  - streamer-data-r2dbc

    | Environment Variable | Description                                                        |
    | -------------------- | ------------------------------------------------------------------ |
    | MYSQL_HOST           | Specify host of the MySQL database to use (default localhost)      |
    | MYSQL_PORT           | Specify port of the MySQL database to use (default 3306)           |
    | KAFKA_HOST           | Specify host of the Kafka message broker to use (default localhost) |
    | KAFKA_PORT           | Specify port of the Kafka message broker to use (default 29092)    |
- Start Docker Containers
  Run the following docker run commands in different terminals
  - streamer-data-jpa
    docker run --rm --name streamer-data-jpa -p 9080:9080 \
      -e MYSQL_HOST=mysql -e KAFKA_HOST=kafka -e KAFKA_PORT=9092 \
      --network spring-data-jpa-r2dbc-mysql-stream-million-records_default \
      ivanfranchin/streamer-data-jpa:1.0.0
  - streamer-data-r2dbc
    docker run --rm --name streamer-data-r2dbc -p 9081:9081 \
      -e MYSQL_HOST=mysql -e KAFKA_HOST=kafka -e KAFKA_PORT=9092 \
      --network spring-data-jpa-r2dbc-mysql-stream-million-records_default \
      ivanfranchin/streamer-data-r2dbc:1.0.0
Simulation with 1 million customer records
Previously, during the Start Environment step, we initialized MySQL with 1 million customer records.
Resource Consumption Monitoring Tool
- Running applications with Maven
  We will use the JConsole tool. To run it, open a new terminal and run
  jconsole
- Running applications as Docker containers
  We will use the cAdvisor tool. In a browser, access
  - http://localhost:8080/docker/ to explore the running containers;
  - http://localhost:8080/docker/container-name to go directly to the info of a specific container.
Streaming customer records
In another terminal, run the following curl commands to trigger the streaming of customer records from MySQL to Kafka. When each curl command finishes, the total time it took (in seconds) is displayed.
We can monitor the number of messages, and the messages themselves, being streamed using Kafdrop – Kafka Web UI at http://localhost:9000
- streamer-data-jpa
  Naive implementation
  curl -w "Response Time: %{time_total}s" -s -X PATCH localhost:9080/api/customers/stream-naive
  Better implementation
  curl -w "Response Time: %{time_total}s" -s -X PATCH localhost:9080/api/customers/stream
- streamer-data-r2dbc
  curl -w "Response Time: %{time_total}s" -s -X PATCH localhost:9081/api/customers/stream
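The same measurement can also be taken from Java. The sketch below uses the JDK's java.net.http.HttpClient to send the PATCH request and time it, assuming the application listens on localhost:9080 as in the curl commands above. `StreamTimer`, `buildRequest` and `format` are hypothetical names, and the wall-clock time measured here is only an approximation of curl's time_total.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;
import java.util.Locale;

class StreamTimer {

    // Builds a body-less PATCH request; a non-null limit maps to the ?limit=x query parameter.
    static HttpRequest buildRequest(String baseUrl, String path, Long limit) {
        String query = (limit == null) ? "" : "?limit=" + limit;
        return HttpRequest.newBuilder(URI.create(baseUrl + path + query))
                .method("PATCH", HttpRequest.BodyPublishers.noBody())
                .build();
    }

    // Formats the elapsed time in the same style as the curl -w option used above.
    static String format(Duration elapsed) {
        return String.format(Locale.ROOT, "Response Time: %.6fs",
                elapsed.toNanos() / 1_000_000_000.0);
    }

    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        // No request timeout is set, because streaming all records can take minutes.
        HttpRequest request = buildRequest("http://localhost:9080", "/api/customers/stream", null);

        long start = System.nanoTime();
        HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
        Duration elapsed = Duration.ofNanos(System.nanoTime() - start);

        System.out.println("HTTP " + response.statusCode());
        System.out.println(format(elapsed));
    }
}
```

Changing the base URL to http://localhost:9081 targets streamer-data-r2dbc, and the path /api/customers/stream-naive selects the naive streamer-data-jpa endpoint.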
Sample
A simulation sample running the applications with Maven and using the JConsole tool
- streamer-data-jpa
  Naive implementation
  Response Time: 414.486126s
  Better implementation
  Response Time: 453.692525s
- streamer-data-r2dbc
  Response Time: 476.951654s
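To put the three response times side by side, they can be converted into throughput and into a relative difference against the naive run. This is a small arithmetic sketch, assuming each run streamed the full 1 million records loaded in the Start Environment step; `SampleComparison` and its methods are hypothetical names.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

class SampleComparison {

    // Assumption: each run streamed the full 1 million customer records.
    static final long RECORDS = 1_000_000L;

    // Average number of records streamed per second for a run of the given duration.
    static double recordsPerSecond(double seconds) {
        return RECORDS / seconds;
    }

    // Relative slowdown of `seconds` versus `baselineSeconds`, in percent.
    static double percentSlower(double seconds, double baselineSeconds) {
        return (seconds / baselineSeconds - 1.0) * 100.0;
    }

    public static void main(String[] args) {
        // Response times copied from the sample above.
        Map<String, Double> runs = new LinkedHashMap<>();
        runs.put("jpa naive", 414.486126);
        runs.put("jpa stream", 453.692525);
        runs.put("r2dbc", 476.951654);

        double baseline = runs.get("jpa naive");
        runs.forEach((name, seconds) -> System.out.printf(Locale.ROOT,
                "%-10s %8.1f records/s  %+5.1f%% vs naive%n",
                name, recordsPerSecond(seconds), percentSlower(seconds, baseline)));
    }
}
```

These figures come from a single sample, so they are indicative only. Response time also says nothing about memory or CPU usage, which is what the monitoring tools above are meant to show.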
Useful commands & links
- Kafdrop
  Kafdrop can be accessed at http://localhost:9000
- MySQL monitor
  To check data in the customerdb database
  docker exec -it -e MYSQL_PWD=secret mysql mysql -uroot --database customerdb
  SELECT count(*) FROM customer;
  To create a dump from the customer table in the customerdb database, make sure you are in the spring-data-jpa-r2dbc-mysql-stream-million-records root folder and run
  ./dump-mysql-db.sh
Shutdown
- To stop streamer-data-jpa and streamer-data-r2dbc, go to the terminals where they are running and press Ctrl+C
- To stop and remove the docker-compose containers, network and volumes, go to a terminal and, inside the spring-data-jpa-r2dbc-mysql-stream-million-records root folder, run the command below
  docker-compose down -v
Cleanup
To remove all Docker images created by this project, go to a terminal and, inside the spring-data-jpa-r2dbc-mysql-stream-million-records root folder, run the following script
./remove-docker-images.sh