Interesting. These tests could be extended to compare Hotspot VM, Generational ZGC and a few other switches. Can you make a video of your entire testing setup (focusing on docker, kubernetes, prometheus and grafana) from scratch? I think it's totally worth it.
Nice comparison! Though I wouldn't ever compare Go vs Java native for long-running apps like this one. It is true that Java's metrics at startup would be much (much) worse, but bear in mind that Java was built treating startup as an edge case, and the JVM does a lot of optimisations while the program is running. It would be interesting to see if Java is able to beat Go in the long run. In the short run I think there is no possible discussion, and Java Native is just a workaround.
C# vs Go would be amazing to see. Maybe add to test 2 some simple reads from the database, and maybe add test 3 with some simple data structure or general purpose calculations to see how well each language performs. Amazing content. I am currently writing a high performance C# application for the government with .Net 8 and it is incredibly fast. I wonder if .net 8 has been improved so much that might even beat Java at this point.
Can you elaborate on using simple data structures or general-purpose calculations to evaluate how well each language performs? I don't really want to run fibonacci anymore lol
@@AntonPutra You can try to implement 3 different types of algorithms (in addition to the 2 tests you already did in your previous video).
1 - Searching algorithms (linear search). Ex: create a List of 1 million Person objects {Id, Name}. Id must be a unique integer from 1 to 1,000,000; Name is a randomly generated string. Test: generate a random Int value from 1 to 1 million, find the Person object by Id and get its Name, then find the object in the list by Name and validate that both Ids match.
2 - Sorting algorithms: sort the entire list of 1 million Person objects by name.
3 - File I/O operations: generate a random Int value from 1 to 1 million, find the Person in the List, and write the Person's name to the first line of a file; if the file already contains a line, replace it with the new name.
Leverage ChatGPT to create the code for you in both languages. Just some ideas, lol
really nice approach to monitoring performance. can you make a similar video but with java profiling tool to detect which specific part of the code must be reworked?
I work on both Java and Go, and your results are similar to my observations. Java consumes memory due to too much autoconfiguration, which involves a hell of a lot of classes, plus some JDKs had garbage collection issues. But if you develop an enterprise-ready application in Go with distributed tracing, logging, metrics, write-heavy database operations etc., their performance is almost equivalent. I had to manually write all those functionalities in Go due to the lack of autoconfiguration and libraries
Very good point. Java frameworks like Quarkus are doing a lot to make large scale application development easier. All of that stuff it's doing will affect runtime performance.
Small improvement suggestion: In Java you could use a record instead of a class. This shouldn't have a big impact on the test results but at least spares you a bit of typing. It would have been great to include a base Java and a Spring/Spring Boot comparison deployed into a java21 image container here as well, just to see how much of an impact Quarkus and the native container optimization really have. So far I couldn't convince anyone at our company to try out Quarkus. Just a question out of interest: Are you going to create a benchmark framework where similar tasks are done by various language implementations and then release your findings to the public? I just stumbled across another video and then this one was recommended to me, and to me it looks like your videos are basically doing that, just with a smaller and more comprehensible scope. So a combination of runtime analysis of different languages for various tasks would definitely be helpful, I guess
It would be interesting to see what happens if you push requests to the limits and how high those limits are. Additionally, Java can be built to a native image with Spring Boot as well. It's sometimes not that smooth, though, but honestly I expect it to perform better with Spring Boot.
Idk, native image crashes randomly and has lower performance than JIT-compiled code atm. It's good only (if it doesn't crash) for low-traffic applications on serverless.
I have the same limits for both: github.com/antonputra/tutorials/blob/main/lessons/201/deploy/java-app/deployment.yaml#L27-L33, and I run them on dedicated nodes using the ESXi Hypervisor.
@@AntonPutra I do understand, what I wanted to say is what happens if you push client requests higher and higher. The load seemed to be not that high, so the light load conditions were tested but what would happen under high load? It can be really detrimental in real world.
This is all good, checking how well it performs, but if it's not throttling, anything is fine as long as client latency is not out of whack. I think some stress testing will also give a good comparison, like you did with Node and Go
Not always true. Paying for each CPU cycle in the cloud, you can easily get out of budget at scale. That is why optimization and algorithm knowledge is valuable again: it helps to save money.
Great job! But a few comments: Spring supports building native images as well, and they have maven/gradle plugins and a dedicated project, Spring Native, for this case. Actually, we are using it in production and building most of the Spring apps as native images. Summary: Go is faster than JVM-based stuff, well, no surprise here :) In general, Quarkus doesn't give anything interesting compared to Spring, it's just a bit more modern and doesn't have much legacy stuff. What might be interesting to look at in this regard is Micronaut, because it is a fundamentally different framework (compile-time, and it supports native images out of the box, in comparison with runtime Spring with additional projects and layers for native support). Most likely Micronaut will show numbers similar to Go.
Thanks! There is a very small difference in terms of scalability; both are small with a fast startup time. I think Go is a little more efficient, so potentially you would need fewer compute resources.
@@AntonPutra the way Java uses memory with GraalVM is very smart; it's like observing the needs, then optimising the RAM usage. This could suggest that we could provision the Java container in a smaller pod in terms of RAM. My concern is: how well does Java handle random peaks? If we have 200 req/s, then right after the RAM stabilises we suddenly get 500 req/s, how well does Java handle that peak? Is Java going to panic and ask for wayyyy more memory than it actually needs? If this is the case, then the Java app may actually crash for insufficient container memory. Does it make any sense, what I've just said?
@@AntonPutra I was thinking about CPU and memory benchmarks on the NODE, i.e. what Kubernetes vs K3s eats of the Node performance. Otherwise, I just discovered the ClickHouse and meilisearch databases, it seems really good. (sorry for my English)
Would be good to see a native build test with GraalVM in comparison. Furthermore, Quarkus can use Mutiny as a reactive framework; maybe this would bring the two closer together as well.
I do think you should do some kind of load testing on the cheap 5$ instances. For example how many requests these cheap vps can handle before they crash, using golang, rust, php etc.
thanks! it's just open source and i actually have dedicated youtube tutorials how to measure, cpu/memory/vpc etc.. here is a dashboard and promql queries for this specific video - github.com/antonputra/tutorials/blob/main/lessons/201/dashboard.json java metrics - github.com/antonputra/tutorials/blob/main/lessons/201/java-app/src/main/java/com/antonputra/ImageResource.java#L51-L59 golang metrics - github.com/antonputra/tutorials/blob/main/lessons/201/go-app/metrics.go#L13-L27
@@LawZist use summary in edge cases when you have a single instance of the app and you can only scale vertically, because summaries can't be aggregated over multiple instances, for example to get the p90 percentile for 5 replicas of your app. With a summary, the Prometheus client computes p90 itself. Use histogram in all other cases th-cam.com/video/WUBjlJzI2a0/w-d-xo.html th-cam.com/video/VjFFzGFyVlY/w-d-xo.html th-cam.com/video/dMca4jHaft8/w-d-xo.html th-cam.com/video/ff_XHm96PKQ/w-d-xo.html
JIT optimizations are taken out in Quarkus GraalVM builds for obvious reasons. While the benefits of being lower level from Graal are great, JIT optimizations are not to be underestimated, and they start to trigger later in the execution, so they will be less visible at first
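To make the aggregation point concrete, here is a simplified Go sketch of why histogram buckets can be combined across replicas while per-instance percentiles cannot simply be averaged. It is not Prometheus code: the bucket bounds, the sample counts, and the function names are made up for illustration, the buckets are non-cumulative for simplicity (Prometheus exposes cumulative ones), and the quantile is estimated by linear interpolation inside the matching bucket.

```go
package main

import "fmt"

// bounds are the upper bucket bounds in seconds, shared by all replicas (illustrative values).
var bounds = []float64{0.05, 0.1, 0.25, 0.5, 1.0}

// mergeBuckets sums per-bucket counts across replicas.
// Each replica slice is assumed to have one count per entry in bounds.
func mergeBuckets(replicas [][]uint64) []uint64 {
	merged := make([]uint64, len(bounds))
	for _, r := range replicas {
		for i, c := range r {
			merged[i] += c
		}
	}
	return merged
}

// quantile estimates the q-quantile (0 <= q <= 1) from bucket counts,
// interpolating linearly inside the bucket that contains the target rank.
func quantile(q float64, counts []uint64) float64 {
	var total uint64
	for _, c := range counts {
		total += c
	}
	if total == 0 {
		return 0
	}
	rank := q * float64(total)
	seen := 0.0
	lower := 0.0
	for i, c := range counts {
		if c > 0 && seen+float64(c) >= rank {
			return lower + (bounds[i]-lower)*(rank-seen)/float64(c)
		}
		seen += float64(c)
		lower = bounds[i]
	}
	return bounds[len(bounds)-1]
}

func main() {
	replicaA := []uint64{90, 10, 0, 0, 0}  // mostly fast requests
	replicaB := []uint64{0, 0, 10, 40, 50} // mostly slow requests
	merged := mergeBuckets([][]uint64{replicaA, replicaB})

	p90A := quantile(0.9, replicaA)
	p90B := quantile(0.9, replicaB)
	fmt.Printf("p90 replica A: %.3fs\n", p90A)
	fmt.Printf("p90 replica B: %.3fs\n", p90B)
	fmt.Printf("average of the two p90s: %.3fs\n", (p90A+p90B)/2)
	fmt.Printf("p90 from merged buckets: %.3fs\n", quantile(0.9, merged))
}
```

The last two printed values differ because an average of percentiles is not the percentile of the combined traffic. A summary only exports the per-instance quantile, so the merged view cannot be rebuilt from it.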
Java runtimes were historically designed to consume the resources of the whole VM, so maybe you can compare a Java app running on a JVM (not a native image but a HotSpot JVM) on a VM with 4 cores and 4 GB RAM vs a Go app running on Kubernetes using that same VM
It would be interesting if you compare go vs java non native, as non native should have better performance than native. You compile java to native only if you are building a CLI or a lambda, when you need fast startup.
It would be good to change a bit what the application is doing. In our company we have a piece of code that is meant to validate that we don't have any delays in the network stack. To do so we tell the app to generate 1000 random bytes and send them to the client. That way nothing is cached.
i have couple... How to Monitor/Instrument Golang with Prometheus (Counter - Gauge - Histogram - Summary) - th-cam.com/video/WUBjlJzI2a0/w-d-xo.html OpenTelemetry Golang Tutorial (Tracing in Grafana & Kubernetes & Tempo) - th-cam.com/video/ZIN7H00ulQw/w-d-xo.html
Java native images give slower performance at runtime than normal jars because of the lack of hotspot optimizations at runtime. To achieve a similar performance it should be optimized through a previous profiling process.
Well, when you deploy to Kubernetes, you have cgroups and other constraints that could affect performance. But as soon as I find a use case where Java performs better, I'll make an updated video, maybe something like a Kafka consumer/producer data pipeline. I'll see.
► What should I test next?
► AWS is expensive - Infra Support Fund: buymeacoffee.com/antonputra
► Benchmarks: th-cam.com/play/PLiMWaCMwGJXmcDLvMQeORJ-j_jayKaLVn.html&si=p-UOaVM_6_SFx52H
Are you sure that both apps are using the same number of connections in the pool? The connection pool often accounts for most of the performance diff in this kind of benchmark. Quarkus defaults to 20 concurrent connections, and pgpool to 4 or runtime.NumCPU() from what I have read.
Have you checked performance with more than 20 connections in a pool?
@@ooijaz6063 I used the defaults, but for the next tests, I'll double-check how many connections are actually opened on the PostgreSQL side.
Perhaps you could also try helidon SE instead of quarkus. Helidon was built from the ground up by Oracle labs to use the latest java tech like virtual threads to serve requests
@@111segasonic thanks i'll try it out
I’d love to see Rust thrown into the mix as well!
Wow, this is really good. The setup (kubernetes cluster, prometheus, grafana ...) deserves another video.
Thanks! Just in case, the source code with all of these components is in my GitHub: github.com/antonputra/tutorials/tree/main/lessons/201/monitoring.
Hey, can you make a video on how to setup this in local? May be with k8s supplied with docker desktop if relevant?
@@rajivkumar-ub6uj i think so, the easiest way is just to package everything as docker compose or perhaps just use a local minikube cluster. i'll think about it
@@AntonPutra yes, compose is the best way for larger audience. Would appreciate if you can share the compose config for this, thanks in advance
@@rajivkumar-ub6uj ok
A benchmark must be like this. State of art. Good job!
❤
What does state of the art mean?
@@ChengPhansivang i guess something that people can relate to :)
Please do Java Spring Boot (Native) vs Spring Boot (JDK) VS Quarkus (Native) vs Quarkus (JDK)
noted!
And add Micronaut (Native & JDK) to this chain, plz
I am go fanboy but I really like applications written in Quarkus. My first language was Java and it is mind-blowing how fast and light Quarkus feels compared to Spring
some people say it is slower than jvm based, I'll see if I can test it
You would be surprised how far Spring has come. With that being said, for long-running apps, Spring Boot without AOT compilation would remain faster thanks to the JIT compiler. Quarkus is really only good if you need fast startup or low RAM consumption
@@lufenmartofilia5804 good point
@@lufenmartofilia5804 will test, when you say long running, how long?
@AntonPutra long running is at least 10,000 tx before you start measuring. In the real world, weeks or months...
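The warm-up idea above can be sketched as a tiny harness that runs the operation a fixed number of times before any timing starts. This is a hedged illustration in Go, not the benchmark's actual client: `measureAfterWarmup` and the sample operation are hypothetical, and in the JVM case the warm-up would be real requests sent to the service so the JIT compiler has time to optimise hot paths.

```go
package main

import (
	"fmt"
	"time"
)

// measureAfterWarmup calls op warmup times without timing it, then times
// `measured` further calls and returns the mean duration per call.
// It returns 0 when measured is not positive.
func measureAfterWarmup(op func(), warmup, measured int) time.Duration {
	if measured <= 0 {
		return 0
	}
	for i := 0; i < warmup; i++ {
		op() // results are discarded: this phase only warms things up
	}
	start := time.Now()
	for i := 0; i < measured; i++ {
		op()
	}
	return time.Since(start) / time.Duration(measured)
}

func main() {
	// Stand-in operation; a real test would issue a request to the app.
	op := func() { _ = fmt.Sprintf("%d", 42) }

	// 10,000 warm-up calls, matching the threshold suggested in the comment.
	mean := measureAfterWarmup(op, 10_000, 1_000)
	fmt.Println("mean per op after warm-up:", mean)
}
```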
Finally! A detailed comparison that just doesn’t test the /hello-world endpoint
haha, thanks!
This second test scenario is absolute perfection in testing real world applications. It's easy to get excited about a performance difference of like 400% (for example) in a synthetic benchmark, but by including database, storage and (de)serializing, it gives a much more nuanced picture of how it would actually scale and perform. In this case I would say both applications performed well and comparable. I'd be interested in a bit of a deeper dive in these applications by including opentelemetry and seeing what functions might bottleneck.
Thanks. Well, in some tests, I used OpenTelemetry clients with this Prometheus client in both Go and Java. I'm wondering what else you would instrument besides these function calls to S3 and the database. I might include it in the following videos.
example - github.com/antonputra/tutorials/blob/main/lessons/201/go-app/images.go#L50-L62
@@AntonPutra Must have missed that detail, very well done and thanks for the reply!
@@TweakMDS thanks!
But it doesn't do that much. The program doesn't change any data. It just uploads it.
@@GBXS I'm thinking about adding an additional test with Kafka consumer/producer and perhaps a simple ETL pipeline. Any suggestions?
This is definitely the best DevOps channel.
❤
Love these benchmark videos, nice work
thank you! :)
Love these benchmark videos, your work is amazing!
❤️
Interesting comparison, BUT:
- the first tests does not test the startup time itself (should be
I try to improve each time I create benchmarks. Next time, I will definitely use the v2 Go SDK and apply some other recommendations from your side. Thank you for taking the time to leave this feedback.
@@AntonPutra glad I could help and you haven't taken it as a personal attack or something :D
I really value your videos and open source code for every video!
Looking forward to seeing more of them.
@@DillPL thank you! i actually implemented your suggested idea in the new video and reduced the size by 6 mb (45 -> 39) :) - th-cam.com/video/56TUfwejKfo/w-d-xo.html
will try other tips next as well and finally update that sdk lol
Out of the whole video, I got the most out of the percentiles part. You cleared up so much
cool
I really admire the effort you put into describing why you chose your testing methodology as well as the testing itself
Bro, you're awesome: nothing extra, all to the point, quality and bitrate are top-notch, the video looks great, respect!
thank you ❤️
First of all, this is the best content on youtube so far.
Well done. Thank you!
thank you! :)
this is so professional!
I love it!
please do bun vs deno v2
since deno has gotten npm compatibility , the only difference now between bun and deno (aside from being written in zig and rust) is the speed (I think , both have gotten very nice std library)
please do a benchmark comparing everything!
Excellent video, thank you so much!
Do you think you could show this same benchmark comparing Quarkus and Go, pushing both to their limits as you've done in your other videos?? I'm really curious about how they both compare on performance under heavy loads. My guess is Quarkus would break first, but I wonder how big the difference is. (I'd expect both are kinda close)
Excellent work, and I'm very excited for your future videos!
please do c# vs Java, use minimal api with AOT for c# and GraalVM or whatever AOT thing Java has.
ok will do soon!
Would love to see C# vs Go
C# vs Go vs Java would be nice
any specific test scenarios? or the same
@@AntonPutra your current test scenarios are very good so I wouldn't change anything. Regarding C# I would use LTS version (dotnet 8) which is the fastest one amongst other versions according to Microsoft.
@@krzysi3k-yt ok, I'll maybe do it next
Rust - same tests
Great video, like the rest of what you do. I'm using your videos to improve my knowledge of the cloud/Kubernetes area.❤❤
thank you!❤
Great video! The benchmarks were really helpful. Keep up the great work!
thank you! will do
Nice work! The explanation around the benchmark is easy to understand and full of information there. IMHO, you should start to build your own courses on Udemy :)
thanks! maybe
Great video. Did you use the Reactive version of Java or the OS Thread version? If the Blocking version it would be interesting to see what would happen if you ran those on virtual threads. And what were the results of the reactive version.
thanks, it's been a while, but you can find a link to the source code in the description
@@AntonPutra I did. There are 2 versions and it looks like the regular hardware thread version was used. Since these workloads are all IO it would be better to use virtual threads (annotate with @RunOnVirtualThread) or the reactive version. But maybe the reactive version was used. I have always seen better performance on Quarkus than Go so these results are a bit surprising.
@@AntonPutra But your work is amazing. And I shall surely learn from it, especially the observability.
Nice job, I would like to see Test 2, but with higher RPS
Okay, I might just include additional screenshots under lesson '201' in my GitHub repo
@@AntonPutra It would be great, thank you Anton!
Please test the latest dotnet 8 vs go, thanks
ok, coming next
@@AntonPutra ensure to use Minimal APIs and compile it AOT.
Very nice video! Seems like if cost-cutting is of great concern, you'd lean towards go to keep CPU utilization down. I would love to see a similar comparison video between Java and JavaScript/Node.js.
thanks! noted
Love your videos! What tool do you use for creating those amazing animations and mounting videos?
Interestingly, in your Test scenario 2, your Quarkus app is spiking in DB latency while having constant times in between, as if the Postgres client would be idling to gather the queries (or waiting on a lock?) and send them in bursts.
If this is indeed the case it does make the results a bit harder to pull conclusions from.
yeah, I noticed it
Okay, you got me now. I will start trying prometheus and grafana. The question I have is which tools do you use for load testing? You are using word "client" for this. I assume you use some kind of tools, like jmeter, k6 or?
Great video! The test seems a bit unfair to Java, as much of the time was likely spent serializing JSON, which can vary significantly based on the library used. It would be interesting to see results with a more complex application that uses intricate data structures, as Go’s garbage collector introduces pauses and latency, while Java is known for its efficiency in these scenarios.
Now, on the topic of environments and the test setup, I have a few questions. Is it statistically valid to compare a test run on Kubernetes with minimal requests? For the comparison to hold up with few requests, each application's runtime environment would need to be exactly the same, which can be difficult to guarantee given the cluster load and where each pod is deployed.
It might actually be easier to run the test on your local machine, where you can control the resource allocation. However, I noticed you’re using a Mac, so factors like virtualization might make the results diverge even further from what you’d see in a production environment
The explanation why java reduces memory usage is pretty simple: gc
Go has gc too...
The explanation is because java uses a fixed heap. Normal java reserves the memory from the system upfront and you will see no change for all the run.
The Quarkus optimizations makes the internal HEAP metrics visible to K8s. But the particularities of java are still visible in those behaviours as defaulting to reserve a lot of memory upfront.
@@framegrace1 the Java GC can differ between JVM implementations, but basically it works on a simple principle: the JVM performs GC when it sees that the heap is overused, based on the heap limit.
So, in this case the JVM application started and used some heap. The heap usage hadn't reached the GC limit, so there was no need to perform GC. When traffic comes to the JVM application, it increases the number of created objects and, as a consequence, the heap usage. And when the heap usage limit is reached, the JVM performs GC and all the objects created at application start are deleted.
I don't know how the Go GC works; it looks like it has a partially different implementation.
There is a non-blocking Netty server implemented with Spring Reactive Web, which is more efficient.
For the database approach, use R2DBC, the reactive non-blocking data repository.
btw Spring also supports GraalVM and it is not outdated.
ok thanks, it's not outdated just it's been around for a long time
Quarkus uses non-blocking netty
great demo. as a Java dev it hurts seeing Java losing even with the Quarkus native build 😢😢
I'll make some more with improved Java soon
Seems like Java 21 was used but virtual threads weren't used for the Quarkus application. Wasn't that the whole point of using a newer Java version with the performance improvements and non-blocking reactive APIs?
yeah, i used java 21. I'll make sure to test virtual threads next time, maybe try to compare different java frameworks as well
@@AntonPutra hey, thanks for making this video.
I just needed to point out that the code looks to be written in an older/traditional style, even though Quarkus has annotations that take care of traditional blocking calls, something that modern programming languages like Go probably already handle under the hood.
Great detailed video as always!
Maybe I'll try this out on my local machine to test out too!
@@henryong7788 I'll soon be comparing Quarkus with Spring Boot, and I'll make sure to use the latest language features.
Virtual Threads are not better in performance compared to fully reactive code.
Quarkus fully reactive or Spring fully reactive will always beat virtual threads, both in performance and resource usage.
@@EricSouzarys good to know thanks
What about micronaut? Would love some benchmarks on this 😊
ok, i'll take a look, i'll get back to java soon
Great video! Python vs Node plz with the same scenario :)
Loved the video, subscribed!
thanks!!
Amazing video, great job !!
thank you!
thanks for sharing.. can you do it with nodejs :P?
hey good test! can you test with go-chi instead of Fiber? go-chi is more optimized in terms of memory usage, so that might explain why Java was using less memory in that first test.
Overall, good video! Keep it up
Thank you! I used Chi for one of my projects, but I think memory usage doesn’t play a major role in the user experience, such as client latency etc..
I've been seeing these videos for a while and all I see is my railway bills
😂 i have some aws credit
Love these benchmarks! 🎉
thanks! i try to add some extra
Nice video. One comment. Scaling up/down means increasing or decreasing the machine resources like CPU and memory. Scaling in/out is horizontal scaling ;)
Like always you rock, can you make a video about database architecture for production like MySql Replication Group etc, Thank you
thank you! let me see
good test after previous tries ;) but I would not emphasize memory consumption at the start of compiled java, as it does not affect anything. Also it looks like the cpu isn't doing anything, so there's no reason to seriously compare 3% with 5%. But the latency values are valuable!
PS: looking at the low cpu consumption in this test, I got an idea to test a cpu-intensive application. Try to create something like redis (a hashmap is fast, so let's use a treemap and its concurrent versions); the app will add, update and get some data, for example the count of values that are greater than the one received in a controller.
PPS: it would be interesting to see how regular java 21 works with virtual threads, but I heard that java file io on linux is synchronous and only 22 will be modern, so that could be a reason why you got these values in the current test. Also, testing regular java in a container is tricky; it's better to test different Xmx/Xms values first. I mean, starting java under a memory limit of 500mb is not the same as with 2000mb (so using compiled java removes that headache, but compiled has lower throughput and latency :) )
Thanks, I appreciate your feedback.
Interesting. These tests could be extended to compare Hotspot VM, Generational ZGC and a few other switches. Can you make a video of your entire testing setup (focusing on docker, kubernetes, prometheus and grafana) from scratch? I think it's totally worth it.
Nice comparison! Though I wouldn't ever compare Go vs Java native for long-running apps like this one. It is true that Java's metrics at startup time would be much (much) worse, but bear in mind that Java was built treating startup as an edge case, and the JVM does a lot of optimisations while the program is running. It would be interesting to see if Java is able to beat Go in the long run. In the short run I think there is no possible discussion, and Java Native is just a workaround.
Add rust and javascript to the mix. Thank you for your channel
will do, i'm thinking about webassembly vs js, what do you think?
C# vs Go would be amazing to see. Maybe add to test 2 some simple reads from the database, and maybe add test 3 with some simple data structure or general purpose calculations to see how well each language performs. Amazing content. I am currently writing a high performance C# application for the government with .Net 8 and it is incredibly fast. I wonder if .net 8 has been improved so much that might even beat Java at this point.
Can you elaborate on using simple data structures or general-purpose calculations to evaluate how well each language performs? I don't really want to run fibonacci anymore lol
@@AntonPutra You can try to implement 3 different types of algorithms (In addition to the 2 tests you already did in your previous video).
1- Searching Algorithms (linear search) - ex: Create a List of 1 million objects (Person: {Id, Name}) - Id: must be a unique integer from 1 to 1,000,000. Name: generate a random string. Populate your list with the 1 million Person objects. Test: generate a random Int value from 1 to 1 million and find the Person object (by ID) using the randomly generated Int, and get the Name; then find the object in the list (by Name), compare, and validate that both IDs match
2- Sorting Algorithms (sort the entire list of 1 million Person objects by name).
3- File I/O Operations - generate a random Int value from 1 to 1 million, find the Person in the List, write the Person's name to the first line in a file; if the file already contains a line, replace it with the new name.
Leverage ChatGPT to create the code for you in both languages. Just some ideas, lol
@@gabrielmartinez2455 thanks! i'll try it
really nice approach to monitoring performance. can you make a similar video but with java profiling tool to detect which specific part of the code must be reworked?
I work with both Java and Go, and your results are similar to my observations. Java consumes memory due to too much autoconfiguration, which involves a hell of a lot of classes, plus some JDKs had garbage collection issues. But if you develop an enterprise-ready application in Go with distributed tracing, logging, metrics, write-heavy database operations, etc., their performance is almost equivalent. I had to write all those functionalities manually in Go due to the lack of autoconfiguration and libraries
Very good point. Java frameworks like Quarkus are doing a lot to make large scale application development easier. All of that stuff it's doing will affect runtime performance.
Could you do the same test for Kotlin and Java ? Or Kotlin and Go. Please 🙏
ok let me see
@@AntonPutra Would love see a Quarkus and Kotlin benchmarks compared to Spring Boot and Kotlin
@@belkocik 🫡
Wonderful content Anton!
thank you!
Small improvement suggestion: In Java you could use a record instead of a class. This shouldn't have a big impact on the test results but at least spares you a bit of typing.
It would have been great to include a base Java and a Spring/Spring Boot comparison deployed into a java21 image container here as well just to see how much of an impact Quarkus and the native container optimization really yields. So far I couldn't convince anyone at our company to try out Quarkus.
Just a question out of interest: Are you going to create a benchmark framework where similar tasks are done by various language implementations and then release your findings to the public? I just stumbled across another video and then this one was recommended to me, and to me it looks like your videos are basically doing that, just with a smaller and more comprehensible scope. So a combination of runtime analysis of different languages for various tasks would definitely be helpful, I guess
thanks, yes i'll get to Java soon and I'll try to improve a few things
@@AntonPutra Can you try using both Spring Boot Native and Quarkus to see how much of a performance difference they have
@@MovinduLochanaWijethunge yes will do
Thanks for your video! I really like it. Could you do the same tests for Spring vs Quarkus?
thanks, will do, but first rust vs go
Would be cool to see in a future video the framework web for Kotlin called Ktor.
noted!
Woow amazing effort Man, how about Rust vs Go ?
It would be interesting to see what happens if you push requests to the limits and how high that limits are.
Additionally, the Java app can be built into a native image with Spring Boot as well. It is sometimes not that smooth, but honestly I expect it to perform better with Spring Boot.
Idk, native image crashes randomly and has lower performance than JITed code atm. It's only good (if it doesn't crash) for low-traffic applications on serverless.
@@ooijaz6063 I haven’t loaded my test app extensively but for me it worked ok and had better performance.
It may change though with strong adoption of virtual threads in next few years and servlet api will be good again.
I have the same limits for both: github.com/antonputra/tutorials/blob/main/lessons/201/deploy/java-app/deployment.yaml#L27-L33, and I run them on dedicated nodes using the ESXi Hypervisor.
@@AntonPutra I do understand, what I wanted to say is what happens if you push client requests higher and higher. The load seemed to be not that high, so the light load conditions were tested but what would happen under high load? It can be really detrimental in real world.
a good comparison indeed.
only one thing I'd want to look into further:
how does the same test behave at high throughput, like 500 / 1000+ req/s
Thanks, I may include screenshots or just improve my tests in the future.
Very informative!
How about comparing performance of java vs python stream processors in Apache Flink?
thanks, yes, i was thinking about spark/flink and different apis: python, java, scala, etc.
This is all good, checking how well it performs, but if it's not throttling, anything is fine as long as client latency is not out of whack.
I think some stress testing will also give a good comparison, like you did with node and go
I'll come back to java soon with improved benchmarks
"There are no solutions. There are only trade-offs" Thomas Sowell
servers are cheaper than developer time
true
Not always true. Paying for each CPU cycle in the cloud you can easily get out of budget on scale. That is why optimization and algorithm knowledge is valuable again - it helps to save money.
Great job!
But a few comments:
Spring supports building native images as well, and they have maven/gradle plugins and a dedicated project Spring Native for this case. Actually, we are using it in production and building most of the Spring apps in native images.
Summary: Go is faster than JVM-based stuff, well, no surprise here :)
In general, Quarkus doesn't give anything interesting compared to Spring, it's just a bit more modern and doesn't have much legacy stuff.
What might be interesting to look at in this regard is Micronaut, because it is a fundamentally different framework (compile-time, and it supports native images out of the box, in comparison with runtime-based Spring with additional projects and layers for native support). Most likely Micronaut will show numbers similar to Go.
thank you for your feedback. i'll get back to the java world soon, maybe next week, and make a few improvements
this is very neat, i love it
thank you!
i like this working. You are so nice!!
thank you!
It'd be interesting to compare Hotspot (various GCs) vs GraalVM (Quarkus, Spring Boot)
ok let me see
It would be interesting to test long term throughput in this comparison.
@@terribleprogrammer how long? day, 2, a week?
@@AntonPutra one week would be interesting. You can also mix up jvm, graalvm and go Lang in a single video
@@terribleprogrammer ok, i'll see if it makes any difference and if it does i'll make something
Very Nice! great analysis
thank you!!
Enjoyable video. Subscribed.
thank you! more to come
Great work ! What about C++ vs GO ?
will do! :) any specific frameworks on c++?
@@AntonPutra drogon framework is very very fast and well written
great video!!
thank you!
Hi. Nice video indeed. Can you explain why Java uses significantly less memory under load than when idle?
nice comparison, this could work great on a batch.
how is it going to compare on an app, that has peaks during a specific time of the day?
Thanks! There is a very small difference in terms of scalability; both are small with a fast startup time. I think Go is a little more efficient, so potentially you would need fewer compute resources.
@@AntonPutra the way Java uses memory with GraalVM is very smart; it is like observing the needs, then optimising the RAM usage.
This could suggest that we could provision the Java container in a smaller pod in terms of RAM.
My concern is: how well does Java handle random peaks?
If we have 200 req/s, then right after the RAM stabilises we suddenly get 500 req/s, how well does Java handle that peak?
Is Java going to panic and ask for wayyyy more memory than it actually needs?
If this is the case, then the Java app may actually crash due to insufficient container memory.
Does it make any sense, what i've just said?
@@ionutale1950 yes, it does. i'll try to configure the client next time to simulate such spikes when I compare spring boot with quarkus
It's great if you can benchmark framework from bun runtime like Hono and ElysiaJS
ok noted!
Hi, Nice job, thank you.
idea for next benchmark test : Kubernetes vs K3s
They are platforms, so how would you like to compare them? If I have some nodes & VMs then I will stick to K8s, otherwise K3s.
ok, I'll see if it makes sense. I'll create some benchmarks or maybe just make comparisons.
@@AntonPutra I was thinking about CPU and memory benchmarks on the NODE, i.e. what Kubernetes vs K3s eats of the Node performance.
Otherwise, I just discovered the ClickHouse and meilisearch databases, it seems really good. (sorry for my English)
@@picatchumm64 ok, got it, basically infrastructure test, how well both can handle load etc, and which one is more efficient/cost effective
Would be good to see a native build test with GraalVM in comparison. Furthermore, Quarkus can use Mutiny as a reactive framework; maybe this would bring the two closer together as well.
ok noted!
I do think you should do some kind of load testing on the cheap 5$ instances. For example how many requests these cheap vps can handle before they crash, using golang, rust, php etc.
Could you try GraalVM next?
yes soon
Great Benchmark! can you share the promQL for the metrics? is it some plugin or you wrote it by yourself? thanks
thanks! it's just open source and i actually have dedicated youtube tutorials how to measure, cpu/memory/vpc etc..
here is a dashboard and promql queries for this specific video - github.com/antonputra/tutorials/blob/main/lessons/201/dashboard.json
java metrics - github.com/antonputra/tutorials/blob/main/lessons/201/java-app/src/main/java/com/antonputra/ImageResource.java#L51-L59
golang metrics - github.com/antonputra/tutorials/blob/main/lessons/201/go-app/metrics.go#L13-L27
@@AntonPutra is there any reason to prefer summary over histogram? And can you please share the link for your measure tutorials? Thanks a lot!
@@LawZist use a summary in edge cases, when you have a single instance of the app and can only scale vertically, because summaries cannot be aggregated over multiple instances, for example to get the p90 percentile for 5 replicas of your app. With a summary, Prometheus computes p90 on the client itself. Use a histogram in all other cases
th-cam.com/video/WUBjlJzI2a0/w-d-xo.html
th-cam.com/video/VjFFzGFyVlY/w-d-xo.html
th-cam.com/video/dMca4jHaft8/w-d-xo.html
th-cam.com/video/ff_XHm96PKQ/w-d-xo.html
@@AntonPutra thanks!!
Why is Java's memory usage high when it is idle? Also, will it go high again when it is idle after processing requests?
JIT optimizations are taken out in Quarkus GraalVM builds for obvious reasons. While the benefits of being lower level from Graal are great, JIT optimizations are not to be underestimated, and they start to kick in later in the execution, so they will be less visible at first
ok, i was thinking of comparing them directly
this is a great video! tnx!
my pleasure!!
Java runtimes were historically designed to consume the resources of the whole VM, so maybe you can compare a Java app running on a JVM (not a native image but a Hotspot JVM) on a VM with 4 cores and 4 GB RAM vs a Go app running on Kubernetes using that same VM
It would be interesting to see a test to failure, who and under what load will start throttling
yes will do with improved java next time
Perfect work 👍
thank you!❤️
Would be nice to see Go (Fiber) vs Bun (Elysia)
ok noted!
It would be interesting if you compare go vs java non native, as non native should have better performance than native. You compile java to native only if you are building a CLI or a lambda, when you need fast startup.
ok noted!
Please make a comparison for Java Vert.x vs Golang Fiber
added!
i wanna see spring boot native image vs Quarkus vs go
spring native is a framework like Quarkus, so it'd be nice to compare these 2 frameworks
ok noted!
Thank you
my pleasure!
can you please do a GO vs node.js Lambda testing? with cold start time, memory usage and other metrics
ok i already have some lambda benchmarks in that playlist but i'll refresh it soon
It would be good to change a bit what the application is doing. In our company we have a piece of code that is meant to validate that we don't have any delays in the network stack. To do so, we tell the app to generate 1000 random bytes and send them to the client. That way nothing is cached.
Please make a tutorial on Golang.
i have couple...
How to Monitor/Instrument Golang with Prometheus (Counter - Gauge - Histogram - Summary) - th-cam.com/video/WUBjlJzI2a0/w-d-xo.html
OpenTelemetry Golang Tutorial (Tracing in Grafana & Kubernetes & Tempo) - th-cam.com/video/ZIN7H00ulQw/w-d-xo.html
Java native images give slower performance at runtime than normal jars because of the lack of hotspot optimizations at runtime. To achieve a similar performance it should be optimized through a previous profiling process.
Thanks for the feedback, someone already mentioned that. I'll run some tests in the near future
I would also compare compile time of quarkus native and go executables....
ok noted
I ran some tests a while ago just benchmarking algorithms with different languages. To my surprise Java always ran them faster than Go
Well, when you deploy to Kubernetes, you have cgroups and other constraints that could affect performance. But as soon as I find a use case where Java performs better, I'll make an updated video, maybe something like a Kafka consumer/producer data pipeline. I'll see.
I subscribed...❤
thank you!
It would be interesting to add C, Rust, NodeJS and Python to the mix.
c# vs java would be nice :) as they both use byte code, JIT and GC :)
I would like a test that includes all programming languages up to now and allows for ranking them.
there are a lot of variables, especially in the cloud, noisy neighbors etc., so it would be hard to compare all of them...
Is this using Oracle GraalVM or GraalVM Community Edition? I'm very interested in comparing Oracle GraalVM (AOT and JVM-based) vs Golang
it should be the community version. i'll make some more java content in the near future.