► What should I test next?
► AWS is expensive - Infra Support Fund: buymeacoffee.com/antonputra
► Benchmarks: th-cam.com/play/PLiMWaCMwGJXmcDLvMQeORJ-j_jayKaLVn.html&si=p-UOaVM_6_SFx52H
Elixir vs Go
@@qizhang5749 in the pipeline!
C# vs Java - this will be interesting. Please keep in mind that C# can be compiled as a fully native app (via AOT) and PGO mode can be set to full. It would be interesting to see how much of a difference it can make.
@@Autystyczny noted!
This vid is 🔥🔥! Thank you , You're the GOAT! 😎💯 Just subbed and can't wait for more! 🙌
Great work (as always), Anton. Your visual explanations are top notch.
thank you!
This video is just a goldmine of information, I love that you added some solid background, which - to be honest - could have been a separate video
thank you! :)
finally some good non micro benchmarks that test realistic use of the languages. elixir vs go would be interesting to see
thanks! will do elixir as well soon
This is great insight... CPU usage remains low but end users are seeing massive spikes in latency. I know in the past I've made the mistake of treating low resource usage as equating to faster processing or higher throughput. In reality it's just one piece of the puzzle. Great video
Amazing work on this video. This is super useful data and answers a lot of questions I've had about Rust performance, and I commend you even if this video doesn't get millions and millions of views.
thanks! actually i got a lot of tips on how to improve the Rust code, i'll retest and release results in a few days
@@AntonPutra I'm looking forward to this
Thanks for the great content! I noticed around 1:38 that the calculation for availability was mentioned as the ratio of failed requests to total requests. I believe it should actually be the ratio of successful requests to total requests (availability = successful requests / total requests). Just thought I'd share this in case it helps clarify things for others. Keep up the great work!
thanks for the feedback, actually i have the correct query, i just misspoke. In the query i use
status!~"[4-5].*" to filter out failed requests
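For anyone wiring up a similar dashboard, here is a minimal Go sketch of how a load generator can label requests by HTTP status so a filter like the one above works. The metric name, port, and URL are hypothetical; this is not the code from the repo.

```go
package main

import (
	"net/http"
	"strconv"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promhttp"
)

// Hypothetical counter: one time series per HTTP status code observed by the client.
var requestsTotal = prometheus.NewCounterVec(
	prometheus.CounterOpts{
		Name: "client_requests_total",
		Help: "Requests issued by the load generator, labeled by HTTP status.",
	},
	[]string{"status"},
)

func probe(client *http.Client, url string) {
	resp, err := client.Get(url)
	if err != nil {
		return // a real client would also count transport-level failures
	}
	defer resp.Body.Close()
	requestsTotal.WithLabelValues(strconv.Itoa(resp.StatusCode)).Inc()
}

func main() {
	prometheus.MustRegister(requestsTotal)
	probe(http.DefaultClient, "http://localhost:8080/healthz")

	// Availability = successful requests / total requests; with numeric status
	// labels a dashboard can drop failures with a filter like status!~"[4-5].*":
	//   sum(rate(client_requests_total{status!~"[4-5].*"}[1m]))
	//     / sum(rate(client_requests_total[1m]))
	http.Handle("/metrics", promhttp.Handler())
	http.ListenAndServe(":9090", nil)
}
```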
this is a great performance test.
well done
thank you!
Thanks for making this video. I too was surprised by the results: why do they show Rust so much slower (latency) and why does it consume more CPU and fluctuating level of CPU (there is no GC to add periodic overhead)?
I took the code and ran test 1 (static get devices) locally (M1 Mac Studio) to see if I could repro your results. I found something different. Both Go and Rust performed incredibly similarly: about 570us per request... But I was driving it with 50 concurrent threads achieving about 27k requests per second. I get similar results at 100 concurrent threads (50k requests/s).
Further, with these tests I see Rust taking 55% CPU / 9MB RAM consistently, whilst Go is fluctuating 130-180% CPU / 20 MB RAM during the run.
My results are not directly comparable with yours (no network between the client and server, different machine, etc)... but the relative performance is comparable. I see latency parity between the two languages. This is what I would expect for such a simple test... returning static data, both compiled and optimised languages. As expected I see Go using more memory and a little more CPU to cover off its overheads like GC.
Can you double check your Prometheus graphs to make sure the CPU usage graphs are the right way around?
I can't come up with a suggestion for the differing latency figures... Any ideas Anton?
thank you for taking the time! The issue was the client running in a Kubernetes pod (job). I have improved the test design and can now achieve ~20k RPS in Kubernetes with 2 CPU and 256Mi memory. It does not mean the previous tests were incorrect, it's just that the client wasn't able to generate more load
Also, someone raised a PR to improve the Rust code, so I will retest and, if it makes a difference, release the results - github.com/antonputra/tutorials/tree/main/lessons/203/rust-app-v2
@@AntonPutra Thanks for replying. I look forward to seeing the updated results.
Coming back to the CPU graph. I saw from your video today that all three Go frameworks generated load fluctuations. Your graphs for test1 in this video show Go with almost completely level/smooth CPU and Rust with quite variable CPU usage. Can you confirm that the graph key is the right way around...
That said, I am super impressed with the performance of Go in these tests.
@@FrankTaylorLieder I'll double-check. This time, I generated the load from 20 different pods in K8s instead of just one. Also, I added random sleep in the Go client, which could potentially cause spikes. When I reduced the random sleep time interval from 1000ms to 40ms, the graph looked much smoother.
Old go client - github.com/antonputra/tutorials/blob/main/lessons/201/client/request.go#L16-L18
Updated Rust code - github.com/antonputra/tutorials/tree/main/lessons/203/rust-app-v2
Thank you for the double check
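To illustrate the pacing change described above, here is a rough Go sketch of a load-generating worker that sleeps a random amount of time between requests. The function names, URL, and intervals are illustrative assumptions, not the linked client code.

```go
package main

import (
	"math/rand"
	"net/http"
	"time"
)

// worker fires requests in a loop, pausing a random amount of time in between.
// A large maxSleep (e.g. up to 1000ms) makes the aggregate request rate bursty,
// which shows up as CPU spikes on the server; shrinking it to ~40ms smooths the load.
func worker(client *http.Client, url string, maxSleep time.Duration, stop <-chan struct{}) {
	for {
		select {
		case <-stop:
			return
		default:
		}
		if resp, err := client.Get(url); err == nil {
			resp.Body.Close()
		}
		time.Sleep(time.Duration(rand.Int63n(int64(maxSleep))))
	}
}

func main() {
	stop := make(chan struct{})
	client := &http.Client{Timeout: 2 * time.Second}
	for i := 0; i < 10; i++ { // 10 concurrent clients
		go worker(client, "http://localhost:8080/healthz", 40*time.Millisecond, stop)
	}
	time.Sleep(30 * time.Second) // run the test for a while
	close(stop)
}
```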
Why can't the services handle more than 1.6k requests in Test 1 when we still have CPU/RAM resources available?
Great question. I'm wondering about that too
@@baxiry. Could you elaborate more, please? I understood that all packets reached the services via the network, but they were "dropped", i.e. HTTP 503 was returned?
Maybe there was a low timeout in the client?
@@pi3ni0 he couldn't. isn't network handled by kernel anyway?
let me know what metrics to expose next time... I'll try network, maybe packets will see..
@@AntonPutra i think it's less about metrics and more about conclusions from the benchmark
Love the comparison great to see this. Amazing work.
thank you!
PLEASE Go need to rest for the next battle 😂. Great work 🎉
haha
😄
🤣🤣🤣
No No please 😂
This is helping A LOT
Please keep it up Anton
Wow! I loved this video! Thanks!
I started to love histogram data for horizontal scaling...
thanks!
Watching these tests makes me realize that I made the best decision to go all in with Go!
Thank you, Anton!
Great work as always.
thanks, but i'm getting a lot of tips on how to improve the Rust code as well as a couple of PRs, so I'll retest and release results soon!
PR - github.com/antonputra/tutorials/pull/253
PR - github.com/antonputra/tutorials/pull/252
@@AntonPutra The same can be argued about the Go code as well. I am quite certain that if you tweak the Go code here and there, it could be a lot faster. But as for Go being a garbage collected language and going toe to toe with a non-garbage-collected language like Rust, that is very impressive.
Anyways, keep them tests coming. You actually made me revisit the thought of learning Rust in the future ;)
@@AntonPutra I'm really looking forward to a fair comparison
@@carlosm.1233 it's because his code is not optimized enough. Rust can easily beat Go if he uses good code. it's his skill issue.
Great content as always! Keep them coming!
One suggestion I recommend is comparing between cloud providers as well when doing these tests.
thanks, noted! if it makes sense, i'll use gcp as well or maybe azure
@@AntonPutra all 3 would be great
Nice video and I appreciate your approach.
Would love to see some Nim in the mix
thank you! noted!
I think irrespective of the result, the Go performance is extremely good, for quite a high level language.
It would be really interesting to see the top 3 most performant server frameworks for both Go & Rust, then put them against each other. Quite a few people are saying Axum for Rust, & FastHTTP would probably be expected to be the fastest for Go?
I like that you consider real world usage & not just simply number of requests for a hello world endpoint, much more realistic.
thanks! i'm going to release a new benchmark for Go (stdlib vs fiber vs gin) in a couple of days, then Rust (actix vs axum vs rocket), let me know if i should consider different frameworks
@@AntonPutra I will be keen to watch that one, great!
I would definitely love to see Echo & FastHTTP tested for Go at some point 😊
Excellent video yet again. Can you do golang/rust vs c++? Also, I would like to see more real-world CPU-intensive tasks like compression and encryption
thank you! will do!
can you please make a video about how to benchmark and monitor a backend service effectively
ok, at some point! but i already have 5-6 videos on how to monitor/instrument stuff
Can I suggest a short video showing how you created those custom dashboards in grafana 👀?!
i have a few pls take a look:
- github.com/antonputra/tutorials/tree/main/lessons/130
- github.com/antonputra/tutorials/tree/main/lessons/134
- github.com/antonputra/tutorials/tree/main/lessons/135
- github.com/antonputra/tutorials/tree/main/lessons/136
- github.com/antonputra/tutorials/tree/main/lessons/137
- github.com/antonputra/tutorials/tree/main/lessons/140
@@AntonPutra Thank you for your efforts and time ^^.
@@AntonPutra Hi! Is it possible to get metrics by path from the nginx ingress controller? I can't find any video on this on your channel. I only found a setup with fluentd that transforms nginx logs into prometheus metrics. Is there an easier way to achieve that? I'm using AWS EKS with kube-prometheus-operator and want to see nginx metrics for different paths of my app, like /api/some_method
I spent some time on Google, and as I understood it, this is only achievable if we specify many different paths in the Ingress K8S resource. In this case, we will see these paths in our metric. But we always need to update this configuration whenever we add a new endpoint to our application...
Now we need comparison with zig zap framework 🧐
haha, will do!
I think he should first fix the implementation; these are definitely wrong results.
Probably the AWS Rust SDK is still a little bit undercooked.
yeah i know, especially for rust. they recently deprecated the rust runtime for lambda, i think they don't spend a lot of time on maintaining and improving the rust sdk
I would like to see Elixir (or Gleam) vs Go
will do!
+1 for Elixir
Having seen these results - I think I might put in some pull requests for Linux drivers written on Golang! (topical Geek joke).
It was fascinating to see two compiled languages going head to head. I was surprised to see them being so neck and neck, Memory usage was the principal difference, and on that I have some comments (in no way a criticism of the video which was illuminating and objective, in running the two languages against one another in default configuration - like most people use them).
Memory usage is not really an efficiency metric, because a garbage collecting language will only carry out a full garbage collect when necessary. Necessary being hitting the defined threshold. Typically this would be configured to 80% of available memory - you can see this when the load increases and there is an initial spike.
Resource pools and statically sized queues are also a reason that memory usage of an application can initially appear high, but then not get any higher.
As an embedded software engineer, the approach was that memory is not there to be squandered but it is there to be used and abused, to keep down latency.
Caching previously loaded and potentially still useful disk pages in RAM is another example of this. This is something the OS will do for the app if you let it - but that usage has to be accounted for against one process or another.
So, the takeaway is - memory can be traded for latency - and you can code or configure an app to play to either of these as priority dictates. If you know the load in advance you can vary the budgets considerably.
Caveats : I have had to say "the compression algorithm will take up more room than the data" and databases often default to linear searches because using an index is slower than just going through the data (the point being that you shouldn't assume complex algorithms make things quicker). IOPS on the database store can be an almost invisible bottleneck.
I'd like to see a Node.JS app added to the mix. The compiled-languages crowd expect it to be rubbish, but I suspect it would run Rust and Golang close on this app (if you are going to allow "go routines" you would have to allow multiple Node instances listening on the same port).
thanks for taking the time to leave this feedback, i'll definitely use nodejs in future benchmarks. by the way, i got some PRs to improve my Rust code, so i'll retest and release results, which may be very different from what i had before
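The memory-for-latency trade described in the comment above is tunable on the Go side (Rust has no GC). A minimal sketch, assuming only the standard runtime/debug knobs, not anything used in these benchmarks:

```go
package main

import (
	"net/http"
	"runtime/debug"
)

func main() {
	// Let the live heap grow 3x before the next GC cycle: fewer collections,
	// less GC CPU and latency overhead, but a higher steady-state memory footprint.
	debug.SetGCPercent(300)

	// Soft memory ceiling so the extra headroom stays bounded; the same effect
	// as setting the GOMEMLIMIT environment variable.
	debug.SetMemoryLimit(256 << 20) // 256 MiB

	http.HandleFunc("/healthz", func(w http.ResponseWriter, r *http.Request) {
		w.Write([]byte("ok"))
	})
	http.ListenAndServe(":8080", nil)
}
```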
waited a long time for this ...
thanks!
The Go AWS SDK v2 is one of the most complete and well inline-documented implementations of the SDK, and S3 (plus the S3 manager) is the biggest module, so I guess AWS put a lot of effort into optimizing it.
i guess so, but i got a lot of feedback on how to improve the rust code, so i'll retest it and release results in a couple of days
would like to see Axum vs Actix vs Fiber but maybe too late for that? or maybe another episode with more metrics? I am not sure myself what additional metrics people want to see.
thank you, noted! i'll release a short benchmark of Go (stdlib vs fiber vs gin) maybe tomorrow or the day after. probably will do Rust (actix vs axum vs rocket) next
@AntonPutra I suspect actix will be more performant than axum, but Axum has a better dev experience? Axum is marketed as having better ergonomics. But it will be nice to see actix and axum at scale, just to have it in the record books.
A full batteries-included comparison between a Python framework like Django (the industry standard) vs some Go alternative would be interesting to see
thanks! noted will do django soon!
You should avoid python and performance in the same sentence
I see a lot of Go videos now, now let's try PHP vs Go!
one more coming, Go (stdlib vs fiber vs gin), tomorrow or the day after :) but thanks, added to the list!
Amazing comparison 👏
thanks! :)
It's very interesting to see comparisons like this, especially with Actix and Fiber, which are among the fastest tools for backend development in Rust and Go, respectively. The specific requirements of each project can heavily influence the choice of tool. While Fiber boasts lower latency per request and less CPU usage, it does consume more memory. On the other hand, Actix might have higher latency and use more CPU, but it's significantly more memory-efficient. Considering that memory is more expensive than CPU usage in cloud services, this trade-off becomes even more crucial if you are deploying the service on-premises or in the cloud. Additionally, Go's ease of use makes it simpler to onboard new developers, even those without prior experience, whereas Rust might have a steeper learning curve. It's always a good practice to encourage developers to conserve memory allocation when working with Go, given its tendency towards higher memory consumption. These subtle observations can significantly influence the choice of tool depending on the project's specific requirements.
i've been using Go for the client to generate load and had to use an 8xlarge instance; when i switched to Rust i only need a 4xlarge ec2, a 50% saving for me to run these tests :)
Good one! Really disappointed in rust😢 hope to see better results from axum
just got a PR to improve the rust code, will see if it makes any difference - github.com/antonputra/tutorials/pull/251
Thanks for the video, informative and unexpected :)
spasibo :)
It would be interesting to swap Actix for Axum on the Rust side.
next time i'll just run a test between those two; if you think i should instrument anything specific, like how long it takes to serialize json in each framework, let me know
@@AntonPutra In fact I have seen reports from Discord about why they changed from Go to Rust.
Their graphs do not show the same behaviour, mostly with CPU, which is why I was wondering if
it was, maybe, a matter of framework. We'll see... Anyway your work is 👍 . Congrats!
@@LinuxForLife I saw it a while ago, but I hope someone from Discord can explain their use case where Rust performs better so that I can reproduce it. I'm not hiding anything, the source code is public. It's just examples from the official docs of the SDKs and frameworks I use🤷♂
Awesome vid
thanks!
Fascinating results, Go's latency and CPU usage is actually lower than that of Rust in the two tests. Awesome remark about not scaling pods horizontally based on CPU and Memory alone, as degradation happened before they even started to peak.
thank you!
Great video, -If I could, I would suggest doing the same test comparing golang, C# and java given the popularity of those and the fact they are GC- (You did videos for this as well, nice!).
For this test, for me, the CPU usage is weird, I was expecting it to be the same or Rust to use less CPU; maybe golang is using better compiler optimizations?
Excellent video!!!
thanks :)
Hi there, wonderful work, i would like to see Free Pascal (mORMot) vs Go or Rust, since Free Pascal mORMot qualifies 12th in the TechEmpower benchmarks, cheers
thanks for the suggestion, noted!
This is wild!
haha
Great test! Is Grafana able to capture and display spans and traces like Datadog, if you're familiar with that APM? For example, an API call is made to web service A, which in turn calls web service B that persists some data to a database. I'm interested in Grafana's ability to display the chain of these calls, their durations, the API hostname, URL, parameters, etc, on a time-line, or in a waterfall view. In Datadog, this is known as a "trace". Does Grafana have a similar concept?
thank you! yes it can, i have a tutorial on tracing with OpenTelemetry - th-cam.com/video/ZIN7H00ulQw/w-d-xo.html
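To sketch the answer to the question above: with OpenTelemetry, each service starts spans that share a trace, and Grafana (backed by Tempo) renders the parent/child timing as a waterfall, much like a Datadog trace. A minimal, hedged Go example, with placeholder service names and the default local OTLP endpoint assumed:

```go
package main

import (
	"context"
	"log"
	"time"

	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp"
	sdktrace "go.opentelemetry.io/otel/sdk/trace"
)

func main() {
	ctx := context.Background()

	// Export spans over OTLP/HTTP to a collector (e.g. one that feeds Tempo).
	exp, err := otlptracehttp.New(ctx,
		otlptracehttp.WithEndpoint("localhost:4318"),
		otlptracehttp.WithInsecure())
	if err != nil {
		log.Fatal(err)
	}
	tp := sdktrace.NewTracerProvider(sdktrace.WithBatcher(exp))
	defer tp.Shutdown(ctx)
	otel.SetTracerProvider(tp)

	tracer := otel.Tracer("service-a")

	// Parent span for the incoming API call.
	ctx, parent := tracer.Start(ctx, "GET /api/devices")
	// Child span standing in for the downstream call to service B; Grafana
	// renders the parent/child timing as the waterfall view asked about above.
	_, child := tracer.Start(ctx, "call service-b")
	time.Sleep(10 * time.Millisecond) // stand-in for the real downstream work
	child.End()
	parent.End()
}
```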
Axum is number 1 for many months now, yet every youtube video is still with actix-web...
is it better?
@@AntonPutra Axum is better, i always use it!
@@AntonPutra it's kinda better, but using axum probably won't show significant benefits here. The only real question here is whether it was built in release mode?
I see release mode in sources
What is the implication of being in release mode?
Nice one Go we are proud of you 😂😂😂😂
haha
Awesome, thanks 💖👍🙏🙏🙏
thank you! :)
Please do Axum vs Actix comparison.
Please do Axum vs Actix comparison. I got a lot of feedback on how to improve this one, as well as a few PRs. I'll release an updated Rust vs Go maybe tomorrow or the day after
Thank you for the tutorials, Anton! Would you upload client code and monitoring manifests to the github for this lesson?
i have old version of the client - github.com/antonputra/tutorials/tree/main/lessons/201/client
as well as the monitoring stuff - github.com/antonputra/tutorials/tree/main/lessons/202/monitoring
i keep iterating on it and will release it sometime in the future
The scales are a bit deceiving. They look more drastic than they really are. Considering how much less time it takes to write a concurrent program in Go, I would choose Go. We're talking about a 10% difference in performance, right? Unless you're at Facebook or Google scale, it's good enough.
agree, go is simpler and it's easier to find developers to hire
Please do PHP Swoole vs Go and Rust
noted! will do!
Hi @Anton, you have great content. Could you tell me which software you use for creating these animated videos?
thanks! sure, it's the adobe suite
Not an expert but I think the Rust code can be slightly optimized: get_devices seems to allocate the vector every time it runs, which can be time consuming, and also you might squeeze a bit more performance with parking_lot and RwLock instead of Mutex. Another interesting approach would be to use message passing for updating the metrics instead. Other than that, great work!
i tried just hitting the /healthz endpoint with an "ok" string, same results, nothing to optimize there :)
I don't mind giving Go a little bit more memory ;)
me either, haha
damn! That's why I love Go!
haha
Great job
thank you!
@@AntonPutra I've been writing our backend in golang for 3 years now.
@@bitcoinxofficial time to rewrite it in rust😂 just joking
@@AntonPutra it is better to stick with the language you are an expert in for building products rather than investing in every new language or technology. Because we just need a final product in the end.
@@bitcoinxofficial true
oh, for the next one can you compare a real-time use case? like a large amount of messages to process
yes, i was thinking about a kafka consumer/producer, maybe a simple etl pipeline, etc
can you do rust (axum) vs go (fiber)? or another comparison involving rust (axum)
yes next
The Rust version probably spawns a number of threads from the start, hence the higher CPU usage. It would be more comparable to see Rust using Tokio for true concurrency, but I have no idea what Actix uses behind the scenes. I would want to see you compare it with a more popular framework like Axum.
thanks, i'm getting a lot of tips on how to improve the Rust code, so I'll retest and release results soon. I can also measure how many threads each app creates with this - "container_threads"
Github - github.com/google/cadvisor/blob/master/docs/storage/prometheus.md#monitoring-cadvisor-with-prometheus
Nit: the word is degradation - it comes from degrade/degraded, not degredate/degredated.
Go beats rust? well, that was unexpected. what could be the reasons behind this? a better implementation of the framework's core functionality in go?
i got a lot of feedback on how to improve the Rust code as well as a couple of PRs, so I'll retest in a few days and release results
I am so in love with GO ❤❤
i'll try assembly next time, lol
Fiber vs Axum would be nice
thanks for the suggestion, noted!
get_devices from main has the potential to be wildly inefficient. It's not even so much about the vector not being const (uuid! is validated at compile time). The problem may be in the mac and firmware fields of the Device struct. With the current implementation it's a whole wonder why Device has owned data as opposed to string slices. What you end up doing is paying the price for a heap allocation (as stated in the docs for From for String) which you don't need since this data can be a &'static str. On top of that one doesn't need UTF-8 strings to represent MAC addresses and firmware versions. Using an ascii string is something that both solutions would benefit from.
Hopefully rustc is smart enough to figure all of these things out and not include all of the above mentioned overhead, I haven't analyzed the function that closely.
Thank you for the feedback, actually one of the PRs that i received is to change uuid to string - github.com/antonputra/tutorials/pull/252
as well as another one to improve performance - github.com/antonputra/tutorials/pull/253
I'll retest in a couple of days along with some other tips and release results
can you do a similar benchmark between go and zig? it would be interesting to know which one fares better in the long run
Hey! Sorry about stupid questions.
If we use optimized OS, like Clear Linux or CachyOS, will we see some benefits?
I saw different tests on Phoronix, and it looked like it can help. One PHP benchmark showed about 2x speed (but I think it was just a benchmark without I/O).
But those tests were run without containers, if I'm not mistaken.
possibly, but most people use managed Kubernetes clusters like EKS (aws), GKE (gcp) and AKS (azure), which come with pre-built AMIs (OS images). i just want to cover the same use cases and environments that most people would use
@@AntonPutra interesting, especially about Clear Linux. I have to check how Intel uses it, I'm sure they made it for servers 🤔
It's strange if cloud providers don't want to get some of those benefits
@AntonPutra could you try Swift vs Go?
yes, will do!
I think this shows that Discord was wrong to throw their entire codebase away for their "performance" issues with Go. There was something else going on there IMO and not just "Go slow we move to Rust haha"
I had a Golang client that generated load, and I had to use 8xlarge instances. After rewriting it in Rust, I was able to use 4xlarge instances, 50% infra savings for me for running these tests.
GOat
Hi Anton, I'm still waiting for your monitoring setup in docker compose file
yeah i know :)
take a look at these tutorials for now
Monitoring EKS & EC2 instances with MANAGED Prometheus & Grafana - github.com/antonputra/tutorials/tree/main/lessons/130
How to monitor Persistent Volume usage in Kubernetes using Prometheus? - github.com/antonputra/tutorials/tree/main/lessons/134
How to monitor Containers in Kubernetes using Prometheus & cAdvisor? - github.com/antonputra/tutorials/tree/main/lessons/135
Monitor HTTP/REST API in Kubernetes using Prometheus & Nginx Ingress Controller - github.com/antonputra/tutorials/tree/main/lessons/136
How to Monitor Golang with Prometheus (Counter - Gauge - Histogram - Summary) - github.com/antonputra/tutorials/tree/main/lessons/137
How to Monitor Nginx with Prometheus and Grafana? (Install - Monitor - Fluentd) - github.com/antonputra/tutorials/tree/main/lessons/140
Prometheus Operator Kubernetes Tutorial - github.com/antonputra/tutorials/tree/main/lessons/154
How to Build Custom Prometheus Exporter? - github.com/antonputra/tutorials/blob/main/lessons/141
OpenTelemetry Golang Tutorial (Tracing in Grafana & Kubernetes & Tempo) - github.com/antonputra/tutorials/tree/main/lessons/178
Take a look at all lessons here in the readme - github.com/antonputra/tutorials/blob/main/docs/contents.md
Suggestion: Can you please change the graph colors to something more distinguishable? Although I can identify the colors, some people may not, and it would also make it easier to check the difference.
ok will do! those are the defaults grafana uses
@@AntonPutra Yes I thought so.
i'm interested in the same test:
Actix (Rust) vs Fiber (Go)
1. Simple requests (without a body, or a very small static body)
2. Simple requests (with a body of 1kb, 100kb, 1mb) [need to read all request body data into io.Discard]
3. Simple requests with a database
can you do this?
@Anton What needs to change to allow the application to keep serving until it bumps against CPU or memory limits??
i guess increase the timeout and also increase the number of clients that generate load
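A small sketch of those two knobs on a hypothetical Go load client: a per-request timeout generous enough that slow (but successful) responses are not reported as failures, and a client count that controls the offered load. The values and URL are made up for illustration.

```go
package main

import (
	"fmt"
	"net/http"
	"sync"
	"time"
)

func main() {
	// A generous timeout: responses that merely slow down under load are still
	// counted as successes instead of being cut off as client-side errors.
	client := &http.Client{Timeout: 10 * time.Second}

	const clients = 50 // raise this to push the servers closer to their CPU/memory limits
	var wg sync.WaitGroup
	for i := 0; i < clients; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			resp, err := client.Get("http://localhost:8080/api/devices")
			if err != nil {
				fmt.Println("request failed:", err)
				return
			}
			resp.Body.Close()
		}()
	}
	wg.Wait()
}
```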
Why didn't you use release profile settings geared towards performance? Enable lto = "thin" and codegen-units = 1. Also, these results do not seem correct. Did you compile the rust app in release mode?
I got a couple of PRs as well as suggestions like this to improve performance. I'll retest it in a few days and release results. thanks for the feedback!
PR - github.com/antonputra/tutorials/pull/253
PR - github.com/antonputra/tutorials/pull/252
and yes it was compiled in release mode - github.com/antonputra/tutorials/blob/main/lessons/203/rust-app/Dockerfile#L7
In B4 crustaceans say you did it wrong.
i got a lot of feedback on how to improve the Rust code as well as a couple of PRs. i'll retest in a couple of days and release results
I strongly suggest reuploading this very same test after adding the lines below to the rust-app's `Cargo.toml` file
```
[profile.release]
lto = true
strip = true
opt-level = 'z'
panic = 'abort'
codegen-units = 1
```
I am very curious to see the after results...
thank you! I'll try it. Also someone raised a PR to improve the rust code, let me know what you think?
- github.com/antonputra/tutorials/tree/main/lessons/203/rust-app-v2
Can you explain how you are scaling the clients and the number of requests?
sure, i have a golang client
config:
stages:
  - clients: 5
    intervalMin: 5
  - clients: 10
    intervalMin: 5
  - clients: 15
    intervalMin: 5
example:
func mainTest(m *metrics, client *http.Client, c Config) {
    for i, stage := range c.Stages {
        guard := make(chan struct{}, stage.Clients)
        // Set Prometheus number of clients metric
        m.clients.Set(float64(stage.Clients))
        // Set Prometheus test stage metric
        m.stage.Set(float64(i))
        now := time.Now()
        for {
            guard
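The snippet above is cut off mid-loop, so here is a self-contained sketch of how such a staged loop can be finished: the guard channel acts as a semaphore capping in-flight requests, and each stage runs until its interval elapses. The types, URL, and loop body are reconstructions under those assumptions, not the actual client code.

```go
package main

import (
	"net/http"
	"time"
)

type Stage struct {
	Clients     int
	IntervalMin int
}

func runStage(client *http.Client, stage Stage, url string) {
	guard := make(chan struct{}, stage.Clients) // semaphore: at most stage.Clients requests in flight
	deadline := time.Now().Add(time.Duration(stage.IntervalMin) * time.Minute)

	for time.Now().Before(deadline) {
		guard <- struct{}{} // blocks once the concurrency cap is reached
		go func() {
			defer func() { <-guard }()
			if resp, err := client.Get(url); err == nil {
				resp.Body.Close() // latency/status would be recorded into Prometheus metrics here
			}
		}()
	}
}

func main() {
	stages := []Stage{{Clients: 5, IntervalMin: 5}, {Clients: 10, IntervalMin: 5}, {Clients: 15, IntervalMin: 5}}
	client := &http.Client{Timeout: 5 * time.Second}
	for _, s := range stages {
		runStage(client, s, "http://localhost:8080/api/devices")
	}
}
```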
I expected to see go vs c# with AOT
will do, next week probably
Please make a video comparing dart(vania) and go(whatever)
ok noted
Please make a video about Go fiber VS Java Helidon
thanks, noted!
in the real world, which is better for resource efficiency? better cpu usage + latency, or better memory usage?
latency; user experience always comes first!
I think the implementation is wrong if they start returning errors so quickly. Also, a rust response time of around 1ms for an empty response raises questions about debug mode or something like that.
i got a lot of feedback as well as a few PRs on how to improve the Rust code, will release results in a day or so
I would be very scared just to use those instances 😅
yeah, the secret is to set up an alarm to delete them, lol
Would be great to try with Axum
will do little bit later
i think axum is a much lighter framework and closer to fiber
will test!
java/spring vs c#/dotnet, thank you
thank you! noted!
Can you do a Rust vs C++?
yes soon
This is wonderful [Portuguese pt-BR]
thanks!
can you test with axum actix and rocket?
yes, but first i want to compare fiber vs stdlib vs gin or chi
IMHO, Go wins. 🎖
Go uses three times more memory but four times less CPU. I think Go is the better choice, since Rust's performance advantage is only in memory usage, which doesn't actually translate to more requests per second, and the same app in Go also has better latency (although I'm not sure if it's noticeable).
Thanks for the video.
thank you! i got a lot of tips on how to improve the Rust code, will retest it in a few days and release results!
Can u test go/rust vs userver (c++)
yes soon!
Go is amazing! But whenever you see C/C++/Zig being beaten by Go/Java/etc. you know something’s wrong…
Well, I got a lot of tips and a couple of PRs to improve my Rust code. I'll publish the results in a few days. It can beat Go's standard library, but even Rust experts weren't able to optimize the AWS SDKs to make them more efficient than Go's AWS SDK v2 🤷♂️
@@AntonPutra They actually did and according to their official blog Rust is faster, I can't post links because my comments get deleted, but Google search can give you top level results.
@@AntonPutra sounds great! Hmm well yes possible the AWS lib just isn’t there yet but then that’s an implementation issue on AWS side. No wonder considering how they create their sdks 😬
can you test fastapi and some other library or framework?
yes, next test
i don't understand: at 1.5k RPS, if they are using only 10% memory and 8% CPU, then why are they failing?
i got a lot of tips on how to improve the rust code as well as a few PRs, will release updated results in a couple of days. that particular issue was due to the absence of timeouts on the client side
did you build the rust app in release mode?
yes, and i got a few PRs to improve rust code, will release results soon
@@AntonPutra thanks
@@AntonPutra Also consider `opt-level = "s"` and `strip = true` under `[profile.release]` in the Cargo.toml file.
Why don't you accept the PRs and test the improved Rust code?
which one? I already merged 2 PRs.. I'll test soon..
PR - github.com/antonputra/tutorials/pull/253
PR - github.com/antonputra/tutorials/pull/252
@@AntonPutra I would like to see you test both if you can; other than that, I do not know which version would be better. Maybe 253 first, though, as I see more optimizations have been made there.
FastAPI vs Robyn
thanks! noted!
Go Fiber is cheating with HTTP/1.1, which is why I use it for my high-volume CMS. I don't see the landscape of web servers changing much until HTTP/3 QUIC becomes standard. Honestly, any web service is going to be a winner simply by using Go or Rust over Python or Node (compiled over interpreted).
thanks for the feedback! fiber seems to be very fast, i'm going to release Go (stdlib vs fiber vs gin) in a couple of days
use `let now = Instant::now();` and `now.elapsed()` for the Rust benches, not SystemTime. It's quite strange why it uses so much CPU at the beginning; need to dig into the sources
So go is the better choice here, as writing code is easier; it takes more memory than rust, but the other metrics are similar or better
can you do one with an erlang vm language like elixir or gleam vs go or rust?
thanks, noted!
Please check kotlin ktor vs go
thanks, noted!
One important thing that these benchmarks don't take into consideration when choosing a programming language or framework is the developer feedback loop.
For instance, I know that python is probably one of the slowest (in execution time) programming languages used for web applications. However, the developer feedback loop is such a big difference compared to the compiled languages that I would have it as my default go-to language.
i like python as well, you can write data pipelines with spark and ml stuff so python is not going anywhere
0:22 "against GIN for go"
1:01 "against go FIBER framework"
bro couldn't pick
:)