Just a heads up - we reference a bunch of other videos throughout this one; we've added the appropriate YouTube info cards but for the past year or so they've not worked properly. No idea why.
So, every other video we reference:
Akka.Cluster Simply Explained: th-cam.com/video/8PenRoEjZKc/w-d-xo.html
Distributing State Reliably with Akka.Cluster.Sharding: th-cam.com/video/2apFt9v0Vjw/w-d-xo.html
Split Brains Explained: th-cam.com/video/aTu7WUJfGo8/w-d-xo.html
Consistent Hash Distributions Explained: th-cam.com/video/byL_Cs0dGO0/w-d-xo.html
Introduction to Petabridge.Cmd: th-cam.com/video/b7Qxg2YiOTI/w-d-xo.html
Introduction to Phobos 2.0: th-cam.com/video/feExYBcqAtc/w-d-xo.html
Introduction to Akka.Hosting - HOCONless, "Pit of Success" Akka.NET Runtime and Configuration: th-cam.com/video/Mnb9W9ClnB0/w-d-xo.html
These, the original GitHub issues we fixed, and more are also linked in the description.
Great story, learned a lot from the journey, thanks! One thing to clarify: on the conclusion page, I think the term "smoke test" should be rephrased to "integration test" or "automated test", since "smoke test" specifically means a test that finishes in a few seconds or so.
What an odyssey! This is a tour de force kind of video alright
Thanks!
Nice video, Aaron!
Just one thing that wasn't clear to me in the end: in case the split brains situation happened to my system, before the fix, what should I do to remedy it? Just restart the server?
That's a great question - so you'd need at least a 3-node cluster in order to even run into this problem (something I probably should have mentioned in the video) and the issue would be that at least two of the surviving nodes would have duplicates. The most robust solution would be to SLOWLY restart both of them - waiting at least 20 seconds between each. That'd remedy the current situation and prevent it from happening again. Detecting the duplicate would be the hardest part, probably, unless you had good OpenTelemetry support that could prove that more than 1 instance of the same entity actor was alive concurrently.
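To make the "detecting the duplicate" part a bit more concrete, here is a rough sketch in Python of the kind of check you could run over exported telemetry. Everything in it is an assumption for illustration: the `Lifetime` record, its field names, and the `find_duplicates` helper are hypothetical and not part of Akka.NET, Phobos, or OpenTelemetry. It only assumes you can get, per entity actor, which node it ran on and when it started and stopped.

```python
from collections import defaultdict
from typing import NamedTuple


class Lifetime(NamedTuple):
    """Hypothetical record: one run of an entity actor on one node."""
    entity_id: str
    node: str
    started: float  # seconds since some common epoch
    stopped: float


def find_duplicates(lifetimes):
    """Return {entity_id: [(node_a, node_b), ...]} for overlapping lifetimes."""
    by_entity = defaultdict(list)
    for lifetime in lifetimes:
        by_entity[lifetime.entity_id].append(lifetime)

    duplicates = {}
    for entity_id, spans in by_entity.items():
        spans.sort(key=lambda span: span.started)
        overlaps = []
        latest = spans[0]  # the span seen so far that stops last
        for current in spans[1:]:
            # Starting before an earlier instance stopped means two
            # instances of the same entity were alive at the same time.
            if current.started < latest.stopped:
                overlaps.append((latest.node, current.node))
            if current.stopped > latest.stopped:
                latest = current
        if overlaps:
            duplicates[entity_id] = overlaps
    return duplicates


if __name__ == "__main__":
    sample = [
        Lifetime("order-1", "node-a", 0.0, 50.0),
        Lifetime("order-1", "node-b", 30.0, 80.0),  # overlaps node-a
        Lifetime("order-2", "node-a", 0.0, 10.0),
        Lifetime("order-2", "node-c", 10.0, 20.0),  # clean hand-off
    ]
    print(find_duplicates(sample))
```

An entity that is handed off cleanly (one instance stops before the next starts) is not reported; only entities whose lifetimes actually overlap show up in the result.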
Fixed a ".NET bug" - Do you mean a bug in .NET Runtime? Is this a clickbait title?
It's a bug in a popular .NET distributed programming framework? How on earth would that title be clickbait?