I have been trying to access various resources to understand events and your videos are one of the best explanations out there. I also appreciate how you reply to almost all comments. Thank you!!!
Glad it was helpful! I try to reply to as many as I can, assuming I don't miss them. Replies are harder to keep track of, though. Sorry if I don't reply to a reply :)
Great video! What I'm struggling with is the best way for service A to get data from service B that it might have missed, that it wasn't interested in when service B sent out a state change, or that was published before service A even existed. Now service A is interested, but service B has no reason to publish its state again. Service A could ask service B to publish all its state again, but that might publish more than service A needs and could cause other services to consume those state messages unnecessarily as well. Anyway, hope that makes sense; maybe you have a video that addresses this :)?
If you’re using Kafka and have something like the customerId as the partition key, then you don’t have issues with competing consumers pulling from that stream.
You can take it a step further and create a ksqlDB query to define the local cache/table, and it should get updated automatically (I'm working on a POC of this nowish).
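A minimal sketch of the partition-key idea, assuming a confluent-kafka producer; the "customer-events" topic and field names are illustrative, not from the video:

```python
# Sketch: keying by customerId keeps all events for one customer on the same
# partition, so a single consumer sees them in order (assumed topic/field names).
import json
from confluent_kafka import Producer

producer = Producer({"bootstrap.servers": "localhost:9092"})

def publish_customer_event(customer_id: str, event: dict) -> None:
    producer.produce(
        topic="customer-events",
        key=customer_id,  # partition key = customerId
        value=json.dumps(event).encode("utf-8"),
    )
    producer.flush()

publish_customer_event("customer-42", {"type": "AddressChanged", "city": "Toronto"})
```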
Hi Derek, what are your thoughts on using this technique for events that need to be aggregated? E.g. some service needs info on payments (the total amount) that happened since some point in time? An API call could be used to build the initial state, then events used to modify that state?
As mentioned in the later portion of this video, you should have data where you need it. In other words, direct commands to the service that should own the data. Propagating reference data via event-carried state transfer can make sense, especially if you want to do some type of view/UI composition. But if you need data in a service for business rules or to perform a command, that service should own it.
I like your videos. These are very informative.
Question: when service A calls Service B and waits until it times out or gets data back, does this mean Service A is blocked and cannot be used/called by any other process?
No, it's not likely blocking other calls, so your service won't be unavailable.
What do you think about trying not to duplicate the data at all, just using some sort of correlation ID and UI composition to display stuff from various services? I've seen this approach at a few conferences and am trying to implement it in some side projects. Seems like a very hard thing to achieve, but not impossible I guess 🤷
Yes, doing the UI composition on the frontend. There's also the idea of having the client make a single request and have a BFF (backend for frontend) do the composition there. Ultimately it's going to happen somewhere.
@@CodeOpinion API composition on the frontend is not a solution that is always available. For example, if I have a stock-service and a product-service with millions of records and I want to query red products that are in stock, it is not going to work. The composition would take way too long given that there are millions of resulting product IDs from both sides that have to be retrieved and matched against each other to get the products that the user wants.
In this case, duplicating data in a service that is dedicated to searching products would be a viable solution. This approach is scalable and you can also select a database system that is better suited for the job (searching products with complex parameter combinations).
How would you handle the initial load, if you connect to the dependency late and data doesn't get updated regularly?
Most services expose an API that lets consumers backfill the data they need.
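A hypothetical sketch of that backfill step: page through the owning service's API to build the initial local copy, then let events take over. The /customers endpoint, page parameter, and field names are assumptions for illustration only.

```python
# Sketch: initial load via a paginated backfill endpoint (assumed API shape).
import requests

def backfill_customers(base_url: str, store: dict) -> None:
    page = 1
    while True:
        resp = requests.get(f"{base_url}/customers", params={"page": page})
        resp.raise_for_status()
        batch = resp.json()
        if not batch:
            break  # no more pages; event consumption takes over from here
        for customer in batch:
            store[customer["id"]] = customer
        page += 1
```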
For example, most databases have replication workflows: when new nodes are added to a cluster, empty replicas are allocated on the new nodes and a bulk transfer from the existing replicas fills them in. A similar approach can be followed here.
I wonder what your experiences are with versioning: do you use timestamps? Counters (and if so, how are they implemented)? I remember, probably from Martin Kleppmann's book, that relying on timestamps can be prone to errors, since clocks on different machines can be set differently, plus other time-related issues.
An easy way would be to use auto-incrementing IDs in a transactional database with the outbox pattern, since you're probably using an outbox anyway.
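A rough sketch of that idea, using SQLite purely for illustration; the table and column names are assumptions. The auto-increment id doubles as a monotonically increasing version, and the outbox insert commits in the same transaction as the state change.

```python
# Sketch: outbox row written atomically with the state change; the
# AUTOINCREMENT id gives consumers an ordering/version to rely on.
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id TEXT PRIMARY KEY, name TEXT);
    CREATE TABLE outbox (
        id INTEGER PRIMARY KEY AUTOINCREMENT,  -- doubles as the event version
        payload TEXT NOT NULL,
        dispatched INTEGER NOT NULL DEFAULT 0
    );
""")

def rename_customer(customer_id: str, new_name: str) -> None:
    with conn:  # one transaction: state change + outbox record commit together
        conn.execute("UPDATE customers SET name = ? WHERE id = ?", (new_name, customer_id))
        conn.execute(
            "INSERT INTO outbox (payload) VALUES (?)",
            (json.dumps({"type": "CustomerRenamed", "id": customer_id, "name": new_name}),),
        )
```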
I know this was mentioned in the video but you could use FIFO on the broker for this, as long as you're not using competing consumers within the same scope. For example, with Azure Service Bus you can use sessions on a topic to ensure processing is ordered correctly. You can still process e.g. multiple orders (or whatever you choose as your session scope) at the same time, but you can ensure that the consumer is only performing one operation on any given order at a time.
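A sketch of session-based ordering with the azure-servicebus Python SDK; a queue is used instead of a topic/subscription for brevity, and the queue name, connection string, and "order id as session id" choice are assumptions.

```python
# Sketch: every message for the same order shares a session id, and a session
# receiver locks that session so only one consumer processes it at a time.
from azure.servicebus import ServiceBusClient, ServiceBusMessage

CONN_STR = "<service-bus-connection-string>"
QUEUE = "order-events"

client = ServiceBusClient.from_connection_string(CONN_STR)

with client.get_queue_sender(QUEUE) as sender:
    sender.send_messages(ServiceBusMessage('{"type": "OrderShipped"}', session_id="order-123"))

with client.get_queue_receiver(QUEUE, session_id="order-123") as receiver:
    for msg in receiver.receive_messages(max_wait_time=5):
        print(str(msg))
        receiver.complete_message(msg)
```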
You can use a logical counter: increment it on every update and check the expected version when applying an update.
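A hedged sketch of that check on the consumer side: apply an event only when it is the next expected version for that entity, otherwise a delta was missed and the local copy needs a backfill. The cache shape and field names are assumptions.

```python
# Sketch: expected-version check when applying delta events to a local copy.
local_cache: dict[str, dict] = {}  # entity id -> {"version": int, "data": dict}

def apply_event(entity_id: str, version: int, data: dict) -> None:
    current = local_cache.get(entity_id, {"version": 0, "data": {}})
    if version <= current["version"]:
        return  # stale or duplicate event, safe to ignore
    if version > current["version"] + 1:
        raise RuntimeError(f"Missed delta for {entity_id}; trigger a backfill")
    current["data"].update(data)
    current["version"] = version
    local_cache[entity_id] = current
```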
Cool video. Let's say we have the pattern in place. Now we want to introduce a new service that needs to cache the same data (or a subset of it). What's the best way to approach it? Replay all the events? Manual export/import? Sync calls?
Your comment made me create this video: th-cam.com/video/RcVf-R7RZcY/w-d-xo.html
Awesome. Very well explained, thanks!
You're welcome!
Really interesting, thanks for sharing!
My pleasure!
Ok, what can you say about meta information in this context? For example authentication information, or something like that. Do I need to put it into a message? And what would that look like?
One recommended solution for that is for the event to only emit its name and a URL where the event data can be found. The URL can then be authenticated and authorized as any other HTTP service according to who the client requesting the data is.
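A hypothetical sketch of that "name plus URL" event shape: the event carries no payload, just a link back to the owning service, and the subscriber fetches the data with its own credentials. The event fields and token handling here are assumptions for illustration.

```python
# Sketch: thin event with a URL; authorization happens at the owning service,
# per subscriber, when the data is fetched.
import requests

def handle_thin_event(event: dict, access_token: str) -> dict:
    # event looks like {"name": "CustomerChanged", "url": "https://crm.example/customers/42"}
    resp = requests.get(event["url"], headers={"Authorization": f"Bearer {access_token}"})
    resp.raise_for_status()
    return resp.json()
```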
@@asbjornu That sounds pretty complicated. You need to take care of saving the event, generating the URL, and serving the requests. You couple your consumer to the availability of the producer, and you burden the producer with traffic from the consumers. I would probably encrypt the events and only give the key to authorized consumers.
@@rcts3761 as always, It Depends. In most organizations, the origin service that emits the event will already have an API with authorization in place, so serving the event subscriber does not come at a (huge) cost. The event subscriber needs to place the received event in a queue for processing and ack it once it's successfully processed. That's a smart thing to do regardless, to handle event ordering, exactly-once delivery, and the other event- and message-related problems you need to deal with in an event-oriented architecture either way. Encryption is another option, but that requires a PKI, which is also not free.
How do delta vs. fat events work here?
Both can work, as long as you have some indication of version. The issue most run into with deltas is missing one or not handling it correctly. Now you don't have eventually consistent data, you have inconsistent data.
04:30 What if the developer is in the EU, like me? 🥲 That's a huge GDPR issue.
Then you're not likely sharing that type of data via Event Carried State Transfer!
I'm also in the EU, and haven't found any problem with this pattern. When a "data owning" component deletes data, it just publishes an appropriate event and all consuming systems then respond by deleting their copy of that data.