Great video. One observation is that in order to send the message reliably to the queue, you'd want to use a transactional outbox pattern. This basically means you have a batch job to process enqueuing outgoing messages from the database to the queue. I guess this batch process is "special" in that sense and not one that can be subject to replacement :-)
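A minimal sketch of the outbox idea described above, using SQLite as a stand-in database and a caller-supplied `publish` callback instead of a real broker (all table and function names here are illustrative, not from the video):

```python
import sqlite3, json

# Transactional outbox sketch: the business write and the outgoing message
# are committed in ONE local transaction; a small relay job later drains
# pending rows to the queue. Schema and names are hypothetical.

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE orders (id TEXT PRIMARY KEY, status TEXT)")
db.execute("CREATE TABLE outbox (id INTEGER PRIMARY KEY, payload TEXT, sent INTEGER DEFAULT 0)")

def place_order(order_id):
    with db:  # single transaction: order row and outbox row commit together
        db.execute("INSERT INTO orders VALUES (?, 'reserved')", (order_id,))
        db.execute("INSERT INTO outbox (payload) VALUES (?)",
                   (json.dumps({"type": "OrderReserved", "order_id": order_id}),))

def relay(publish):
    """The 'special' batch job: push unsent outbox rows to the broker."""
    rows = db.execute("SELECT id, payload FROM outbox WHERE sent = 0").fetchall()
    for row_id, payload in rows:
        publish(json.loads(payload))          # hand off to the broker
        db.execute("UPDATE outbox SET sent = 1 WHERE id = ?", (row_id,))
    db.commit()
    return len(rows)

place_order("order-1")
sent = []
first_pass = relay(sent.append)   # drains the one pending row
```

The key point is that if the process crashes between the commit and the publish, the relay simply picks the row up on its next pass, so the message is never lost.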
I have an upcoming project at work that this covers beautifully. I'll be using this video to pitch a delayed delivery approach to our implementation! Thanks for the awesome content!
That feeling when you see a 1 hour code opinion video 🙏🙏
I am curious about how this will affect the message broker. Let's say we have 1M orders per day; that means we send 1M messages per day, each delayed by a week, and most of those messages will do nothing in the end, because I expect that 99% of the orders will finish successfully.
Compared to this, a batch job will select only the orders that actually didn't finish successfully in the defined time range.
In the presentation there is a chart where we can see that we removed some load caused by the batch job, but I would also like to see how we affected the message broker by sending so many delayed messages which are mostly useless.
It looks to me that this delay is more useful in cases where we need to ensure that some event on the order happens almost exactly after a set time, e.g. 15 minutes after the order was reserved.
Be pragmatic. Delayed delivery isn't going to be appropriate for every situation. But in your example of 1M orders/day, you'd likely be dealing with a pretty large scale, and you wouldn't be using a single database or a single broker. You'd likely be partitioned to some degree. While your example is valid, it comes with other concerns.
While many prefer to run long-running batch jobs during non-production hours, I always keep my batch jobs simple and create many jobs that each do a single task as much as possible. Working this way, based on business requirements, I define a status in the database and process records using cron jobs configured anywhere from a 1-minute interval up to a day, a week, or whatever the business case demands.
The only thing missing is concurrency at great numbers, but your video suggests spreading that out to run through the day anyway.
While it is convenient to run cron jobs across the day with applications built on an MVC architecture, where we choose one or more databases as needed, I am wondering how we would use RabbitMQ to see whether the order reservation expiry time has been reached, unless a consumer is configured to:
1) pick an item from the queue
2) lock it, so other consumers don't pick it at the same time
3) check whether the expiry time has been reached and the item has not been picked up by the customer
3a) if yes, expire that order item and post the event accordingly
3b) if no, put the item back into the queue for another check by the same/another consumer at a future time
Is this thought process correct?
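The steps in that comment can be sketched in a few lines, using an in-memory deque as a stand-in for a RabbitMQ queue (names like `expiry_queue` are made up for illustration; note that delayed delivery, as the video argues, avoids this busy requeue loop entirely):

```python
from collections import deque

# In-memory stand-in for the check-and-requeue consumer loop described above.
expiry_queue = deque()   # items: (order_id, expires_at)
expired_events = []      # stand-in for published "order expired" events

def consume_once(now, orders_picked_up):
    """One pass: pick, check expiry, then either expire or requeue."""
    if not expiry_queue:
        return
    order_id, expires_at = expiry_queue.popleft()    # 1) pick (popleft acts as the lock)
    if order_id in orders_picked_up:
        return                                       # customer already has it; drop
    if now >= expires_at:                            # 3) expiry reached?
        expired_events.append(order_id)              # 3a) expire and post the event
    else:
        expiry_queue.append((order_id, expires_at))  # 3b) requeue for a later check

expiry_queue.append(("order-1", 100))
consume_once(now=50, orders_picked_up=set())    # too early -> requeued
consume_once(now=150, orders_picked_up=set())   # past expiry -> expired
```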
I listened to a podcast where a similar topic was discussed. The reasoning for choosing batch jobs was a possibly more CPU-efficient process, or the ability to use data lakes or data mining.
As always, no solution is a silver bullet. There are scenarios where batch jobs are not practical. As with some of my examples, using various queue priorities and adjusting the number of consumers can be a way to scale. I totally forgot to mention downstream services and how they are affected by batch jobs.
AWS Step Functions and their wait functionality are great for this sort of stuff. In Azure you can use Durable Functions.
As someone who has used both for this exact purpose, I must say Durable Functions are infinitely superior to Step Functions in almost every way. It's amazing how you can create sagas and workflows using "standard" .NET constructs vs the bloat and unfriendliness of something like Step Functions.
Awesome, this video was perfect in terms of timing. Right now, we are researching a solution to get rid of our batch jobs that are continuously hitting our database with complex queries. However, I have a question regarding testing: if the deferred message is a little further out, let's say 7 days ahead from now, and we wanted it to be triggered immediately for testing purposes, how would you handle that scenario? I wonder what kind of testing strategies are needed in order to test deferred messages. TIA
This would depend on what you're using as a message transport and whether you're using a messaging library, which I recommend. A messaging library that abstracts the underlying transport will often provide a means to test this. If you don't use one, then you'd probably have to resort to a different configuration for the delay while testing.
That was a great presentation. Something else I feel you could have brought up is secondary delays: pushing off the notification for a few minutes when system utilization is above some threshold (but limiting the secondary delays to some maximum so that they don't get delayed forever). This would allow for some true smoothing of that peak, if the business case permits it.
For example, maybe 10am seven days from when the original action occurred is actually peak time, so when the future event is consumed, it checks the cached server stats, and (if the CPU/memory utilization is above 75%, for example) it pushes out another delayed event for some short time later (along with a delay counter or max delay date/time).
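That capped secondary-delay idea fits in a tiny decision function. This is a sketch only; `MAX_RETRIES`, `SECONDARY_DELAY_SECONDS`, and the 75% threshold are illustrative values, not anything from the video:

```python
# Capped secondary delay: if the system is busy when the delayed event fires,
# push it out again with an incremented delay counter, but never retry past
# a maximum so the event can't be deferred forever.

MAX_RETRIES = 5                 # illustrative cap on secondary delays
SECONDARY_DELAY_SECONDS = 300   # illustrative re-schedule interval
BUSY_THRESHOLD = 0.75           # the "75% utilization" example from the comment

def handle_delayed_event(cpu_utilization, delay_count):
    """Return ('process', None) to handle now, or ('defer', seconds) to
    re-schedule the event with delay_count + 1."""
    if cpu_utilization < BUSY_THRESHOLD or delay_count >= MAX_RETRIES:
        return ("process", None)            # quiet enough, or out of retries
    return ("defer", SECONDARY_DELAY_SECONDS)
```

The `delay_count >= MAX_RETRIES` branch is what enforces the "don't get delayed forever" constraint.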
Sure. Albeit a bit of an optimization; should you need it, you can also use lower-priority queues for that purpose.
@@CodeOpinion That's a good point; why consume events when you're currently at peak.
Thanks for the well-thought-out content. Have a great day.
Scheduled messages on Service Bus were a nightmare for me.
It's difficult to change the schema of scheduled messages; you need to write extra code for backward compatibility. It's also difficult to change the scheduled time of messages. It's unnecessary complexity for me compared to batch jobs.
Yes, versioning messages in general can be a challenge. I'll try and cover this in a future video. Appreciate the comment!
Thanks for the video.
I plan to use this pattern on a platform in the form of timeouts or reminders, but in our case, we need to change or remove them. In my opinion this could be required in many different scenarios. How does NServiceBus handle it?
You generally aren't removing a message with delayed delivery; however, you can set a 'Time To Be Received': after that time elapses, you don't care about processing that message.
NServiceBus is super expensive, but clearly a good tool and you use it a lot. Do you have any videos on tooling for message/async solutions?
I don't, but it's probably a good idea for a topic to cover messaging libraries.
Isn't this essentially just running the cron job every minute looking for messages instead of running it at greater intervals? I've used Hangfire for long-running processes, which it seems could be utilized for this.
If NServiceBus is out of the question, MassTransit supports delayed delivery (depending on the transport queue used) and, particularly interesting, has sagas, which are essentially designed as state machines. Super easy to implement scenarios like expirations that way.
@@nitrovent I have used it for a while, great tool, but I find it clunky to set up. Also, my ecosystem has multiple consumers/producers which I don't control, so keeping it simple is a must. The client would rather use NServiceBus, which has support and documentation.
As always: it depends. I see some major problems with this approach:
- What happens to future events if the domain logic that causes or handles these future events changes? I don't see a simple solution for this problem.
- The spikes will stay if the trigger that causes the future event is time itself (e.g. at midnight, calculate whether maintenance is due today)
I agree that all the handling in the domain should be event/command driven. But an event is usually defined as something that happened.
1) Depends on how far out you delay delivery. But it's about versioning strategies at its core to your point.
2) There is no command trigger, e.g. a midnight calculation.
3) Delayed delivery doesn't have to apply only to events. You could delay delivery for a command. But a timeout is an event.
@@CodeOpinion
1) Yes, see, you just need to introduce another new concept.
2) If something (like my example) is defined to happen at midnight, it has to happen at midnight (a new day started). A day transition is happening in the real world. You can't just argue this away.
@@pwn2own23 Another concept? If you're using any type of message- or event-driven architecture, you have to think about versioning. Is delayed delivery a solution to all of life's problems? No. If you need to run a batch job at midnight, then go ahead. The point of the video is that there are many situations where you don't actually need them, and there are other solutions like delayed delivery. Should you only use batch jobs? No. Should you only use delayed delivery? No. Be pragmatic.
@@CodeOpinion "Be pragmatic" is basically what I said in my very first sentence :) I wasn't arguing pro-batch everywhere. I just wanted to point out some disadvantages of the proposed approach, because I missed a discussion of them in the video. Your video definitely showed me something new; I'll add it to my toolbox and think about it in the future.
If handling changes, you need to decide whether you are going to support both strategies at the same time in the code, with versioned messages and handlers, or whether you will just always use the latest strategy, potentially creating an adapter handler that "converts" whatever the old message is into the new one.
The adapter and old handler implementation could live in the system while there are still old messages lying around, and then be decommissioned.
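A tiny sketch of that adapter idea, with entirely hypothetical message shapes and handler names (the V1 message lacks a field the V2 handler expects, so the adapter defaults it during conversion):

```python
# Versioned-message adapter sketch: only the latest handler contains real
# logic; the old handler is a thin shim that converts and delegates.

def handle_order_expired_v2(msg):
    """The one 'real' handler, written against the current (V2) shape."""
    return f"expiring {msg['order_id']} (reason: {msg['reason']})"

def adapt_v1_to_v2(v1_msg):
    """V1 messages had no 'reason' field; default it during conversion."""
    return {"order_id": v1_msg["order_id"], "reason": "unspecified"}

def handle_order_expired_v1(v1_msg):
    # Adapter handler: convert the old message, then delegate to V2.
    # Decommission this once no V1 messages remain in flight.
    return handle_order_expired_v2(adapt_v1_to_v2(v1_msg))

result = handle_order_expired_v1({"order_id": "o-42"})
```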
Awesome video thanks
Delayed SQS is max 15m; not sure how you would implement future messages (1 week / hours).
Dequeue and requeue. Obviously cost implications.
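The dequeue-and-requeue approach amounts to chaining delays: each hop re-enqueues the message with whatever delay remains, until it fits within SQS's 15-minute (900-second) per-message cap. A sketch of just the scheduling arithmetic, with no AWS calls:

```python
# Chaining SQS delays past the 900-second DelaySeconds cap: split a long
# total delay into per-hop delays; on each delivery, re-enqueue with the
# next hop's delay until none remain.

SQS_MAX_DELAY = 15 * 60  # 900 seconds, the SQS per-message maximum

def plan_hops(total_delay_seconds):
    """Return the list of per-hop delays covering the total delay."""
    hops = []
    remaining = total_delay_seconds
    while remaining > 0:
        hop = min(remaining, SQS_MAX_DELAY)
        hops.append(hop)
        remaining -= hop
    return hops
```

For a week-long delay this means 672 hops, which is where the cost implication comes from: every hop is a billed receive and send.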
In our company, we also use SQS but needed to delay messages for 24h. Our solution was to create a SQL-based queue mechanism with a worker which checks the table every 15 seconds. When we want to schedule a job 24h from now, we simply add a new row to the job table, and the worker then sends it to SQS at the requested time. It is not a silver-bullet solution, but it works great for us.
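A sketch of that SQL-backed scheduler, using SQLite and a list in place of SQS (table, column, and function names are made up; the real 15-second polling loop is collapsed into an explicit `worker_tick`):

```python
import sqlite3

# SQL-table scheduler sketch: rows carry a due time; a worker polls for
# due rows and forwards them to the real queue (a list here, SQS in the
# comment's setup).

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE scheduled_jobs ("
           "id INTEGER PRIMARY KEY, payload TEXT, due_at REAL, sent INTEGER DEFAULT 0)")

def schedule(payload, delay_seconds, now):
    db.execute("INSERT INTO scheduled_jobs (payload, due_at) VALUES (?, ?)",
               (payload, now + delay_seconds))
    db.commit()

def worker_tick(now, send):
    """One poll cycle: forward everything due, mark it sent."""
    due = db.execute(
        "SELECT id, payload FROM scheduled_jobs WHERE sent = 0 AND due_at <= ?",
        (now,)).fetchall()
    for job_id, payload in due:
        send(payload)                                 # hand off to SQS
        db.execute("UPDATE scheduled_jobs SET sent = 1 WHERE id = ?", (job_id,))
    db.commit()
    return len(due)

forwarded = []
schedule("expire-order-1", delay_seconds=86400, now=0)
worker_tick(now=100, send=forwarded.append)       # not due yet: nothing sent
worker_tick(now=86400, send=forwarded.append)     # 24h later: forwarded
```

Compared to chaining short SQS delays, this trades broker round-trips for a small polling query, which is why it can work out cheaper for long delays.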
AWS Step functions as well. But also expensive…
@@CodeOpinion well, that defeats the purpose of your video, I guess. Dealing with failures and state in the dequeueing and requeuing… maybe it's easier and better to use jobs.
I don't think so at all. If you're using a messaging library, this is abstracted from you.
TLDR
You can model the lifecycle of a business process that includes events in the future using timeouts.
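The TLDR in one small sketch: the timeout is just another event the business process handles, and its effect depends on the state the process is in by the time it arrives (class and event names here are illustrative):

```python
# Lifecycle-with-timeout sketch: a reservation either completes before its
# timeout fires (timeout becomes a no-op) or the timeout expires it.

class OrderReservation:
    def __init__(self):
        self.state = "reserved"

    def handle(self, event):
        if event == "PaymentReceived" and self.state == "reserved":
            self.state = "completed"
        elif event == "ReservationTimeout" and self.state == "reserved":
            self.state = "expired"   # timeout fired before payment arrived
        # a timeout arriving after completion is simply ignored

paid = OrderReservation()
paid.handle("PaymentReceived")
paid.handle("ReservationTimeout")   # no-op: already completed

abandoned = OrderReservation()
abandoned.handle("ReservationTimeout")
```

This is what makes delayed delivery a replacement for the batch job: the scheduled timeout message drives the same state transition the nightly sweep would have, but per order.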