I Scaled My Transactional Outbox to 2B+ messages/day. Here's how

  • Published on Dec 3, 2024

Comments • 58

  • @MilanJovanovicTech
    @MilanJovanovicTech  4 days ago +4

    Get the source code for this video for FREE → the-dotnet-weekly.kit.com/outbox-scaling
    P.S. In the "Enabling RabbitMQ batch publish" section: ConfigureBatchPublish was deprecated with MassTransit.RabbitMQ v8.3.2. This version uses the new RabbitMQ.Client v7, which was rewritten to use the TPL and async/await. You will see a similar performance improvement just from upgrading to this version.

  • @GabrielRibeiro-of5mn
    @GabrielRibeiro-of5mn 4 days ago +33

    Currently, you are the best content creator for .NET. Your content perfectly addresses our daily needs and the challenges we face in the projects we're working on. Thank you so much for your dedication to delivering extremely high-quality content. I hope you never stop teaching us.

    • @dy0mber847
      @dy0mber847 4 days ago +3

      100% agree. Instead of doing clickbait shit, Milan delivers solutions and tools for the wide range of problems .NET devs face.

    • @MilanJovanovicTech
      @MilanJovanovicTech  3 days ago +2

      Wow, thank you! And challenge accepted for future content 😁💪

  • @eddypartey1075
    @eddypartey1075 4 days ago +7

    High tier content, much appreciated!

  • @chanakachathuarngaathapath7282
    @chanakachathuarngaathapath7282 4 days ago +2

    Thank you so much, Milan! This is fantastic content. I'm currently working on a similar implementation, so this has been incredibly helpful.

  • @margosdesarian
    @margosdesarian 16 hours ago +1

    Milan, I find it hard to keep up. Is there any way you could slow down a little bit? It would improve the overall experience, I am sure. It's a really great video, btw! I would like to see more videos that show complicated stuff that uses enterprise best practices, like this one.

    • @MilanJovanovicTech
      @MilanJovanovicTech  5 hours ago

      Just my style of explaining 🤷‍♂️

  • @namtrg
    @namtrg 3 days ago

    You're awesome as always, keep it up, Milan!!

  • @Great_Critic
    @Great_Critic 3 days ago

    Awesome video. We would definitely like to see more examples of such brilliant architectural decisions, especially ones you've implemented on real projects, where it really matters. Thank you!
    By the way, how far can we increase the number of messages queried per batch and the degree of parallelism? Should we go by CPU load?

    • @MilanJovanovicTech
      @MilanJovanovicTech  2 days ago

      Yep, always refer to the available resources. The optimized outbox on a single worker may be enough for most applications.

  • @robertogonzalorodriguez845
    @robertogonzalorodriguez845 4 days ago

    Awesome! I definitely want more videos like this.
    I have a question regarding the index created at 10:52. Perhaps it is a micro-optimization, but I think you could exclude the processed_on_utc field: since the index already has that WHERE clause, there is no need for this field to be included. Thanks again for these great tips.
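
The suggestion above can be tried out in a small sketch. It assumes a hypothetical `outbox_messages` table and uses SQLite (which also supports partial indexes) as a stand-in for the PostgreSQL database from the video; the index name and the sample rows are made up for illustration. The partial index keys only on `occurred_on_utc`, because the `WHERE processed_on_utc IS NULL` predicate already restricts it to unprocessed rows.

```python
import sqlite3

# Hypothetical query: oldest unprocessed messages first.
FETCH_SQL = (
    "SELECT id FROM outbox_messages "
    "WHERE processed_on_utc IS NULL "
    "ORDER BY occurred_on_utc LIMIT ?"
)


def build_demo_db():
    """Create an in-memory table with a partial index and ten sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE outbox_messages ("
        " id INTEGER PRIMARY KEY,"
        " occurred_on_utc TEXT NOT NULL,"
        " processed_on_utc TEXT)"
    )
    # Partial index: only rows that are still unprocessed are indexed,
    # so processed_on_utc does not need to be a key column.
    conn.execute(
        "CREATE INDEX idx_outbox_unprocessed"
        " ON outbox_messages (occurred_on_utc)"
        " WHERE processed_on_utc IS NULL"
    )
    rows = [
        (i, f"2024-12-03T00:00:{i:02d}Z",
         "2024-12-03T01:00:00Z" if i % 2 == 0 else None)  # even ids are processed
        for i in range(10)
    ]
    conn.executemany("INSERT INTO outbox_messages VALUES (?, ?, ?)", rows)
    return conn


def fetch_unprocessed(conn, limit):
    """Return the ids of the oldest unprocessed messages."""
    return [row[0] for row in conn.execute(FETCH_SQL, (limit,))]
```

Running `EXPLAIN QUERY PLAN` on the query is one way to check whether the planner actually picks up the partial index.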

  • @hjalmarhengstmann7686
    @hjalmarhengstmann7686 4 days ago +1

    Great set of tips Milan. Thank you for sharing. Please come over to Bluesky. Very vibrant and growing tech community there 😉

    • @MilanJovanovicTech
      @MilanJovanovicTech  3 days ago +1

      I doubt I can handle one more social media platform

    • @hjalmarhengstmann7686
      @hjalmarhengstmann7686 3 days ago

      @MilanJovanovicTech I can relate, and sorry for injecting a comment unrelated to your video. I'll just keep following you here and remain a Patreon supporter. Thanks again for your great content.

  • @TheBekker_
    @TheBekker_ 4 days ago

    This was very useful and insightful, thank you :)

  • @AlwaysHCYT2
    @AlwaysHCYT2 2 days ago +1

    The set of messages to read and update is fixed, so the DB can just load the index into memory and serve the required data really fast. But in a real system the DB will be busy updating the table and the index, and retrieval will be considerably slower than in the "static" example.
    Indexes are really good when the number of reads is far greater than the number of writes. When it is about 50/50, you can get unexpected results.

    • @MilanJovanovicTech
      @MilanJovanovicTech  2 days ago

      Well it's not so static - 5 workers query and update the table at the same time (each update also updates the index).

    • @daveanderson8348
      @daveanderson8348 a day ago

      @MilanJovanovicTech Perhaps he means that no new records are added to the table while you are reading and updating it.

    • @AlwaysHCYT2
      @AlwaysHCYT2 a day ago

      @daveanderson8348 @MilanJovanovicTech Yes, the index is built over occurred_on_utc, processed_on_utc. The update phase changes only the second field (processed_on_utc). In a real system, where multiple inserts into the table happen concurrently with selects and updates, the index will become the bottleneck because of fragmentation and/or because it becomes unbalanced.
      I'd like to see the same benchmark executed while another background process fills the table with tons of rows.

    • @AlwaysHCYT2
      @AlwaysHCYT2 an hour ago

      Yes, @MilanJovanovicTech. The DB doesn't need to modify the index structure, only a value inside the index itself.
      I'd like to see a benchmark where a background service adds 2 million or more rows while the existing one reads and updates the same table.

  • @abomalek8
    @abomalek8 4 days ago

    Thanks for sharing this.
    I have a question about 1:02. If we get an exception, for example while publishing the outbox message to RabbitMQ, I believe we should retry it again later. However, you said we won't be processing it any more. Why is that?

    • @MilanJovanovicTech
      @MilanJovanovicTech  3 days ago

      In that request it failed. Yes, we can retry publishing it there directly but what should we do when retries fail?
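
One possible answer to that question, as a sketch rather than the approach from the video: retry a bounded number of times in place, then give up and keep the last error so it can be stored alongside the outbox row instead of being retried forever. `publish_with_retries`, `publish`, and `max_attempts` are hypothetical names, and a real worker would likely add a delay between attempts.

```python
def publish_with_retries(message, publish, max_attempts=3):
    """Try to publish a message a bounded number of times.

    `publish` is any callable that raises on failure (a hypothetical
    stand-in for the broker client). Returns None on success, or a
    description of the last error once all attempts are used up.
    """
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            publish(message)
            return None  # published successfully, nothing to record
        except Exception as exc:
            # Remember the failure and try again until attempts run out.
            last_error = f"attempt {attempt}: {exc}"
    return last_error
```

The returned error string could then be written to the outbox row together with the processed timestamp, so the message is not picked up again and the failure stays visible for later inspection.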

  • @tjagusz
    @tjagusz 6 hours ago

    Awesome content. How does a custom outbox in Dapper compare to the Entity Framework one (the one built into MassTransit)? I often get database problems with it.
    Maybe some performance tips on the Outbox using EF in the next video? 😊

    • @MilanJovanovicTech
      @MilanJovanovicTech  5 hours ago

      Will have to spend some time exploring the MT Outbox first

  • @nove1398
    @nove1398 4 days ago

    I was applying some of these already, but I picked up a few tips to enhance my process even further 🎉

  • @guilhermeloyola
    @guilhermeloyola 13 hours ago

    Hey, Milan, thank you for creating this type of content! I have a question regarding batch size in an outbox pattern. I'm concerned about concurrency issues. If I have multiple workers running simultaneously, could this cause duplicate items in the database when they process the same batch of outbox messages? How would you recommend handling this scenario?

    • @MilanJovanovicTech
      @MilanJovanovicTech  5 hours ago

      No - FOR UPDATE SKIP LOCKED solves that
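
A minimal sketch of how `FOR UPDATE SKIP LOCKED` keeps workers from claiming the same rows, assuming PostgreSQL and a DB-API style connection (for example from psycopg). The table name `outbox_messages` and the `id`, `type`, and `content` columns are assumptions; `occurred_on_utc` and `processed_on_utc` are the columns mentioned elsewhere in this thread.

```python
# Each worker runs this inside its own transaction. Rows locked by one
# worker are skipped by the others instead of being waited on, so two
# workers never receive the same batch.
CLAIM_SQL = """
SELECT id, type, content
FROM outbox_messages
WHERE processed_on_utc IS NULL
ORDER BY occurred_on_utc
LIMIT %s
FOR UPDATE SKIP LOCKED
"""


def claim_batch(conn, batch_size):
    """Lock and return up to `batch_size` unprocessed outbox rows.

    The row locks are held until the surrounding transaction commits or
    rolls back, so the caller should publish the messages, mark them as
    processed, and commit within the same transaction.
    """
    cur = conn.cursor()
    cur.execute(CLAIM_SQL, (batch_size,))
    return cur.fetchall()
```

Because the locks only live as long as the transaction, a worker that crashes mid-batch releases its rows automatically and another worker can pick them up.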

  • @Maxim.Shiryaev
    @Maxim.Shiryaev 4 days ago +1

    And if you turn off publishing confirmation, then you can probably publish and update the database in parallel.

  • @UmerFarooq-w1x
    @UmerFarooq-w1x 3 days ago

    Can you please create a book covering all these topics? I have gone through a lot of material, but I find your videos and way of explaining full of valuable and concise information. Thanks for all you do 🙏

    • @MilanJovanovicTech
      @MilanJovanovicTech  3 days ago

      I'm not planning on writing a book any time soon

  • @mymemoryleaks
    @mymemoryleaks 4 days ago

    Thanks, Milan. Do you think TPL Dataflow might have helped increase performance even more?

  • @deceptionsinner2875
    @deceptionsinner2875 2 days ago

    Very helpful

  • @peterk4694
    @peterk4694 3 days ago

    Is there a good reason not to use IAsyncEnumerable to process the outbox query?

    • @MilanJovanovicTech
      @MilanJovanovicTech  2 days ago

      Which query exactly? The one returned by Dapper?

  • @haraheiquedossantos4283
    @haraheiquedossantos4283 3 days ago +1

    Good video. Congrats.
    What about showing how to process those millions and billions of messages on the consuming side? 😂

    • @MilanJovanovicTech
      @MilanJovanovicTech  2 days ago +1

      My machine would crash 😅 Typically, the consumers would be a separate set of servers

    • @haraheiquedossantos4283
      @haraheiquedossantos4283 2 days ago

      @MilanJovanovicTech fair enough 😂😂😂

  • @sunnypatel1045
    @sunnypatel1045 2 days ago

    Would it be possible to show the in-memory outbox that MassTransit uses?

  • @MegaMage79
    @MegaMage79 4 days ago

    You could optimize the DB update even further by using some clever UNNEST tricks with array parameters, so that the DB doesn't have to parse the huge string.

    • @MilanJovanovicTech
      @MilanJovanovicTech  3 days ago

      My SQL skills only go so far 😅

    • @janjoska2549
      @janjoska2549 2 days ago

      Would using a stored procedure save some time by caching the execution plan?
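
The UNNEST idea from this thread could look roughly like the sketch below. It assumes PostgreSQL, a hypothetical `outbox_messages` table with `id` (uuid), `processed_on_utc`, and `error` columns, and a driver that adapts Python lists to PostgreSQL arrays (psycopg does this). The statement text stays the same for any batch size, because the whole batch travels as three array parameters instead of one long generated string.

```python
# One fixed statement per batch: the arrays are expanded server-side
# into rows (id, processed_on_utc, error) and joined to the table.
UPDATE_SQL = """
UPDATE outbox_messages AS o
SET processed_on_utc = v.processed_on_utc,
    error = v.error
FROM unnest(%s::uuid[], %s::timestamptz[], %s::text[])
     AS v(id, processed_on_utc, error)
WHERE o.id = v.id
"""


def mark_processed(conn, results):
    """Mark a batch of outbox rows as processed in a single round trip.

    `results` is a list of (id, processed_on_utc, error) tuples, where
    `error` is None for messages that were published successfully.
    Returns the number of rows submitted.
    """
    if not results:
        return 0  # nothing to update, skip the round trip
    ids = [r[0] for r in results]
    processed = [r[1] for r in results]
    errors = [r[2] for r in results]
    cur = conn.cursor()
    cur.execute(UPDATE_SQL, (ids, processed, errors))
    return len(results)
```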

  • @iliyan-kulishev
    @iliyan-kulishev 3 days ago +1

    Doesn't the order of the messages matter? I'm surprised we can afford to process them in parallel.

    • @MilanJovanovicTech
      @MilanJovanovicTech  3 days ago

      Message ordering is never guaranteed anyhow - with most brokers. There are some that support FIFO queues.
      You can solve this on the consumer by buffering incoming messages, and then processing them in order. And even this is susceptible to problems. One retry on the producer, and you may get out-of-order messages.
      That's why it's best not to depend on in-order processing. And if you do need in-order processing, then you'll probably want to model that as a Saga or similar.
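
The consumer-side buffering mentioned above can be sketched as a small resequencer. This assumes every message carries a gap-free sequence number, which is an assumption made for illustration and not something the video describes; `Resequencer` and `accept` are hypothetical names.

```python
class Resequencer:
    """Buffer out-of-order messages and release them in sequence order."""

    def __init__(self, first_seq=0):
        self._next = first_seq  # sequence number we are waiting for
        self._pending = {}      # seq -> message, held until its turn

    def accept(self, seq, message):
        """Take one incoming message.

        Returns the list of messages that can now be processed in order,
        which is empty while an earlier message is still missing.
        """
        if seq < self._next or seq in self._pending:
            return []  # duplicate delivery (e.g. a producer retry), drop it
        self._pending[seq] = message
        ready = []
        # Drain every consecutive message starting at the expected number.
        while self._next in self._pending:
            ready.append(self._pending.pop(self._next))
            self._next += 1
        return ready
```

The weakness pointed out in the reply shows up here as well: if one sequence number never arrives, everything behind it stays buffered, so a real consumer would also need a timeout or a dead-letter policy.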

  • @sunzhang-d9v
    @sunzhang-d9v 2 days ago

    The average consumer probably won't be able to process this much data. How would you build a set of consumers that can handle it, and what are the recommendations for the database, etc.?

  • @phw1009
    @phw1009 a day ago

    Murderer!! You are killing it (the outbox)!