Event Sourcing do's and don'ts

  • Premiered Oct 18, 2024

Comments • 60

  • @AlexandreCassagne
    @AlexandreCassagne 2 years ago +42

    You should really create a course. Like maybe a few hours of theory, then a project involving several micro services with event sourcing. I’d make my company pay for a seat for me anyway…

  • @leopoldodonnell1979
    @leopoldodonnell1979 2 years ago +3

    Derek, you nailed it - these are the issues that we see each time a new team picks up Event Sourcing. Great summary!

  • @RM-bg5cd
    @RM-bg5cd 2 years ago +1

    I loved the name for integration events. We use domain events for both purposes, even w/o event sourcing, and we actually had meetings where we tried to find a good name for them. Service events were one of the candidates. Integration events is a great name too; I'll remember it.

  • @staan.b
    @staan.b 1 year ago

    I love this channel, thanks for being here!

  • @cristianpallares3847
    @cristianpallares3847 2 years ago +4

    CodeOpinion is my favourite Event Sourcing series 😍

  • @lolop2576
    @lolop2576 2 years ago +1

    Hey Derek,
    I have a use case at my work where I think event sourcing makes a lot of sense: Purchase Orders. Basically, people can create purchase orders, and we want a list of all the transactions that have been applied against each Purchase Order.
    The problem I have is that it feels like there are 2 parts: the first is when they are just creating the Purchase Order and all its lines (CRUD), and the next is when they are processing different actions on the PO, such as receipting/invoicing (Event Sourcing).
    Our application is a 'generic' solution for customers, and if we made every field require an action while creating it, our customers would be up in arms (there are ~10-20 fields per PO line and ~20-30 fields on the PO itself). Once the PO is finalized this is no longer a problem. Is it possible to make a system that is CRUD in nature until a certain point and then cross over into event sourcing later (e.g. once the PO is finalized)?
    I don't really like that idea, so I'd lean towards the 'PurchaseOrderUpdated' style events, but I also don't like that for the reasons you've said. Potentially we have a design issue, but I'm not really sure how we could change it.
    Just wondering if maybe this use case isn't great for Event Sourcing, or if you have any ideas on how to manage/design this.

    • @CodeOpinion
      @CodeOpinion  2 years ago +1

      Being in a type of "Draft" mode is pretty common before the workflow really gets underway, mainly because you're not yet in a valid state. I believe I mention that in this video: th-cam.com/video/JZetlRXdYeI/w-d-xo.html

    • @lolop2576
      @lolop2576 2 years ago

      ​@@CodeOpinion I think I had watched that video before, but it didn't really click for me. I think it has now, thank you!
      It would be cool to see a video showing how you handle/manage that transition in a bit more detail, code-wise, if you don't already have that planned.

  • @ZachBugay
    @ZachBugay 2 years ago

    Incredible video. Thank you so much for sharing this.

  • @havokgames8297
    @havokgames8297 7 months ago

    In your CRUD events example, how would you represent the remaining CRUD fields, name and description? Would these be okay as "ProductDetailsUpdated", for example, as opposed to needing to come up with a business scenario where they would change?

  • @tmati7860
    @tmati7860 10 months ago

    Thanks for your awesome tutorial. I'm new to Event Sourcing. Is it a best practice to use Event Sourcing for a post's likes/unlikes on a social media website? I mean considering performance, concurrency, and data integrity.

    • @CodeOpinion
      @CodeOpinion  10 months ago

      You often want to build an event stream so you can re-play all the events to get to current state. Do you need to replay all the current events for "like" or do you simply want a log with a current like count?
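
      The distinction in this reply can be sketched in a few lines. This is a hypothetical illustration (event shapes invented for the example), not code from the video:

```python
# Event sourcing means rebuilding current state by replaying the stream.
# For "likes", ask whether you ever need the individual facts, or just a count.
def rebuild_like_count(events):
    """Fold the event stream into the current like count."""
    count = 0
    for event in events:
        if event["type"] == "PostLiked":
            count += 1
        elif event["type"] == "PostUnliked":
            count -= 1
    return count

stream = [
    {"type": "PostLiked", "user": "a"},
    {"type": "PostLiked", "user": "b"},
    {"type": "PostUnliked", "user": "a"},
]
print(rebuild_like_count(stream))  # 1
```

      If all you ever need is the number, a plain counter (or an append-only log feeding one) is simpler; replaying a stream only pays off when past events drive behaviour or invariants.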

    • @tmati7860
      @tmati7860 10 months ago

      @@CodeOpinion I'm not sure whether I need to re-play all events or not. Honestly, I'm new to Event Sourcing and I'm looking for the best practice to implement a like/unlike feature considering performance, concurrency, and data integrity.

  • @axelbrinck_
    @axelbrinck_ 2 years ago +1

    Great videos! I have a question: does anybody know how to approach (for example) GetAllCustomers? Do I need to iterate over all the streams in my EventStore and check whether each one is a Customer? That sounds super inefficient to me... is there another way? Thanks!

    • @CodeOpinion
      @CodeOpinion  2 years ago +1

      Projections. Check out this video: th-cam.com/video/bTRjO6JK4Ws/w-d-xo.html
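
      As a rough, hypothetical sketch of what a projection does (not the video's implementation): events are folded into a read model as they are appended, so GetAllCustomers becomes a lookup instead of a scan over every stream.

```python
# A projection subscribes to events and maintains a denormalized read model.
class AllCustomersProjection:
    def __init__(self):
        self.customers = {}  # customer_id -> current view

    def apply(self, event):
        # Invented event names for illustration.
        if event["type"] == "CustomerRegistered":
            self.customers[event["id"]] = {"id": event["id"], "name": event["name"]}
        elif event["type"] == "CustomerRenamed":
            self.customers[event["id"]]["name"] = event["name"]

    def get_all(self):
        # GetAllCustomers: no stream iteration, just read the model.
        return list(self.customers.values())

projection = AllCustomersProjection()
for e in [
    {"type": "CustomerRegistered", "id": "c1", "name": "Ada"},
    {"type": "CustomerRegistered", "id": "c2", "name": "Bob"},
    {"type": "CustomerRenamed", "id": "c1", "name": "Ada L."},
]:
    projection.apply(e)
print(projection.get_all())
```

      In practice the projection would be fed by the event store's subscription mechanism and persist its model to a query-optimized store.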

    • @axelbrinck_
      @axelbrinck_ 2 years ago

      @@CodeOpinion Thank you!!

  • @jaiderariza1292
    @jaiderariza1292 2 years ago +1

    I have a question about implementing this if we need to do some external transactions, like a bank withdrawal, sending an email, etc. These are actions that should be executed exactly once, so I wonder how people handle this? Maybe they're not part of an event? Or do you check a flag or something to verify it's the first time?

    • @CodeOpinion
      @CodeOpinion  2 years ago

      Example of sending an email would likely be a consumer of that event or an integration event depending on where that email consumer resides.

  • @ernest1520
    @ernest1520 2 years ago

    While the last point makes sense in some cases, in my experience using the same domain events for consumption by other services can be, and often is, the more pragmatic approach. It's a little like the snapshots example: integration events are useful, but use them when you actually need them. To be clear, I mean consuming those events from different services that contribute to the same broader system. Integration events should definitely be used for communication between different systems.

    • @CodeOpinion
      @CodeOpinion  2 years ago +1

      I do think there are situations where exposing domain events to other boundaries can work, mainly when they represent pretty stable business concepts, because those aren't likely to change. I created a video about this: th-cam.com/video/53GsiAcKm9k/w-d-xo.html

  • @TomasJansson
    @TomasJansson 2 years ago

    Another great video. Is "crud-sourcing" that bad? My point of view is that it can be fine if you're building up your "reference data" with it. Everything that is part of the "process" of the application definitely needs more business-specific events. Concrete example: if I have a system that a user interacts with, and what happens during the interaction is based on a set of rules, then I think "crud-sourcing" is a valid option when creating the rules used in the interaction, but the actual user interactions should use proper event sourcing.

    • @CodeOpinion
      @CodeOpinion  2 years ago

      I wouldn't call it "bad". What I'm really referring to is that being the sole and primary way of thinking, rather than focusing on actual business concepts. That's where the value lies, IMO.
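
      The contrast being discussed can be illustrated with hypothetical events (all names invented for the example):

```python
# CRUD-sourcing: one generic event; the business intent must be inferred
# by diffing old and new state.
crud_event = {"type": "ProductUpdated", "price": 12.99, "quantity": 40}

# Business events: the reason for the change is explicit in the event itself.
business_events = [
    {"type": "PriceIncreased", "new_price": 12.99, "reason": "supplier cost"},
    {"type": "InventoryAdjusted", "delta": -10, "reason": "damaged stock"},
]

# A consumer can react to intent directly, with no state diffing.
def alert_on_price_increase(event):
    if event["type"] == "PriceIncreased":
        return f"Price raised to {event['new_price']} ({event['reason']})"
    return None

print(alert_on_price_increase(business_events[0]))
```

      The generic `ProductUpdated` event can still be replayed into state, but it records *that* something changed, not *why*, which is where much of the value of event sourcing lies.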

    • @TomasJansson
      @TomasJansson 2 years ago

      @@CodeOpinion Can't argue there, but it's also important to identify which events and data really matter. For "reference data", having proper events might not be important, since sometimes there aren't any significant business concepts associated with it.
      Another thing I've noticed is that sometimes it actually makes sense to mix some CRUD-like events with business-like events for that type of data.
      Again, great video, as most of them are!

  • @tony-ma
    @tony-ma 2 years ago +1

    Hi Derek, if I build a social media application and use an event store for posts, each comment on a post would be an event on that post, and there might be hundreds or thousands of comments. Should I consider using a snapshot? Also, when people reply to a comment, I need to check that the comment ID is valid by keeping all the comment IDs in the final state of the aggregate for the post's event stream. Doing it this way, the snapshot will be way too big if there are too many replies on one post. How do I resolve this?

    • @CodeOpinion
      @CodeOpinion  2 years ago +1

      Not entirely sure this is a good use-case. I'm not sure I see any value in recording comments as events to an event stream. Check out this video where I talk about event streams being guided by invariants: th-cam.com/video/C5TmLMZ4fXo/w-d-xo.html

    • @tony-ma
      @tony-ma 2 years ago

      @@CodeOpinion Thanks for your reply. Here is my use case: when a user replies to a comment, we need to check that the comment is valid (e.g. a user should not reply to a comment that is deleted or doesn't exist). Therefore we need to capture the comment event in the stream and keep the valid comment IDs in the aggregate.

  • @ZachBugay
    @ZachBugay 2 years ago

    Do you have another video that shows how we could perform changes on records in an event store? For example, let's say Quantity (in your video) changes from an integer to a float (for some reason).
    How would you handle that exactly?

    • @CodeOpinion
      @CodeOpinion  2 years ago +1

      I don't have a video for that, but the gist would be about how to handle versioning. I'll try to cover it in a future video.

  • @RaveKev
    @RaveKev 1 year ago

    Do you have a stream for each Product? Or is there one ProductsStream where all the different Product aggregates are stored?

    • @CodeOpinion
      @CodeOpinion  1 year ago +1

      Not sure if you're talking about this video or another one, but in event sourcing each individual product would likely have its own stream. That's making some assumptions about what you're really asking, which might be related to another video I posted about inventory.
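
      A stream-per-aggregate layout might look like this minimal, hypothetical sketch (an in-memory dict standing in for a real event store, with invented event names):

```python
# Each aggregate gets its own stream, conventionally keyed by identity,
# e.g. "product-{id}", rather than one global "products" stream.
store = {}  # stream_id -> ordered list of events

def append(stream_id, event):
    store.setdefault(stream_id, []).append(event)

def read(stream_id):
    return store.get(stream_id, [])

append("product-42", {"type": "ProductAdded", "sku": "ABC"})
append("product-42", {"type": "PriceIncreased", "new_price": 9.99})
append("product-7", {"type": "ProductAdded", "sku": "XYZ"})

# Rebuilding product 42 only replays its own small stream.
print(len(read("product-42")))  # 2
```

      Small per-aggregate streams are what keep rebuild times short, which is also why snapshots are often unnecessary.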

    • @RaveKev
      @RaveKev 1 year ago

      @@CodeOpinion Thank you. I use Axon Framework in some private applications and only have one stream where I put all my aggregates, events, and commands. I never thought about creating a stream for each aggregate.

  • @thedacian123
    @thedacian123 2 years ago

    Hello. So an event store DB, as far as I understood, is not only a database but also a message broker, because I can subscribe to it, right?

    • @CodeOpinion
      @CodeOpinion  2 years ago +1

      It has different subscription models you can use, yes. However I'd keep those subscriptions to consumers within the same logical boundary.

  • @robwatson826
    @robwatson826 2 years ago

    Very interesting as always.
    I think I've got a gap in my knowledge somewhere: in your optimization example using stock adjustments, what if a product was being sold and adjusted hundreds of times per day? Or if the entire system was responsible for 100,000 products, all having stock adjustments daily? How do you maintain the speed of "recalculating" the current stock level from the stream?

    • @adambickford8720
      @adambickford8720 2 years ago

      This is where "partitioning" can come into play. It may well be that one node can handle hundreds or thousands of accounts on average, but that one celebrity account alone takes 10 nodes. If there were a simple answer, nobody would be looking to hire you :)
      Think that sounds tough? Wait until you have to do it while mitigating the damage from the previous wrong guess!

    • @everydreamai
      @everydreamai 2 years ago

      The point is exactly for high-throughput systems, because your write and read models need to be separate to keep up with the data flow. Just overwriting a single value in any store doesn't scale, because you run into physical limits on overwriting that single value. To do this, Derek shows an implementation of CQRS, which is shown in the video as the separate DocumentDB "read" store where any other service external to this domain would read from. You wouldn't have another service (Order service, Reporting service, etc.) read directly from that event stream; it is only written to internally to the domain boundary (the "service") by its own consumers, and you don't want to expose the broker like that. That boundary is shown by the dotted line in the diagram.
      The "read" store is only written to periodically, not with every update. So that is your first bottleneck, alleviated by using CQRS. You might update every second or every N messages, or look at the offsets of the incoming messages and the system time to approximate flow rate, or just let the API of the "read" store limit the flow, etc.
      The broker can keep up because it is basically an append-only DB. Logically the command is updating a single value, which is a bottleneck if left to a single physical store. Appends can scale by spreading writes across physical disks and computers; it's easy enough to load-balance all those writes to N disks or N machines with how a broker works. But you don't need to worry too much about that, since that's the broker's job; you don't need to write that logic, only configure it, e.g. give it a certain number of physical nodes and disks, or use a SaaS broker like Azure EventHub or AWS SQS.
      Your read store isn't going to update every time the value changes in the incoming stream, and that's why it is "eventual consistency" and not like an ACID CRUD operation with a LOCK to update the value.
      At some point the consumers must scale horizontally too, the next bottleneck, but that's probably enough for now. Given your statement, different "products" could simply be different topics on the broker, so the consumers are physically independent of one another as the messages flow, and adding more products has no negative impact on scaling. Each product has its own partition in DocumentDB as well, so they are separate physical values to be written to, using the same scheme described above. This would scale to the moon simply by making sure your broker has enough hardware and network to keep up with the total message flow. The proper type of broker for such a thing, like Kafka, has subscription and write flows that keep reads and writes independent of one another; again, everything is written by append, etc., so you can simply add nodes as you increase total message rates. Derek has more videos on how a lot of the above works.
      Hopefully that helps.

    • @robwatson826
      @robwatson826 2 years ago

      @@everydreamai That's a fantastic answer, thank you very much. All the best.

  • @pierre-antoineguillaume98
    @pierre-antoineguillaume98 1 year ago

    How do you handle GDPR requests when event sourcing?
    I mean, I wouldn't event-source every aggregate, so maybe identity data is not something one should event-source, and then it's a non-issue.

    • @gerardklijs9277
      @gerardklijs9277 1 year ago

      You can encrypt any field containing personal data. And throw away the key when it needs to be forgotten.
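
      This technique is often called crypto-shredding. A toy sketch of the idea (the XOR cipher and event shapes are invented stand-ins; a real system would use a vetted encryption library):

```python
import secrets

# Per-subject keys: delete the key and the personal data in the
# immutable event stream becomes unrecoverable, without rewriting events.
keys = {}  # subject_id -> key

def _xor(data: bytes, key: bytes) -> bytes:
    # Toy symmetric cipher for illustration only; not real cryptography.
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

def encrypt_field(subject_id, value: str) -> bytes:
    key = keys.setdefault(subject_id, secrets.token_bytes(16))
    return _xor(value.encode(), key)

def decrypt_field(subject_id, blob: bytes) -> str:
    key = keys.get(subject_id)
    if key is None:
        return "<forgotten>"  # key shredded: data is gone for good
    return _xor(blob, key).decode()

event = {"type": "CustomerRegistered", "id": "c1",
         "name": encrypt_field("c1", "Ada Lovelace")}
print(decrypt_field("c1", event["name"]))  # Ada Lovelace
del keys["c1"]  # GDPR erasure: shred the key, keep the event
print(decrypt_field("c1", event["name"]))  # <forgotten>
```

      The event stream stays append-only and replayable; only the ability to read the personal fields is destroyed.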

  • @PiesekLeszek90
    @PiesekLeszek90 2 years ago +1

    I like the video, but I have a question about snapshots. In the video you say most of the time there's no reason to make them, but I'm not sure there's any reason *not* to make them. For me, waiting to optimize until your API gets too slow is bad practice. Besides that, if the app is big enough there will be performance problems even with otherwise small streams, e.g. displaying the amount of every product in a big list for UX reasons. If you get enough users doing that at once across different instances, it might be a better trade to spend storage on snapshots than compute power on recalculating the states, since a snapshot is, in a way, a cache. And even besides that, by making snapshots you give up space to get compute power back; in a system with event sourcing you clearly don't worry about storage space, so I don't see a reason not to use snapshots.

    • @robwatson826
      @robwatson826 2 years ago

      I had a similar question after watching this. Although premature optimization is usually unnecessary unless you really know you're going to need it, I do feel like automatically adding snapshots to the system makes a lot of sense.

    • @CodeOpinion
      @CodeOpinion  2 years ago

      "For me waiting with optimization until your api gets too slow is a bad practice." If you create snapshots from the start, at what point do you snapshot? 10 events? 100? 500? If you have metrics around how long it takes to rebuild state from an event stream, and you define some type of SLA on that, let's say 100ms, then once you start reaching that point, implement snapshotting on that instance/aggregate. Each event stream will have a different threshold based on the size of the events and the actual compute needed to handle them when rebuilding state. There isn't a one-size-fits-all; it's on a stream-per-stream basis.
      As a different way of looking at it: if you have a relational database and define indexes on a set of tables, those indexes are only relevant/useful depending on the queries you perform and the amount of data. Smaller amounts of data will make the indexes have a different impact than larger amounts. In other words, you don't really know what indexes you need until you write queries, and as data grows those indexes need to be revisited. The same applies to event streams and snapshots.
      Lastly, the point I was making in the video is that snapshots aren't the be-all and end-all, and there are other ways to approach this. The first, as mentioned, is to keep streams small. It's about modelling them that way.
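
      The threshold idea described above can be sketched roughly like this (hypothetical numbers and event shapes; a real system would key the threshold off rebuild-time metrics per stream):

```python
# Rebuild from the latest snapshot plus the events appended after it,
# and only take a new snapshot once the stream's tail passes a threshold.
SNAPSHOT_THRESHOLD = 100  # tuned per stream, e.g. against a rebuild-time SLA

def rebuild(snapshot, events_after):
    """Fold events appended after the snapshot onto the snapshotted state."""
    state = dict(snapshot["state"]) if snapshot else {"quantity": 0}
    for e in events_after:
        if e["type"] == "InventoryAdjusted":
            state["quantity"] += e["delta"]
    return state

def maybe_snapshot(stream, snapshot):
    start = snapshot["version"] if snapshot else 0
    tail = stream[start:]
    if len(tail) >= SNAPSHOT_THRESHOLD:
        return {"version": len(stream), "state": rebuild(snapshot, tail)}
    return snapshot  # tail still small enough to replay cheaply

stream = [{"type": "InventoryAdjusted", "delta": 1} for _ in range(250)]
snap = maybe_snapshot(stream, None)
state = rebuild(snap, stream[snap["version"]:])
print(state["quantity"])  # 250
```

      Note the caveat from the reply below: the snapshot is itself a projection, so if its shape changes, existing snapshots must be discarded and rebuilt from the events.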

    • @PiesekLeszek90
      @PiesekLeszek90 2 years ago

      I do agree you have to look at each stream individually; there's obviously little to no benefit in creating a snapshot on a stream that will end after 10 events. It makes much more sense to look at the time it takes to rebuild state from the event stream and only implement snapshots if they'll have a measurable impact.
      I guess I just misunderstood the video; you made it sound like snapshots are almost a last resort that we should avoid, without really explaining what's so bad about them. Yes, they may not be needed a lot of the time, and we should definitely remember to keep the streams themselves optimized (rather than taking an "oh whatever, the snapshots will keep it fast" approach). I just don't see them having any downsides, so they should be a thought-out decision and not a backup plan.

    • @CodeOpinion
      @CodeOpinion  2 years ago +1

      The downsides are that it's an optimization with many implications. A snapshot is a point-in-time representation (projection), generally used to enforce invariants when trying to perform an action/command. As an example, if your internal projection (which is your snapshot) needs to change, the existing snapshots are now useless. My advice on snapshots is the same as for caching: don't, unless you have to. It's not a backup plan; it's an optimization that has complexity.
      Side note, here's a video I did about the complexity of caching: th-cam.com/video/Cx8ONrUq4a4/w-d-xo.html

  • @MiningForPies
    @MiningForPies 2 years ago

    The issue we have at work is that we have events stored, and these are then used as part of large aggregation queries.
    So we end up reading millions of rows to return hundreds 😢

    • @CodeOpinion
      @CodeOpinion  2 years ago

      Why not create projections?

    • @MiningForPies
      @MiningForPies 2 years ago

      @@CodeOpinion spaghetti code and hundreds of transaction scripts.
      The new system will have projections

  • @TheCypriot09
    @TheCypriot09 2 years ago

    Can you make a video about event-driven architecture with event sourcing!? Thank you for the great video! :)

  • @gdargdar91
    @gdargdar91 1 year ago

    Fully business-based events are only possible when your software creates facts rather than the other way around (for example MS Azure, AWS, etc.). This is actually a fallacy that many experienced developers fall for. For example, the PriceIncreased event you mentioned in your video would still not be a valid event, because the software isn't the authority that decides facts; it only records them. Events should be modeled from the view of software-as-a-ledger, not from anything else that has the authority to declare facts as real.

    • @CodeOpinion
      @CodeOpinion  1 year ago

      Interesting point of view. However, if we're talking about event sourcing, your events are the facts for *state*. While they generally are business-based, they might not always be, meaning there's a difference between events for state persistence and business events used for workflow or communication. Oftentimes, events for event sourcing can be too fine-grained. I did a video on this: th-cam.com/video/dJBTNksQzys/w-d-xo.html

  • @mateusztocha9260
    @mateusztocha9260 2 years ago

    What do you think about the Dapr project?

    • @CodeOpinion
      @CodeOpinion  2 years ago +1

      Unrelated to event sourcing, but the sidecar is interesting. There are a lot of building blocks (config/secrets, pub/sub, state management) when developing a system that's physically distributed. Having a sidecar that each individual service can use gives you consistent building blocks across all services. This is a good topic for a video.

  • @pierre-antoineduchateau33
    @pierre-antoineduchateau33 2 years ago

    What are the risks of conflating Domain Events and Integration Events ?

    • @CodeOpinion
      @CodeOpinion  2 years ago

      This actually has a couple of different aspects. One is that they have different purposes. Check out this video where I talk about the differences: th-cam.com/video/53GsiAcKm9k/w-d-xo.html Also check out this video where I talk about sharing your event stream between event sourcing and communication/integration between boundaries: th-cam.com/video/Y7ca1--EKsg/w-d-xo.html

  • @MerrionGT6
    @MerrionGT6 2 years ago

    Inter-domain messages tend to be "fatter" (including some state info and context, for example) than events on the inside, because you really don't want logic in Domain B deriving state information from messages from Domain A; that couples their projections' code.

    • @CodeOpinion
      @CodeOpinion  2 years ago

      Ya, good point. I didn't really mention that, as I have a separate video that talks more about this.