Event Sourcing You are doing it wrong by David Schmitz

  • Published on 14 Nov 2018
    "Every microservice gets its own database and then use Kafka" is typical, naive advice you encounter when reading about event sourcing. If you approach this architectural style this way, you will probably have a really awful time ahead.
    Event sourcing and CQRS are two very useful and popular patterns when dealing with data and microservices. We often find in our customers' projects that both have a severe impact on your future options and the maintainability of your architecture. Presentations and articles on both topics are often superficial and do not tackle real-world problems like security and compliance requirements.
    This combination of half-knowledge and technical confusion leads to many projects that either refactor back to a 'non-event-sourced' architecture or reduce event sourcing to a message queue.
    In this talk, I will summarize our experience applying event sourcing and CQRS across multiple large financial and insurance companies over the last 5 years. We will cover the Good, the Not so Good, and the 'oh my god...all abandon ships!' of doing event sourcing in the real world...and see how we solved these issues.
    David Schmitz
    From Senacor Technologies
    Principal architect at Senacor Technologies with 16+ years of experience working on various projects using a bunch of different stacks and environments.
    Current focus is on migrating architectures and organizations to cloud and serverless platforms.
  • Science & Technology

Comments • 85

  • @cadessi2263 5 years ago +4

    Amazing talk!

  • @DodaGarcia 3 years ago +4

    9:25 "So, end of talk. Maybe not" 😂😂😂 I love how deadpan he is

    • @DodaGarcia 3 years ago

      Also the footnote on the banking slide at 14:00 is appreciated, comrade David

  • @mohammadajafri6307 5 years ago +1

    Really excellent talk

  • @tiagodagostini 4 years ago +21

    Sorry, but processing a 100-event stream in 66 ms is NOT fast. It is ABSURDLY slow when you consider the total expected data throughput of a modern computer. It would need to be 66 microseconds to be reasonably fast.

    • @illyam689 4 years ago

      agreed

    • @derNoaa270 3 years ago +1

      Also the reason why I instantly cringed when the presenter mentioned it being _fast_. But I guess one could argue that this kind of "speed" might still be enough for your use case. But if you only have such small streams of data, maybe you don't need to go for distributed systems after all? Feels like everything is a distributed system nowadays just for the sake of being a distributed system.

    • @amando96 3 years ago +3

      @derNoaa270 Yes! Half the people doing this would be fine with a monolith, but it's like engineering departments have to justify their cost by making things more complicated than they have to be.

    • @AlgoristHQ 3 months ago

      @amando96 Oftentimes, the distributed system is created for the developer. If you can break software down into its domains and subdomains, it makes it easier to design for resiliency and maintainability.

  • @b7otato 1 year ago

    Great talk

  • @LonliLokli 4 years ago +5

    I really like your talk! The only thing I miss is a discussion on after-party.

  • @charli3br0wn 5 years ago +10

    Love this guy's sarcasm :)

  • @tyrotoxin 5 years ago +6

    The first part is about partial ordering of events, which guarantees the causal consistency model in distributed systems. Don't think that it comes for free, though.
    The ability to specify an offset when emitting events to Kafka (aka "Lamport clocks") creates a "lock" per partition, which impacts performance when you have many concurrent producers. Besides, you need to emit events for the same aggregate into the same partition, which reduces availability.
    Those are the right things to do though if you want to trade the Performance and Availability in favor of Consistency. And most likely you want to do so instead of putting that complexity into the application layer. Just be aware that the approach has its price.
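
As a rough illustration of the expected-offset check described above, here is a minimal in-memory sketch in Python. The `Stream` class, `ConcurrencyError`, and the `expected_last_event_number` parameter are assumptions made for this example; this is not the Kafka or Event Store API. A real store would have to make the compare-and-append step atomic, which is where the per-partition "lock" cost comes from.

```python
from dataclasses import dataclass, field


class ConcurrencyError(Exception):
    """Raised when the stream moved on since the writer last read it."""


@dataclass
class Stream:
    """In-memory stand-in for one aggregate's event stream (hypothetical)."""
    events: list = field(default_factory=list)

    @property
    def last_event_number(self) -> int:
        # -1 means the stream is still empty.
        return len(self.events) - 1

    def append(self, new_events: list, expected_last_event_number: int) -> int:
        # The writer states which event it saw last; a mismatch means a
        # concurrent writer got there first, so the append is rejected.
        if expected_last_event_number != self.last_event_number:
            raise ConcurrencyError(
                f"expected {expected_last_event_number}, "
                f"stream is at {self.last_event_number}"
            )
        self.events.extend(new_events)
        return self.last_event_number


stream = Stream()
seen = stream.last_event_number  # both writers read the stream while empty
stream.append([{"type": "Deposited", "amount": 100}], seen)  # first writer wins
try:
    stream.append([{"type": "Withdrawn", "amount": 80}], seen)  # stale writer
    conflict = False
except ConcurrencyError:
    conflict = True
```

In this sketch the rejected writer would reload the stream, re-run its business logic against the new state, and retry.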

  • @MykolaKlymyuk 3 years ago +1

    How do you solve the 'set validation' problem?

  • @MerrionGT6 5 years ago +9

    One stream per aggregate is, of course, the way to go! Also - both logically and physically distinct.

  • @michaelb.2924 4 years ago +1

    What are the trade-offs?

  • @nimrodsadeh9616 2 years ago

    Also, in the part about query aggregates you said to just put all the streams that correspond to your query in a bigger stream (e.g., users > 18 years old). This is a good example because it changes all the time; your users grow up and become 18. Wouldn't the logic to handle moving streams around like this become a bit cumbersome?

  •  3 years ago +2

    Can you build an "event sourced" system iteratively?
    It's very seldom you actually know your domain well enough to specify it in advance, and even if you do, it will most likely change.
    That would be an interesting topic to discuss and compare between different architectures.

    • @KoflerDavid 2 years ago +2

      Personally, I think that's even the correct approach to do it. There is no chance that you will get the system exactly right if you build it in one go. Therefore, you will have to deal with change anyways. The biggest problem is that you will have to live with old versions of events for essentially forever. You really have to think it through and practice a few times how to handle migrations in event-driven systems. This talk shows a few techniques to do so.

    • @VictorMartinez-zf6dt 1 year ago

      Yes, you can add events and commands to aggregates as you go. But I'd argue that you don't build event-sourced systems, but event-driven systems. Event sourcing is a pattern for storing the transactional data of single aggregates.
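
One widely used way to live with old event versions is upcasting: stored events stay untouched and are converted to the current shape when they are read. The sketch below assumes a hypothetical `CustomerRegistered` event whose version 2 splits `name` into two fields. The event shape and the `UPCASTERS` registry are invented for illustration and are not taken from the talk.

```python
def upcast_v1_to_v2(event: dict) -> dict:
    # Hypothetical schema change: v2 splits "name" into first and last name.
    first, _, last = event["data"]["name"].partition(" ")
    return {
        "type": event["type"],
        "version": 2,
        "data": {"first_name": first, "last_name": last},
    }


# Maps (event type, stored version) to the function that lifts it one version.
UPCASTERS = {("CustomerRegistered", 1): upcast_v1_to_v2}


def upcast(event: dict) -> dict:
    # Apply upcasters until no newer version is registered for this event.
    while (event["type"], event["version"]) in UPCASTERS:
        event = UPCASTERS[(event["type"], event["version"])](event)
    return event


stored = {
    "type": "CustomerRegistered",
    "version": 1,
    "data": {"name": "Ada Lovelace"},
}
current = upcast(stored)  # the stored event itself is never rewritten
```

The trade-off in this approach is that every old version needs an upcaster for as long as events of that version exist in the store.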

  • @igrai 5 years ago +1

    What can I say - excellent talk!

  • @codematrix 5 years ago +8

    Here's my take. The current industry state of event sourcing is still not well defined, and there seem to be too many flavors of what is right and what is wrong. As with all greenfield approaches to solving a problem, I tend not to go into it too quickly until a consistent, well-defined approach becomes the de facto standard. My suggestion: don't build your own homegrown ES, because it will be a hodgepodge, a colossal mess, and a maintenance nightmare. I plan to wait until AWS or Azure comes up with a viable solution and then let them worry about the implementation and maintenance while I worry about the meat of my microservices.

    • @illyam689 4 years ago

      I think we have to wait for a long time..

    • @codematrix 3 years ago +2

      @Andreas Berger - I agree with the lack of progress as a possibility. If a team were to tackle this endeavor, make sure it's a new, small to medium size project. I would never convert an existing project from non-ES to ES unless the architecture lends itself already to a clean architecture, hopefully with CQRS on the read side of the equation.

    • @javier.alvarez764 2 years ago

      True. For now, CQRS and event sourcing seem to be just a bunch of theories. No clear definition and no actual implementation.

  • @nimrodsadeh9616 2 years ago

    Aren't event handlers that read from persistent storage CPU intensive since they deserialize every event?

  • @rum81 4 years ago +8

    Whatever he calls "super easy", I find difficult to follow.

  • @dmsanz_youtube 4 years ago +7

    I liked it, but saying that read models are overrated could be very confusing, especially considering that 80%-90% of the interactions with our web applications are "read" operations, and that one of the most powerful advantages of event sourcing with CQRS is the ability to replay the events in a stream in order to rebuild a read model (e.g., a new geo-distributed infinite cache) or to create a completely new one (e.g., reports).
    In my opinion read models are actually underrated and having a good event versioning and replaying strategy is crucial.

    • @lukasz_sarnacki 3 years ago +3

      I think you missed the point. He just said that you may not need to persist and instead dynamically fold event streams each time.
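
Folding a stream on demand, instead of persisting a read model, can be sketched in a few lines. The event shapes (`Deposited`, `Withdrawn`) and the function names below are assumptions for illustration, not code from the talk.

```python
from functools import reduce


def apply(balance: int, event: dict) -> int:
    # One step of the fold: previous state + event -> next state.
    if event["type"] == "Deposited":
        return balance + event["amount"]
    if event["type"] == "Withdrawn":
        return balance - event["amount"]
    return balance  # unknown event types leave the state unchanged


def current_balance(events: list) -> int:
    # Left fold over the whole stream, starting from an empty account.
    return reduce(apply, events, 0)


events = [
    {"type": "Deposited", "amount": 100},
    {"type": "Withdrawn", "amount": 30},
    {"type": "Deposited", "amount": 5},
]
balance = current_balance(events)
```

Whether this is fast enough depends on stream length and read frequency; for short per-aggregate streams, nothing needs to be stored besides the events themselves.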

  • @RikiPoon 1 year ago

    6:26 Take away the eventstore and the whole thing still works. To me, it seems to depict event-driven architecture rather than event sourcing, as we make no use of the eventstore in this picture.

  • @DavidCumps 5 years ago +9

    How does GDPR deal with the requirement of storing data for fraud detection as a bank? I cannot imagine that a customer can have their bank history forgotten by having it hard-deleted. Doesn't the government require an audit trail in case you are charged in court?

    • @dmsanz_youtube 4 years ago +3

      Encrypt the personally identifiable information in the events and store the specific key for that subject in separate storage. If the user requests it, just soft-delete the key so that the info cannot be decrypted. If a government or court requires the info, simply undo the soft delete, provided there is a good reason to do so. That's at least how I would do it. Whether or not you hard-delete the info is, I think, irrelevant for the GDPR as long as the personal info cannot be recovered. So the security around undoing the soft delete must be strong.

    • @bluejanis5317 4 years ago +2

      @dmsanz_youtube If you can undo a soft-delete, then you can recover the personal info. That is not allowed under the GDPR & DSGVO.

    • @RonArts 4 years ago +1

      @bluejanis5317 There are often multiple laws at work. Financial data may need to be kept for 7 years, for example, but other data for only 2. So you create multiple keys for each user, and encrypt financial data with the financial key, etc. Then you can just delete the appropriate key.

    • @guibirow 3 years ago +1

      If the data is kept for regulatory purposes, the company cannot delete it; otherwise it would be infringing other laws.
      But it must keep only the data needed for regulatory requirements; any other data must be deleted.
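
The per-subject, per-category key idea from this thread can be sketched as follows. Everything here is an assumption for illustration: the `KeyStore` class, the category names, and above all the cipher, which is a toy keystream built from the standard library so that the example runs without dependencies. It is not production cryptography; a real system would use a vetted authenticated-encryption library. The sketch hard-deletes keys and takes no position on what any regulation requires.

```python
import hashlib
import hmac
import os


def _keystream_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    # Toy stream cipher (HMAC-SHA256 in counter mode) so the sketch runs on
    # the standard library alone. NOT production crypto: use a vetted AEAD.
    out = bytearray()
    for offset in range(0, len(data), 32):
        counter = (offset // 32).to_bytes(8, "big")
        pad = hmac.new(key, nonce + counter, hashlib.sha256).digest()
        out.extend(b ^ p for b, p in zip(data[offset:offset + 32], pad))
    return bytes(out)


class KeyStore:
    """Hypothetical store holding one key per (subject, data category)."""

    def __init__(self) -> None:
        self._keys: dict = {}

    def key_for(self, subject: str, category: str) -> bytes:
        # Create the key on first use, then keep returning the same one.
        return self._keys.setdefault((subject, category), os.urandom(32))

    def get(self, subject: str, category: str):
        return self._keys.get((subject, category))

    def shred(self, subject: str, category: str) -> None:
        # Delete the key; ciphertext in the event log becomes unreadable.
        self._keys.pop((subject, category), None)


def encrypt(keys: KeyStore, subject: str, category: str, plaintext: bytes) -> dict:
    nonce = os.urandom(16)  # fresh nonce per payload, stored next to it
    key = keys.key_for(subject, category)
    return {"nonce": nonce, "ciphertext": _keystream_xor(key, nonce, plaintext)}


def decrypt(keys: KeyStore, subject: str, category: str, blob: dict):
    key = keys.get(subject, category)
    if key is None:
        return None  # key was shredded, so the payload cannot be recovered
    return _keystream_xor(key, blob["nonce"], blob["ciphertext"])


keys = KeyStore()
profile = encrypt(keys, "user-1", "profile", b"jane@example.com")
payment = encrypt(keys, "user-1", "financial", b"transfer 42 EUR")
keys.shred("user-1", "profile")  # erase profile data, keep financial data
```

After the `shred` call, the profile payload in the event log stays in place but can no longer be read, while the financial payload remains decryptable until its own key is deleted.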

  • @placidchat7532 5 years ago

    If eventstore documentation is awful but the community is great, how would you choose this library vs every other one out there?

    • @jeffcrow1481 4 years ago

      They have support and offer for business

  • @krellin 2 years ago +1

    The guy hates frameworks but uses eventstore, which practically decides everything for you, from what language you code in to how you do high availability...
    There are good points about versioning (or not doing versioning), though...

  • @jinhanchoi1 4 years ago +3

    Does one stream mean one topic per user?

    • @aneshas 3 years ago

      Yes, one topic / stream per aggregate "instance" not aggregate "type"

  • @bluejanis5317 4 years ago +3

    What about commands? Your talk never mentioned them. CQRS = Command-Query-Responsibility-Segregation

    • @aneshas 3 years ago

      Hi, what do you want to know about commands in this context?

  • @devproco8063 3 years ago +2

    Can you explain what `emitEvents` does?
    - does it _apply_ events to an aggregate?
    - does it save events generated by an aggregate (eg `executeBusinessLogic()`)?
    Is the `Account.lastEventNumber !== Stream.lastEventNumber` supposed to occur in the storage implementation? For example, inside of `Stream.append()`?

  • @Spiritusp 5 years ago +2

    37:30 Love the Monty Python reference!

    • @SashaArsic 3 years ago

      My favorite colour :p

  • @boot-strapper 3 years ago +1

    Luckily we don't have to go to ops to ask for anything now because AWS offers managed Kafka.

    • @TheApeMachine 3 years ago

      Yeah, until the bill scales exponentially :p Or someone requires you to move to another cloud or on-prem... But you may not run into that in your use-case of course...

    • @boot-strapper 3 years ago

      @TheApeMachine Mature companies aren't switching cloud providers. Also, the billing is linear...

    • @TheApeMachine 3 years ago

      @boot-strapper I recently spoke to a mature company whose clients at times demand certain providers or on-premise systems. And linear pricing doesn't make something cheap. Like I said, it depends on the use case.

  • @JeremyAndersonBoise 5 years ago +3

    One of the most hilarious presenters in tech. Love him.

  • @jmilkiewicz 5 years ago +7

    a messy mix of event sourcing and event driven architecture.

  • @kphamcao 1 year ago

    Apache Flink renders this talk obsolete.

  • @user-xb5lq6ve3n 2 years ago

    "Your video is very good, and the message…

  • @richardwang3438 3 years ago +2

    works like a charm, super easy, oh let me take a drink

  • @humanyoda 4 years ago +2

    So, in order for my bank to tell me my current account balance, it would have to process thousands of my transactions all the way to my initial deposit?! Wouldn't that slow down the system to a crawl?

    • @dmsanz_youtube 4 years ago +3

      Yes, but with CQRS your query would not be made against the "write" side but against the "read" side. The read side replays all the events in order to infer the balance and creates a read model. You can think of it as an infinite cache. So whenever you want to query info such as your balance, the operation will be lightning fast, because the read model is just a view ready to serve the info in the shape in which it is required, without transforming data during the query and, of course, without a join query on a database.
      In other words, CQRS will take care of this as long as you're willing to tolerate a bit of eventual consistency (i.e., the gap between the end of the write transaction and the moment the read model that projects the resulting event is ready).

    • @a0flj0 3 years ago

      @dmsanz_youtube I think a better answer would be that CQRS/event sourcing isn't necessarily the best solution for any problem. IME, whenever you need read optimization on real-time data, the best solution is a mix - you pre-aggregate old data.
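
Pre-aggregating old data is commonly done with snapshots: the state up to some event number is stored once, and only newer events are replayed. Below is a minimal sketch under that assumption; the `Snapshot` class and the signed-amount event shape are hypothetical and chosen only to keep the example short.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    """Pre-aggregated state covering events 0..last_event_number (hypothetical)."""
    balance: int
    last_event_number: int


def apply(balance: int, event: dict) -> int:
    # Signed amounts: deposits are positive, withdrawals negative.
    return balance + event["amount"]


def take_snapshot(events: list) -> Snapshot:
    balance = 0
    for event in events:
        balance = apply(balance, event)
    return Snapshot(balance=balance, last_event_number=len(events) - 1)


def balance_from(snapshot: Snapshot, events: list) -> int:
    # Replay only the events recorded after the snapshot was taken.
    balance = snapshot.balance
    for event in events[snapshot.last_event_number + 1:]:
        balance = apply(balance, event)
    return balance


events = [{"amount": 100}, {"amount": -30}, {"amount": 5}]
snapshot = take_snapshot(events[:2])  # covers the two oldest events
events.append({"amount": -20})        # new activity after the snapshot
balance = balance_from(snapshot, events)
```

The result matches a full replay, but the cost of a read now depends on the number of events since the last snapshot rather than on the account's whole history.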

  • @salvatoreshiggerino6810 5 years ago +1

    Optimistic Concurrency Control is nice in a lot of places, but in this case, isn't it easier to keep the state of the whole system (or at least the subsystem that needs to be consistent) in a single thread that you submit commands to that get validated and turned into events in a fixed order that is guaranteed to be valid? As he said, it's usually neither I/O nor CPU intensive, so that single thread is going to be fast enough to deal with an awful lot of traffic.

    • @oreoluwapeace510 4 years ago

      That would fail in a load-balanced system. Multiple commands can still be processed on different instances of the service.

    • @a0flj0 3 years ago

      Low reliability (you run on a single machine), limited scalability.

    • @salvatoreshiggerino6810 3 years ago

      @a0flj0 Not unreliable at all. You start a new process with the same events and you end up with the same state. If recovery from failure needs to be instant, use a hot standby. In fact not having nearly as many moving parts makes it much more reliable.
      As for limited scalability, that's true. But anything up to a few hundred thousand transactions per second is perfectly feasible on one thread. And if it's a problem you can't scale with sharding, you're going to struggle with scaling no matter what architecture you use.

    • @a0flj0 3 years ago

      @salvatoreshiggerino6810 You need to put some transactional mechanism around event/message delivery, or else you risk losing data with a single thread. That's what I meant by not reliable. With such a mechanism in place, I doubt that a large number of events per second on a single thread is still feasible. The thread can easily handle the load, but most likely the network can't. Event sourcing without network involvement makes no sense to me. If you spread load among multiple instances, time and causal ordering are no longer simple issues, which makes the problem even bigger. But you are right, sharding can help a lot - if you can do reasonable sharding considering causal ordering.

    • @salvatoreshiggerino6810 3 years ago

      @a0flj0 Those are already solved problems. Look up the LMAX architecture.
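
The single-writer idea debated in this thread can be sketched with a queue and one worker thread. This is a deliberately simplified illustration, not the LMAX Disruptor: the command tuples, event names, and the `run_single_writer` function are assumptions, and durability, replication, and delivery guarantees (the concerns raised in the replies) are left out.

```python
import queue
import threading


def run_single_writer(commands: "queue.Queue", log: list) -> None:
    # The only thread that touches `balance` and `log`, so no locks and no
    # version checks are needed: commands are validated strictly in order.
    balance = 0
    while True:
        command = commands.get()
        if command is None:  # sentinel: shut down
            break
        kind, amount = command
        if kind == "deposit":
            balance += amount
            log.append(("Deposited", amount))
        elif kind == "withdraw" and amount <= balance:
            balance -= amount
            log.append(("Withdrawn", amount))
        else:
            log.append(("Rejected", kind, amount))


commands: "queue.Queue" = queue.Queue()
event_log: list = []
writer = threading.Thread(target=run_single_writer, args=(commands, event_log))
writer.start()

# Any number of producers may enqueue; the queue fixes one global order.
for command in [("deposit", 100), ("withdraw", 80), ("withdraw", 80), None]:
    commands.put(command)
writer.join()
```

Because validation and event creation happen in one place and in one order, the second withdrawal is rejected without any optimistic concurrency check.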

  • @alexd7466 4 years ago +2

    he is making it more complicated than needed

    • @flolu 4 years ago +2

      What does he over-complicate?
      I would be very interested to know which points can be simplified :)

  • @vorandrew 2 years ago

    Must prohibit drinking during presentation

  • @nexus888 5 years ago +24

    mate, you are really thirsty! :)

    • @lucioluciolucio 4 years ago +3

      Kinda gross him gulping the water with the mic on

    • @zahirjacobs716 3 years ago

      probably a soothing behaviour.

  • @heiko3169 4 years ago +1

    3:51 is basically a misconception about microservices altogether

  • @socialcatalyst2608 2 years ago +1

    English?)) No)) Engldeutsch )))) 👍

  • @CosasCotidianas 3 years ago +1

    The explanation is GOLD, but please let that bottle take a rest man.

    • @No1Melman 3 years ago

      Not just me then 😅 did he go in super dehydrated or something 😅 Quality talk though!

  • @mati1979b 5 years ago +1

    This is so super easy that I am not sure why there is even a presentation? :)

  • @alejandroagua5813 4 years ago

    His voice made me sleepy. Otherwise it's a nice talk

  • @stefian77 5 years ago

    Super easy, presentation smells bad. :)

  • @matrixlnmi169 5 years ago +6

    Suggestion: next time, please avoid drinking water so audibly. I was using headphones, and the sound of the water going down your throat was annoying. BTW, very good presentation.

    • @aligorkemozturk 5 years ago

      Water sound is definitely disturbing when you are listening.

    • @zielu0987 5 years ago +3

      Amazing suggestion - he is human, and after a long talk he needs to drink.

  • @oneagain6106 4 years ago +3

    Stopped listening after the second disgusting drinking sound. Too much bragging for too little knowledge as well.