Building Event Driven Services with Apache Kafka and Kafka Streams by Ben Stopford

  • Published on 10 Nov 2017
  • Event Driven Services come in many shapes and sizes, from tiny functions that dip into an event stream right through to heavy, stateful services. This talk makes the case for building such services, be they large or small, using a streaming platform. We will walk through a number of patterns for putting these together using a distributed log, the Kafka Streams API and its transactional guarantees.
    Ben Stopford
    Ben is an engineer and architect working on the Apache Kafka Core Team at Confluent Inc (the commercial company behind Apache Kafka). He's worked with distributed data infrastructure for over a decade, switching between engineering products and helping companies use them. Before Confluent he designed and built a central streaming database for a large investment bank. His earlier career spanned a variety of projects at Thoughtworks and UK-based enterprise companies. Find out more at benstopford.com.
  • Science & Technology

Comments • 21

  • @yadigarcaliskan6453 1 year ago

    Great Ben, thanks

  • @sudonick2161 2 years ago

    This talk cleared a lot of my doubts related to EDA. Thanks Ben.

  • @someguy1714 6 years ago +3

    Impressively well done presentation. Cleared up questions

  • @xprt642 6 years ago +5

    There is a pattern that I keep seeing in Kafka presentations: everything is an Event, which is not quite right. Everything should be a Message, with two very different types of messages: Commands and Events (much like in CQRS). Here, the "event" that is raised by the Webserver, OrderReceived, should in fact be a command, CreateCommand, that is sent to the Orders service. The good thing is that a command can be stored inside Kafka (like anything else); the bad thing is that it should be removed once it has been processed so that it is not processed again on replay, though that depends on the actual implementation.

    • @stopfob 6 years ago +2

      Hi Containtin - I've gone back and forth on this. I actually think the C in CQRS is a bit unfortunate. What differentiates a command is really that it has a response (or at least an indication that the command completed): make payment, returning whether the payment was made. CQRS actually takes you away from that as the response is a separate call to the query side. So there is a good argument to say it's really more like an event. I generally choose to represent it like this as, for a lot of people, the distinction is quite academic. Learning the terminology doesn't buy you much in this case as you can understand and implement the pattern without it. Hope that makes sense. Let me know if you disagree.

    • @xprt642 6 years ago +4

      Hello Ben. First of all, thanks for replying; it means a lot to me. What differentiates a Command from an Event is that a Command must reach a single endpoint, and after that it should be discarded as its purpose has been fulfilled (unless some logging is needed); it must not be reprocessed. It also expresses an intent, something that must be done; its effects are one or more Events, expressing facts about what changed. So, when we use the term "Event" we expect to see something that has already happened, immutable and unforgettable, and not an intent to do something that might fail. I agree that an architecture where everything is an Event could function, but you would lose the expressiveness of the code. Kafka, being so awesome, could be used to carry both Commands and Events, and we should refer to them as Messages, abstracting away what they mean and what their purpose is, and emphasizing only the fact that they carry Information. What that Information means (an intent xor a fact) is irrelevant to Kafka. All that Kafka promises us is that it will deliver the message reliably, fast and in a scalable way.

    • @stopfob 6 years ago

      Yeah - that makes sense. I've typically built systems with event collaboration (martinfowler.com/eaaDev/EventCollaboration.html). The funny thing about that pattern is the event is essentially the command. At least EC systems don't differentiate between the two, but it's just a different style. Thanks for the thoughts.
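To make the distinction in this thread concrete, here is a minimal in-memory sketch (plain Python, no Kafka client involved). The names `CreateOrder`, `OrderCreated`, `OrderRejected` and `OrdersService` are hypothetical and chosen only for illustration; the thread itself only names OrderReceived and CreateCommand. The command expresses an intent that may fail and is handled once by a single service, while the events it produces are immutable facts.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass(frozen=True)
class CreateOrder:
    """Command (hypothetical): an intent addressed to one service; it may be rejected."""
    order_id: str
    amount: float


@dataclass(frozen=True)
class OrderCreated:
    """Event (hypothetical): a fact that has already happened."""
    order_id: str
    amount: float


@dataclass(frozen=True)
class OrderRejected:
    """Event (hypothetical): the fact that the command was refused."""
    order_id: str
    reason: str


Event = Union[OrderCreated, OrderRejected]


class OrdersService:
    """Single endpoint for CreateOrder commands; remembers handled ids so a
    replayed command is not processed a second time."""

    def __init__(self) -> None:
        self._handled = set()

    def handle(self, cmd: CreateOrder) -> List[Event]:
        if cmd.order_id in self._handled:
            return []  # already processed: a replay produces no new facts
        self._handled.add(cmd.order_id)
        if cmd.amount <= 0:
            return [OrderRejected(cmd.order_id, "amount must be positive")]
        return [OrderCreated(cmd.order_id, cmd.amount)]
```

Tracking handled ids is only one way to address the replay concern raised above; deleting processed commands, as suggested in the first comment, is another.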

  • @johnboy14 3 years ago

    It's interesting how he recommends using Kafka for transactional purposes. Most technology stacks don't do this with Kafka; it's an entirely different way of thinking about distributing state. Every state change on an entity becomes an event, and those events can be used to commit or roll back state.
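As a rough illustration of "every state change becomes an event", the sketch below rebuilds an entity's state by folding over its events. It is plain Python with no Kafka involved, and the event names and the `apply`/`replay` helpers are assumptions made for this example, not something taken from the talk. Replaying a shorter prefix of the log is one simple way to picture going back to an earlier state.

```python
from functools import reduce


def apply(state, event):
    """Apply one (kind, payload) event to the current state and return the new state."""
    kind, payload = event
    if kind == "OrderCreated":
        return {**state, "status": "CREATED", **payload}
    if kind == "OrderValidated":
        return {**state, "status": "VALIDATED"}
    if kind == "OrderCancelled":
        return {**state, "status": "CANCELLED"}
    return state  # unknown event kinds are ignored


def replay(events):
    """Derive the current state purely from the ordered event history."""
    return reduce(apply, events, {})


events = [
    ("OrderCreated", {"order_id": "o1", "amount": 10}),
    ("OrderValidated", {}),
    ("OrderCancelled", {}),
]

current = replay(events)        # state after all three events
earlier = replay(events[:2])    # state as it was before the cancellation
```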

  • @bicatu 5 years ago +11

    how can I stop him from moving around?

  • @StephanOudmaijer 6 years ago +1

    Really interesting talk, liked it very much. The speaker did not go into details about the schema registry (~44 min). I am also curious about recommendations/guidelines for versioning of events (in topics). Where can I find an example of the lookup tables (~31 min)?

    • @stopfob 6 years ago +3

      Hi Stephan
      You can find out about the Confluent Schema Registry here: docs.confluent.io/current/schema-registry/docs/index.html
      There are examples of creating views to do lookups etc. here: github.com/confluentinc/kafka-streams-examples/tree/3.3.0-post/src/main/java/io/confluent/examples/streams/microservices
      There is also a blog post that goes with those code samples (and this talk) here: www.confluent.io/blog/building-a-microservices-ecosystem-with-kafka-streams-and-ksql/

    • @maksimmuruev423 4 years ago

      I'd say that after 50 min of watching this you can't reproduce even a 'hello world' message with Kafka (not a single example, no package mentions, just a view from space). This is probably for people who know what Kafka is and how to use it, but why would they need this talk then?

    • @engineerhiteshahuja 3 years ago

      @maksimmuruev423 that's not true. There are design patterns he has spoken about that aren't being used by many people who work with Kafka. There is a stigma in the dev world that Kafka is just another messaging queue. The speaker tried his best to explain all the good things that Kafka ships with and that can be used. For example, state stores are a very powerful concept. If implemented properly, one might not even need a full-fledged database.
      Bottom line: he tried to explain all the concepts and tools available around Kafka in just 50 min.
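For readers wondering what a state store buys you, here is a deliberately simplified in-memory model of the idea: a lookup table materialized from a keyed changelog, where the latest value per key wins and a `None` value acts as a delete marker (as a null value does in a compacted Kafka topic). This is not the Kafka Streams API; the `materialize` function and the sample data are assumptions for illustration only.

```python
def materialize(changelog):
    """Build a key -> latest value table from an ordered list of (key, value) records.

    A value of None is treated as a tombstone and removes the key.
    """
    table = {}
    for key, value in changelog:
        if value is None:
            table.pop(key, None)   # tombstone: forget the key if present
        else:
            table[key] = value     # a later record overwrites an earlier one
    return table


# Hypothetical customer-address changelog, oldest record first.
changelog = [
    ("alice", "1 Old Road"),
    ("bob", "2 High Street"),
    ("alice", "9 New Avenue"),   # alice moved
    ("bob", None),               # bob's record was deleted
]

addresses = materialize(changelog)
```

Because the table can always be rebuilt from the log, a service can keep such a view locally and query it without calling another service or database.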

  • @tekal85 5 years ago

    REST and event-driven are not mutually exclusive. You can have a Spring REST API that processes user requests using CQRS-style processing.

  • @scarface548 6 years ago

    How does "order validated" get to the webserver (and the user's browser)? What is happening in the user's browser while all the order validation is going on? What happens if the user refreshes her browser?

    • @stopfob 6 years ago +2

      The webserver would poll the orders service with a long poll, or be pushed a notification over a websocket. There is an example of the former here: github.com/confluentinc/kafka-streams-examples/tree/4.0.0-post/src/main/java/io/confluent/examples/streams/microservices
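The long-poll option can be sketched with nothing but the standard library. In the toy model below, the hypothetical `OrderStatusView` stands in for the orders service's view of order state: `update` would be called when an "order validated" event arrives, and `long_poll` blocks the webserver's request until the wanted status shows up or the timeout expires. The linked repository is the real reference; this is only an assumption-laden illustration of the waiting behaviour.

```python
import threading


class OrderStatusView:
    """In-memory stand-in (hypothetical) for the orders service's status view."""

    def __init__(self):
        self._cond = threading.Condition()
        self._status = {}

    def update(self, order_id, status):
        """Record a new status and wake up any waiting long polls."""
        with self._cond:
            self._status[order_id] = status
            self._cond.notify_all()

    def long_poll(self, order_id, wanted, timeout):
        """Block until the order reaches `wanted` or `timeout` seconds pass.

        Returns the last known status, or None if nothing is known yet.
        """
        with self._cond:
            self._cond.wait_for(
                lambda: self._status.get(order_id) == wanted, timeout=timeout
            )
            return self._status.get(order_id)
```

In this model a browser refresh simply issues a new poll: if the order has already been validated the call returns immediately, otherwise it waits again.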

  • @TJ-hs1qm 6 years ago

    Compared to REST, this sounds horribly difficult to develop, test, debug and operate. With REST all I'd need is curl and swagger. I can fire up netcat with a mocked http endpoint in seconds. Now the protocol, whatever it is, needs to be deciphered, you can't do ad hoc queries in production anymore, it seems, and who knows which node is actually responding to your events? Teams need to be trained and experienced enough to cope with all the fluffy stuff that might (and will) happen in production. And then you have Kafka sitting in the midst of everything and if that fails... wish me luck.

    • @stopfob 6 years ago +2

      I see it as a use case thing. If you're building a simple web app, all this infrastructure is unlikely to be worth the investment in the short term. As the ecosystem grows, particularly as more asynchronous use cases are added and more data needs to move from service to service and be joined together, this type of architecture becomes worthwhile -- think Lambda with the tools needed for dealing with asynchronous systems and data movement. So I think there is an element of truth in what you say, but it's more a question of context. There is a little more discussion on the benefits here: www.confluent.io/blog/data-dichotomy-rethinking-the-way-we-treat-data-and-services/

    • @emmanueloverrated 6 years ago +1

      It might sound weird, but it gets simpler to work on your system this way as it gets bigger and bigger. Big systems tend to develop all kinds of dependency pathologies, even if you put a lot of effort into keeping them clean, as strange business requirements break your design all the time.
      Say you have a system with hundreds of REST services talking to each other, chained in every possible way, and you are asked to figure out the call tree (service hops) for some complex request. Let's say a simple bug report tells you that a field in a page is blank and it shouldn't be. Where do you start? How do you locate the problem if you aren't that corporate veteran who has deep knowledge of the whole system?
      Now take an event-based system. Reproduce the bug in the dev environment, look at every relevant event topic, get the whole list of events emitted by the request and locate where the data is lost in the chain. You'll figure out the culprit within minutes instead of days.
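The debugging approach described here can be pictured as a scan over the topics in pipeline order. The sketch below is a toy version in plain Python: topics are modelled as in-memory lists, and the `request_id` field, the topic names and the `first_missing` helper are all hypothetical. A real setup would read the topics with a Kafka consumer and would need some correlation id on every event.

```python
def first_missing(topics, request_id, field):
    """Return the name of the first topic (in pipeline order) in which an event
    for `request_id` has an empty or absent `field`, or None if it never goes missing."""
    for name, events in topics:
        for event in events:
            if event.get("request_id") == request_id and not event.get(field):
                return name
    return None


# Hypothetical pipeline: each stage republishes the order to the next topic.
topics = [
    ("orders", [{"request_id": "r1", "address": "1 Main St"}]),
    ("orders-validated", [{"request_id": "r1", "address": "1 Main St"}]),
    ("orders-enriched", [{"request_id": "r1", "address": ""}]),
]

culprit = first_missing(topics, "r1", "address")
```

Here the address is intact in the first two topics and blank in the third, so the service writing to the third topic is where to start looking.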

  • @maninderbatth_family 6 years ago +1

    Boring, nothing thought-provoking. Sure, you can replicate an address change with events, but events have lag, which means you could ship a TV to the old address and someone gets it for free :)

    • @AbhishekGupta-wc2ws 5 years ago +1

      Well, objects that change very frequently are better not replicated into other services, since the overhead of synchronizing the data across two services would outweigh the benefit of autonomy and availability. For example, an object like customer credit should not be replicated from the customer service into the order service, even though the order service needs information on customer credit to perform business variants validation before moving an order into the approved state. Hope that addresses your concern about data replication.