Apache Kafka vs. Integration Middleware (MQ, ETL, ESB) - Friends, Enemies or Frenemies?

  • Published Mar 6, 2019
  • This session discusses how to build an event-driven streaming platform leveraging Apache Kafka’s open source messaging, integration and streaming capabilities.
    Learn the differences between an event-driven streaming platform and middleware like Message Queues (MQ), Extract-Transform-Load Tools (ETL) and Enterprise Service Bus (ESB) - including best practices and anti-patterns, but also how these concepts and tools complement each other in an enterprise architecture.
    Key takeaways for the audience:
    • See typical middleware requirements and challenges in an enterprise architecture regarding messaging, integration and stream processing
    • Learn about the components of the open source project Apache Kafka as an event streaming platform, including messaging, integration, stream processing, and community add-ons like pre-built connectors and the Schema Registry
    • Learn the differences between a native event-driven streaming platform leveraging Apache Kafka and middleware like MQ, ETL and ESB - including best practices and anti-patterns, but also how these concepts and tools complement each other in an enterprise architecture.
    More information:
    www.confluent.io/blog/apache-...
    www.slideshare.net/KaiWaehner...
  • Science & Technology

Comments • 79

  • @abobakrnasr9814
    @abobakrnasr9814 3 years ago +5

    Excellent video Kai, very informative, and you explained it in a very nice way. The presentation put just the right context in place for me to understand the new-era technologies. Thank you so much, and waiting for more videos.

  • @sitaluk21
    @sitaluk21 2 years ago +1

    This is an excellent presentation, presenting the whole thing like a story. Amazing articulation 👏 thank you very much 😊

  • @philosopher46
    @philosopher46 4 years ago +2

    So long, but worth watching. Thank you!

  • @welbsantos
    @welbsantos 2 years ago

    Excellent video!!! It was very helpful!

  • @santiagopouget3473
    @santiagopouget3473 3 years ago +2

    Dear Kai, my congratulations on this material; it was a really good base for non-technical people like me. Great job (accurate, realistic, and technically grounded). Thank you.

  • @srinub523
    @srinub523 4 years ago

    Very clear explanation. Thank you.

  • @gregkall7374
    @gregkall7374 4 years ago +3

    Terrific overview, Kai!

  • @ralfik14
    @ralfik14 4 years ago

    Very good video. Condensed knowledge delivered in an easy way. I recommend it to everybody interested in modern ways of data processing.

  • @augustohdzalbin
    @augustohdzalbin 4 years ago +1

    Good explanation. Thanks, Augusto

  • @shravanparepally3551
    @shravanparepally3551 3 years ago +5

    If I clear my interview at a Big 4 firm, consider my big thank you already :-) This presentation is an excellent summary.

  • @sukumard
    @sukumard 1 year ago

    Thanks for this, Kai. Very, very useful for explaining to a 175-year-old company why they need to change their ways of working to enter the new digital way of doing things.

  • @elodiechaumet-doucet1045
    @elodiechaumet-doucet1045 2 years ago

    Super! I have a middleware past and need to understand the concept of event integration and how it could be a frenemy of existing middleware.

  • @jksharma7
    @jksharma7 3 years ago

    Thank you for the wonderful knowledge, Sir.

  • @doctari1061
    @doctari1061 3 years ago

    Very nice. Thanks

  • @JohnTube2K
    @JohnTube2K 2 years ago

    Nice video. This is what I come across as an EA at my job: there is already a sunk cost in legacy technology, and it's a struggle to get the business to update its technology unless there is a true business or technical need.

  • @ammarhassan4571
    @ammarhassan4571 3 years ago

    Indeed very informative, especially the conclusion about the right tool for the right job. This is where architects and companies often can't decide well: they try to do every job with the same tooling and end up with very high-maintenance code/projects, and sometimes the integration layer becomes a bottleneck and stops the business from achieving agility.

  • @visasimbu
    @visasimbu 4 years ago +1

    Good info.

  • @JohnTube2K
    @JohnTube2K 2 years ago

    Subscribed!!

  • @deonvanniekerk871
    @deonvanniekerk871 3 years ago

    Great video, really explained it well. Appreciate your contribution.

  • @maxmag76
    @maxmag76 4 years ago

    Really nice explanation! Thank you.

  • @1m1r0z
    @1m1r0z 3 years ago

    Thank you Kai

  • @JoaquinPonte
    @JoaquinPonte 5 years ago +5

    This video is amazing; you presented everything in a clear way. Congratulations!

  • @sujukrish
    @sujukrish 3 years ago

    Very nice.

  • @KarstenHeymann
    @KarstenHeymann 4 years ago +4

    Just a very small note from a fellow german: "Event" is pronounced "Iwänt" with an emphasis on the "ä".

    • @krystianfeigenbaum238
      @krystianfeigenbaum238 3 years ago +1

      Another fellow German here - the problem is where he puts the stress: [i'wänt], not ['i:wänt].

    • @raydickenson6511
      @raydickenson6511 3 years ago

      @@krystianfeigenbaum238 You guys are focusing on a slightly-off pronunciation of "event" but saying nothing about the very-wrong pronunciation of "paradigm" (the "g" is silent). I really appreciate the video, Kai!

  • @marsimark
    @marsimark 4 years ago +1

    A/B testing for middleware services is a new idea to me

  • @ComisarioLobo
    @ComisarioLobo 5 years ago +2

    Nice video, Kai. How does Kafka compare to Apache Pulsar and NATS? Also, when is Kafka going to be fully cloud-native?

    • @kaiwaehner5702
      @kaiwaehner5702 5 years ago +2

      Thanks for the feedback, Santiago.
      My thoughts about your question are opinionated, of course, as I work for Confluent.
      Pulsar has a pretty similar approach to Kafka, with some pros and cons. I think the main difference is that Kafka is battle-tested and adopted all over the world in almost every big company. Pulsar is used in a few projects, but you really need to find a good reason not to use Kafka.
      This Twitter post from December 2018 is a nice story around this discussion: blog.twitter.com/engineering/en_us/topics/insights/2018/twitters-kafka-adoption-story.html
      I don't know NATS well. But I think the key difference is that it is a messaging platform, while Kafka is an event streaming platform (including messaging, storage and processing). Therefore, Kafka is used for much more than just messaging today. For instance, almost every microservice architecture is built using Kafka because it decouples the microservices well with its feature set: storage, the event log, and really decoupled producers/consumers.
      We are not far away from Kafka being cloud-native. Stay tuned and follow the announcements and KIPs (Kafka Improvement Proposals) of the next few Kafka releases.

  • @tomw0815
    @tomw0815 3 years ago +1

    Nice explanation with a good example of host offloading. But this is "only" the technical side. I assume the most complex part will be the business definition of "what is an event", "what data do other systems need to process an event", and "how do you ensure a certain order of event processing across systems that listen to the event stream - first system A, then system C, then B"? It's not done by just putting some database changes into the event stream. No one on the other side can do anything without more process context for those changes.

    •  3 years ago +1

      It really depends on the use case. In some cases, it is "that simple". But in other scenarios, you don't integrate with the IBM DB2 on the mainframe or the SAP HANA directly (for the reasons you described), but instead integrate with a high-level API, for instance, SAP's business APIs like BAPI or iDoc.

  • @sunildevan
    @sunildevan 5 years ago

    Really a good one.
    What programming languages has it been built with? Trying to understand what skills one would need to extend or customize it.

    • @theoquasi
      @theoquasi 4 years ago

      sunil dev COBOL

  • @ayyapusettykiran7005
    @ayyapusettykiran7005 5 years ago

    Hello Kai Waehner,
    I just want to know whether Kafka handles SCD Type 1 and Type 2, which we handle in ETL. Beyond this, we use ETL for a lot more, where we implement many data warehousing concepts. Can we achieve all of that through Kafka?

    •  5 years ago

      Well, it depends on what exactly you want to do. Kafka is not a replacement for a traditional data warehouse. However, you can easily build a real-time streaming infrastructure, which can also handle and store state and structures for further analysis.
      Slowly changing dimension (SCD) Type 2 (add a new row) is the default behavior of Kafka's log. Type 1 (overwrite) is provided by the Kafka feature "compacted topics". Type 3 (new attribute) is supported via Confluent Schema Registry and Apache Avro, which have schema evolution built in.
      Thus, many features you are looking for are built into Kafka's core, but you still need to double-check whether it is the right tool for your use case.
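
      For illustration, here is a minimal sketch of creating a compacted topic (the Type 1 "overwrite" behavior mentioned above) with Kafka's Admin API; the topic name and connection details are made up:

      ```java
      import java.util.List;
      import java.util.Map;
      import org.apache.kafka.clients.admin.Admin;
      import org.apache.kafka.clients.admin.AdminClientConfig;
      import org.apache.kafka.clients.admin.NewTopic;
      import org.apache.kafka.common.config.TopicConfig;

      public class CompactedTopicSketch {
          public static void main(String[] args) throws Exception {
              // Connection details are illustrative.
              try (Admin admin = Admin.create(Map.of(
                      AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"))) {
                  // "compact" keeps only the latest value per key (SCD Type 1, overwrite);
                  // the default "delete" policy retains the full changelog (SCD Type 2).
                  NewTopic customers = new NewTopic("customer-dimension", 3, (short) 3)
                          .configs(Map.of(TopicConfig.CLEANUP_POLICY_CONFIG,
                                          TopicConfig.CLEANUP_POLICY_COMPACT));
                  admin.createTopics(List.of(customers)).all().get();
              }
          }
      }
      ```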

  • @srik006
    @srik006 3 years ago +1

    Excellent video. One question: so do you recommend Kafka only for reads? What about writes? Can we write into Kafka such that it in turn updates the mainframe?

    •  3 years ago +1

      @Srikant: No. Reads and offloading are just the simplest part of mainframe integration. Please check out the following material (blog, slides, video) about bi-directional integration between Kafka and the mainframe:
      www.kai-waehner.de/blog/2020/04/24/mainframe-offloading-replacement-apache-kafka-connect-ibm-db2-mq-cdc-cobol/

    • @srik006
      @srik006 3 years ago

      Kai Wähner, thanks.

  • @ashuetrx
    @ashuetrx 3 years ago +1

    What rate can traditional MQs handle vs. Apache Kafka?
    I understand it will also depend on the consumer rate, but let's say we take an average consumer.
    (Kafka's rate is answered at 18:08.)

    •  3 years ago +1

      The short answer is that traditional MQ can handle around 100-1,000 messages per second per broker, while Kafka handles roughly a factor of 100 more (10,000-100,000 and beyond) per broker. Some more modern MQ systems like RabbitMQ scale better, but still have various limitations regarding scalability, reliability, etc. compared to Kafka. The story for raw throughput is similar: Kafka can process gigabytes per second in a cluster. MQ systems were not built for scale; hence, each MQ deployment is typically its own island, and you would have to deploy hundreds of MQ systems to get the scale of one Kafka cluster.
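
      To make the scalability point concrete, here is a hedged sketch of the producer-side batching settings behind Kafka's per-broker throughput; the broker address, topic name, and values are illustrative starting points only:

      ```java
      import java.util.Properties;
      import org.apache.kafka.clients.producer.KafkaProducer;
      import org.apache.kafka.clients.producer.ProducerConfig;
      import org.apache.kafka.clients.producer.ProducerRecord;
      import org.apache.kafka.common.serialization.StringSerializer;

      public class ThroughputProducerSketch {
          public static void main(String[] args) {
              Properties props = new Properties();
              props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // illustrative
              props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
              props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
              // Batching and compression are the main levers behind high throughput:
              props.put(ProducerConfig.BATCH_SIZE_CONFIG, 64 * 1024);   // up to 64 KB per partition batch
              props.put(ProducerConfig.LINGER_MS_CONFIG, 10);           // wait up to 10 ms to fill batches
              props.put(ProducerConfig.COMPRESSION_TYPE_CONFIG, "lz4"); // compress whole batches
              props.put(ProducerConfig.ACKS_CONFIG, "all");             // keep durability despite the volume

              try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
                  for (int i = 0; i < 1_000_000; i++) {
                      producer.send(new ProducerRecord<>("events", Integer.toString(i), "payload-" + i));
                  }
              }
          }
      }
      ```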

  • @carloskassab2294
    @carloskassab2294 4 years ago

    Hi, I am new to data streaming technologies. I need to build a real-time ETL solution for data migrations. How do you compare Kafka, Akka Streams, and Spark Streaming in terms of cost, development time and effort, and maintenance effort?
    Which would you recommend for building my real-time ETL solution?
    Thank you in advance for your response.

    •  4 years ago +1

      There is no general answer, unfortunately. I recommend asking yourself these questions:
      - What are the required SLAs (e.g. zero downtime and zero data loss)?
      - Do you really need Akka or Spark in addition to Kafka, or can Kafka and its ecosystem (e.g. Kafka Streams / KSQL) do the job (see the sketch after this list), resulting in less complex deployment / testing / support across multiple systems? Sometimes yes, sometimes no.
      - What do you want to do with the data? Often, different consumers want to consume the data (maybe not in the first project, but in the second). Does it make sense to consume the data from a data lake (Spark stores data at rest in HDFS or S3 buckets), or do you want the option to consume in real time, near real time, or batch with any technology and programming language (which is what Kafka provides)?
      Discussing and answering these questions helps you find your answer. In short, the more mission-critical your SLAs, the fewer systems should sit in the middle for 24/7 deployments.
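
      As a rough sketch of that second question - how far Kafka's own ecosystem can go for streaming ETL without Akka or Spark - here is a minimal Kafka Streams topology; the application id and topic names are hypothetical:

      ```java
      import java.util.Properties;
      import org.apache.kafka.common.serialization.Serdes;
      import org.apache.kafka.streams.KafkaStreams;
      import org.apache.kafka.streams.StreamsBuilder;
      import org.apache.kafka.streams.StreamsConfig;
      import org.apache.kafka.streams.kstream.KStream;

      public class StreamingEtlSketch {
          public static void main(String[] args) {
              Properties props = new Properties();
              props.put(StreamsConfig.APPLICATION_ID_CONFIG, "streaming-etl-sketch"); // hypothetical
              props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");    // illustrative
              props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
              props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

              StreamsBuilder builder = new StreamsBuilder();
              KStream<String, String> raw = builder.stream("raw-events");  // hypothetical source topic
              raw.filter((key, value) -> value != null)                    // extract/cleanse
                 .mapValues(String::trim)                                  // transform
                 .to("curated-events");                                    // load into the target topic

              KafkaStreams streams = new KafkaStreams(builder.build(), props);
              streams.start();
              Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
          }
      }
      ```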

  • @dmitriishapkin8578
    @dmitriishapkin8578 3 years ago

    Hi, could you explain why having one integration team, instead of keeping specialists in the development teams, has become an anti-pattern? Thank you.

    •  3 years ago

      In most bigger organizations (or even projects), it creates a single bottleneck (both people and technology). For smaller projects, one single integration team is fine, of course.

  • @TheGeoDaddy
    @TheGeoDaddy 3 years ago +1

    My basic question is: what is in Kafka that you can't "code your own" if you already have MQ implemented on the IBM mainframe and on WebSphere across a distributed platform? Yes, Kafka is battle-tested and provides higher-level user interface/functionality. But there's something to be said for keeping knowledge of something so application-dependent in-house...

    •  3 years ago +1

      As always in the software business, it depends; there is no short answer to this question. Of course, you can always code your own stuff.
      Against IBM MQ, #1 is scalability - i.e. if you need to process high volumes of data. That cannot be done with IBM MQ.
      However, all the additional capabilities like data processing and data integration in Kafka also help "out of the box" in a reliable and scalable way. You can code your own solution (or combine different products from the IBM WebSphere portfolio), but that is more complex, probably more expensive, and does not scale well (for many use cases).
      To be clear, there are also some use cases which only IBM MQ can handle. For instance, if you need transactional integration with IMS on the mainframe, then Kafka is probably not the right choice, but rather complementary. I see many customers integrating IBM MQ and Kafka (e.g. via Confluent's IBM MQ Connector) to leverage the best of both worlds.

    • @TheGeoDaddy
      @TheGeoDaddy 3 years ago

      Thank you (Vielen Dank?).
      I asked the question BEFORE getting through the entire video, and that was pretty much the caveat at the end... I was specifically thinking about the last scenario, where the bank still uses IMS as its main data repository that runs the critical batch every night and has all the "answers" by the AM. The bank uses Kafka, but more for fringe applications and experimentation... The "meat & potatoes" is millions of transactions coming in - in bitstrings that require Assembler code to pre-process before we can even use COBOL to update IMS... and that has years of trial and error and fixes that would all have to be experienced again... going to any other system... because NO ONE really knows ALL it does... it just works... and we can ETL, MQ, and Kafka the results. 😏

    •  3 years ago +1

      @@TheGeoDaddy Also check out this blog (including slides and video) for more details about mainframe integration (it even shows a 3rd party tool that can do transactional end-to-end integration between Mainframe and Kafka):
      www.kai-waehner.de/blog/2020/04/24/mainframe-offloading-replacement-apache-kafka-connect-ibm-db2-mq-cdc-cobol/

    • @TheGeoDaddy
      @TheGeoDaddy 3 years ago

      Thanx!

  • @vadymmishchenko67
    @vadymmishchenko67 3 years ago

    I don't understand how, for example, TIBCO EMS, TIBCO RV (no broker), or ActiveMQ has less of an event-based nature than Kafka.
    Where is the difference?

    •  3 years ago

      Both are event-based; there is no difference regarding this paradigm. However, Kafka is not just a message queue (where messages are deleted once consumed). Instead, Kafka also persists the data after consumption. This way, one consumer can consume in real time while another consumes at a later point in time (batch, request-response, etc.). That gives you real decoupling between producers and consumers. Other differences to TIBCO and ActiveMQ are much higher scalability, rolling upgrades, etc.
      Also check out this article for more details: www.confluent.io/blog/apache-kafka-vs-enterprise-service-bus-esb-friends-enemies-or-frenemies/
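
      A small sketch of that decoupling, assuming a hypothetical topic "orders": because the log persists events after consumption, a second, independent consumer group can replay the retained history later by setting auto.offset.reset to "earliest":

      ```java
      import java.time.Duration;
      import java.util.List;
      import java.util.Properties;
      import org.apache.kafka.clients.consumer.ConsumerConfig;
      import org.apache.kafka.clients.consumer.ConsumerRecord;
      import org.apache.kafka.clients.consumer.KafkaConsumer;
      import org.apache.kafka.common.serialization.StringDeserializer;

      public class LateBatchConsumerSketch {
          public static void main(String[] args) {
              Properties props = new Properties();
              props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // illustrative
              props.put(ConsumerConfig.GROUP_ID_CONFIG, "nightly-batch");     // hypothetical, independent group
              props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest"); // replay retained history
              props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
              props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());

              try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
                  consumer.subscribe(List.of("orders")); // hypothetical topic
                  // Events a real-time group already consumed are still in the log,
                  // so this group can process the very same data hours or days later.
                  for (ConsumerRecord<String, String> record : consumer.poll(Duration.ofSeconds(5))) {
                      System.out.printf("offset=%d key=%s value=%s%n",
                              record.offset(), record.key(), record.value());
                  }
              }
          }
      }
      ```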

  • @timbeil
    @timbeil 3 years ago

    Do you know Smalltalk(-80)? Events and messages have already been out there all along, since long ago.

    •  3 years ago

      Did you watch the video? I also say that messaging has existed for 20+ years. And if you compare the content of this talk to Smalltalk, then you are comparing not apples and oranges, but apples and chocolate.

  • @normanfung7124
    @normanfung7124 3 years ago

    34:15 - Kafka is not best for real-time communication with latency in the microseconds range? I haven't used Kafka, but I'd like to hear more on this.

    •  3 years ago +1

      It is always important to define the term and requirement "real-time". Kafka can process events end-to-end (from producer via broker to consumer) in 10+ milliseconds (even at millions of events per second). Hence, if you need faster processing (e.g. trading on the stock market), other (proprietary) products have to be used.
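
      For context, these are the producer knobs that trade throughput for latency (a sketch with illustrative values) - and even tuned this way, Kafka stays in the millisecond range, not microseconds:

      ```java
      import java.util.Properties;
      import org.apache.kafka.clients.producer.ProducerConfig;

      public class LowLatencyProducerConfigSketch {
          public static Properties lowLatencyProps() {
              Properties props = new Properties();
              props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // illustrative
              props.put(ProducerConfig.LINGER_MS_CONFIG, 0);             // send immediately, don't wait for batches
              props.put(ProducerConfig.COMPRESSION_TYPE_CONFIG, "none"); // skip compression CPU cost
              props.put(ProducerConfig.ACKS_CONFIG, "1");                // leader-only ack saves a replication round trip
              // Broker round trips plus JVM and OS scheduling still keep end-to-end
              // latency in the low-millisecond range; true microsecond messaging
              // needs specialized (often proprietary) products.
              return props;
          }
      }
      ```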

    • @Pifagorass
      @Pifagorass 2 years ago

      Real-time doesn't mean low latency. For low-latency solutions, one can look at:

    • @Pifagorass
      @Pifagorass 2 years ago

      @ www.quora.com/What-are-some-alternatives-to-Apache-Kafka/answer/Pranas-Baliuka?ch=10&oid=279463270&share=1452513f&srid=hM0m&target_type=answer

    • @Pifagorass
      @Pifagorass 2 years ago

      Active-passive is legacy. Read about island architecture before throwing such phrases around as a marketing pitch...

    • @Pifagorass
      @Pifagorass 2 years ago

      "Kafka as caching layer" - hmmm? I won't downvote, but next time consider giving the marketing slides to the technical team for review ;)

  • @mitenmehta79
    @mitenmehta79 4 years ago +1

    The whole agenda was described, but even by the end it was not really clear how existing middleware can be replaced easily with Kafka without any loss of features. Also, Kafka is pull-only, so if existing clients are push-based, how do you replace them easily?

    •  4 years ago

      This is just a high-level talk; I did not cover a specific replacement in detail. It depends on how much of the existing middleware you want or need to replace. In general, push vs. pull has important differences, and the Kafka API provides the technical details to handle things as you expect. But at a high level, you just replace the JMS push-based consumer with the Kafka pull-based consumer and still receive all messages.
      For example, you could replace a JMS-based broker with Kafka completely and just use the Confluent JMS Client instead of the existing one (docs.confluent.io/current/clients/kafka-jms-client/index.html). It implements JMS with Kafka under the hood. This way you don't even need to change the client implementation (but note that it has some limitations in feature support).
      Another option is to keep the existing producers running against the JMS broker but consume with a Kafka consumer instead, using the general Kafka Consumer API.
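
      For that last option, a push-style JMS MessageListener maps onto Kafka's pull model roughly like this (a sketch; the group id and topic name are made up):

      ```java
      import java.time.Duration;
      import java.util.List;
      import java.util.Properties;
      import org.apache.kafka.clients.consumer.ConsumerConfig;
      import org.apache.kafka.clients.consumer.ConsumerRecord;
      import org.apache.kafka.clients.consumer.KafkaConsumer;
      import org.apache.kafka.common.serialization.StringDeserializer;

      public class JmsListenerReplacementSketch {
          public static void main(String[] args) {
              Properties props = new Properties();
              props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // illustrative
              props.put(ConsumerConfig.GROUP_ID_CONFIG, "former-jms-subscriber");   // hypothetical
              props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
              props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());

              try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
                  consumer.subscribe(List.of("orders")); // hypothetical topic
                  while (true) {
                      // The poll loop is the pull-side equivalent of MessageListener.onMessage():
                      // the client fetches batches instead of the broker pushing single messages.
                      for (ConsumerRecord<String, String> record : consumer.poll(Duration.ofMillis(100))) {
                          handle(record.value()); // the former onMessage() body goes here
                      }
                  }
              }
          }

          private static void handle(String payload) {
              System.out.println("processing: " + payload);
          }
      }
      ```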

    • @gopalakrishnans2003
      @gopalakrishnans2003 3 years ago

      Thank you. I think Workday Studio also uses Kafka via Confluent. I'm a beginner here. Very interesting presentation. How can I learn more / start from zero on Apache Kafka?

  • @p0rti100
    @p0rti100 4 years ago +6

    Nice slides... Small typo that made me laugh: "Eat your own dog GOOD"!

  • @onewizzard
    @onewizzard 3 years ago +1

    What happens when you need production support due to a bug?

    •  3 years ago +2

      Various vendors support Apache Kafka, including Confluent, IBM, TIBCO, Cloudera, and others. The quality of support and expertise differs significantly. Also, most vendors don't support the full Kafka solution but exclude features like Kafka Streams or Exactly-Once-Semantics. Confluent is the leading Kafka vendor due to its huge commitment and focus on this Apache project.

    • @onewizzard
      @onewizzard 3 years ago

      @ We use the Spring Framework; I'd be interested in that tutorial.

  • @Calphool222
    @Calphool222 3 years ago +1

    "paradigm" is pronounced "peh-ruh-daim" not "para-dig-um"

  • @jianchiwei5379
    @jianchiwei5379 2 years ago

    Event: pronounce it with the stress on the second syllable; the same goes for the word "development".

  • @danielkrajnik8865
    @danielkrajnik8865 3 years ago

    Ooh, it's "event" - that doesn't sound new at all.

  • @Pifagorass
    @Pifagorass 2 years ago

    Reading the Kafka topic again and again for each training epoch... it wasn't a good example at all ☺️

  • @Pifagorass
    @Pifagorass 2 years ago

    95% correct, but for the rest I was tempted to give a thumbs down. E.g. "legacy active-passive", or "machine learning training by reading Kafka over and over again", or using Kafka as a "caching layer" for direct queries by web clients, or that one can easily migrate from Confluent Platform to open source Apache Kafka...

    •  2 years ago +1

      There are always different perspectives. Each of the points you mention is worth its own discussion or presentation. And terms like "easily" can be interpreted differently. Hence, thanks for your comment. I will try to be clearer on these topics in future talks.

    • @Pifagorass
      @Pifagorass 2 years ago

      @ Yes, consider expanding on such statements, e.g. how Kafka can contribute to ML, or why one would want Raft for leader election. It's never easy to escape vendor lock-in; genuine marketing should not claim otherwise, and explaining the benefits of the commercial solution without being deceiving would be the better direction.

    •  2 years ago +1

      @@Pifagorass For ML, I have plenty of blog posts and videos; just google "Kafka Machine Learning". Here is a talk that also covers model training from Kafka topics in more detail: th-cam.com/video/Ug7sOMWUUak/w-d-xo.html
      For the vendor lock-in discussion, it is not just marketing. I agree that it is not free (meaning it takes some effort), but it is much easier to migrate from Confluent Platform to Apache Kafka than, e.g., from Oracle to MySQL or from IBM MQ to RabbitMQ, as the heart of CP is AK, i.e. the same code and infrastructure. But I agree that it is important to point out that the migration is not just a button click :-)

    • @Pifagorass
      @Pifagorass 2 years ago

      @ A more focused video makes sense, since "consume from Kafka over and over again" can be interpreted as replaying the topic for each training epoch. Thanks 👍 for the more specific video.

  • @kiliandietrich8526
    @kiliandietrich8526 3 years ago

    Great content, really great video! You could work on your pronunciation, however...

  • @huyenhuyen4091
    @huyenhuyen4091 2 years ago

    The sound is not good; I am quite disappointed.