Common Apache Kafka Mistakes to Avoid

  • Published Jun 27, 2024
  • cnfl.io/podcast-episode-221 | What are some of the common mistakes that you have seen with Apache Kafka® record production and consumption? Nikoleta Verbeck (Principal Solutions Architect at Professional Services, Confluent) has a role that specifically tasks her with performance tuning as well as troubleshooting Kafka installations of all kinds. Based on her field experience, she put together a comprehensive list of common issues with recommendations for building, maintaining, and improving Kafka systems that are applicable across use cases.
    Kris and Nikoleta begin by discussing the fact that teams migrating to Kafka from other message brokers often create too many producers, rather than one per service. The Kafka producer is thread-safe, and a single producer instance can write to multiple topics, unlike traditional message brokers, where you might use one client per topic.
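    A minimal sketch of that pattern with the Java client follows; the broker address, topic names, and class name are illustrative rather than taken from the episode:

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class SharedProducer {
    // One KafkaProducer per service: the client is thread-safe and can be
    // shared across threads and across topics.
    private static final KafkaProducer<String, String> PRODUCER = createProducer();

    private static KafkaProducer<String, String> createProducer() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // assumption: local broker
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());
        return new KafkaProducer<>(props);
    }

    public static void main(String[] args) {
        // The same instance writes to different topics from anywhere in the service.
        PRODUCER.send(new ProducerRecord<>("orders", "order-1", "created"));
        PRODUCER.send(new ProducerRecord<>("payments", "payment-1", "authorized"));
        PRODUCER.close(); // flushes any outstanding batches on shutdown
    }
}
```
    Sharing one instance also lets the producer batch and compress records more effectively than a fleet of per-topic producers would.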
    Monitoring is an unabashed good in any Kafka system. Nikoleta notes that it is better to monitor from the start of your installation as thoroughly as possible, even if you don't think you ultimately will require so much detail, because it will pay off in the long run. A major advantage of monitoring is that it lets you predict your potential resource growth in a more orderly fashion, as well as helps you to use your current resources more efficiently. Nikoleta mentions the many dashboards that have been built out by her team to accommodate leading monitoring platforms such as Prometheus, Grafana, New Relic, Datadog, and Splunk.
    They also discuss a number of useful features that are optional in Kafka, and which people therefore tend to be unaware of. Compression is the first of these, and Nikoleta absolutely recommends that you enable it. Another is producer callbacks, which you can use to catch exceptions. A third is setting a `ConsumerRebalanceListener`, which notifies you about rebalance events, letting you prepare for any issues that may result from them.
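    A hedged sketch of those three features with the Java clients: compression and a send callback on the producer, and a `ConsumerRebalanceListener` on the consumer. The broker address, topic, group id, and the choice of lz4 are assumptions for illustration:

```java
import java.time.Duration;
import java.util.Collection;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRebalanceListener;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.apache.kafka.common.serialization.StringSerializer;

public class OptionalButUseful {
    public static void main(String[] args) {
        // --- Producer: compression plus a send callback ---
        Properties prodProps = new Properties();
        prodProps.put("bootstrap.servers", "localhost:9092"); // assumption: local broker
        prodProps.put("key.serializer", StringSerializer.class.getName());
        prodProps.put("value.serializer", StringSerializer.class.getName());
        prodProps.put("compression.type", "lz4"); // off ("none") by default; codec here is an example

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(prodProps)) {
            producer.send(new ProducerRecord<>("events", "key", "value"), (metadata, exception) -> {
                // The callback is where asynchronous send failures surface.
                if (exception != null) {
                    System.err.println("Send failed: " + exception);
                } else {
                    System.out.printf("Wrote to %s-%d@%d%n",
                            metadata.topic(), metadata.partition(), metadata.offset());
                }
            });
        }

        // --- Consumer: get notified about rebalances ---
        Properties consProps = new Properties();
        consProps.put("bootstrap.servers", "localhost:9092"); // assumption
        consProps.put("group.id", "example-group");           // assumption
        consProps.put("key.deserializer", StringDeserializer.class.getName());
        consProps.put("value.deserializer", StringDeserializer.class.getName());

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(consProps)) {
            consumer.subscribe(List.of("events"), new ConsumerRebalanceListener() {
                @Override
                public void onPartitionsRevoked(Collection<TopicPartition> partitions) {
                    // Flush state / commit offsets before losing these partitions.
                    System.out.println("Revoked: " + partitions);
                }

                @Override
                public void onPartitionsAssigned(Collection<TopicPartition> partitions) {
                    // Rebuild any per-partition state here.
                    System.out.println("Assigned: " + partitions);
                }
            });
            consumer.poll(Duration.ofMillis(100));
        }
    }
}
```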
    Other topics covered in the episode are batching and the `linger.ms` Kafka producer setting, how to figure out your units of scale, and Trogdor, Kafka's test framework for fault injection and workload generation.
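    As an illustration of the batching discussion, a short producer-config sketch; the specific values are examples to tune, not recommendations from the episode:

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.common.serialization.StringSerializer;

public class BatchingConfig {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // assumption: local broker
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());

        // linger.ms: how long the producer waits for more records before sending a batch.
        // 0 (the default) favours latency; a few milliseconds lets batches fill up.
        props.put("linger.ms", "5");      // example value, tune for your workload
        props.put("batch.size", "65536"); // per-partition batch buffer in bytes (default 16384)

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // ... produce as usual; batching happens transparently per partition.
        }
    }
}
```
    Raising `linger.ms` trades a little latency for fuller batches, which tends to improve both throughput and compression ratios.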
    EPISODE LINKS
    ► 5 Common Pitfalls When Using Apache Kafka: cnfl.io/5-common-pitfalls-whe...
    ► Kafka Internals course: cnfl.io/internals-101-episode...
    ► linger.ms producer configs: cnfl.io/linger-ms-episode-221
    ► Fault Injection-Trogdor: cwiki.apache.org/confluence/d...
    ► From Apache Kafka to Performance in Confluent Cloud: cnfl.io/journey-from-apache-k...
    ► Kafka Compression: cwiki.apache.org/confluence/d...
    ► Interface ConsumerRebalanceListener: kafka.apache.org/24/javadoc/i...
    ► Nikoleta Verbeck’s Twitter: / nerdynikoleta
    ► Kris Jenkins’ Twitter: / krisajenkins
    ► Streaming Audio Playlist: • Streaming Audio Podcas...
    ► Join the Confluent Community: cnfl.io/confluent-community-e...
    ► Learn more with Kafka tutorials, resources, and guides at Confluent Developer: cnfl.io/confluent-developer-e...
    ► Use PODCAST100 to get $100 of free Confluent Cloud usage: cnfl.io/try-cloud-episode-221
    ► Promo code details: cnfl.io/podcast100-details-ep...
    TIMESTAMPS
    0:00 - Intro
    1:17 - What is a Solutions Architect
    2:20 - It's a problem to use multiple producers in a single service
    6:19 - The trade-off between throughput and latency with batching
    8:05 - What is linger.ms
    15:00 - Enable compression
    25:19 - Define Producer Callbacks
    33:16 - One consumer per thread in a single service instance
    41:45 - Trogdor
    43:37 - Over Committing
    55:48 - Provide a `ConsumerRebalanceListener`
    1:00:16 - Undersized Kafka Consumer instances
    1:07:28 - It's a wrap
    ABOUT CONFLUENT
    Confluent is pioneering a fundamentally new category of data infrastructure focused on data in motion. Confluent’s cloud-native offering is the foundational platform for data in motion - designed to be the intelligent connective tissue enabling real-time data, from multiple sources, to constantly stream across the organization. With Confluent, organizations can meet the new business imperative of delivering rich, digital front-end customer experiences and transitioning to sophisticated, real-time, software-driven backend operations. To learn more, please visit www.confluent.io.
    #streamprocessing #apachekafka #kafka #confluent
  • Science & Technology

Comments • 14

  • @ragingpahadi
    @ragingpahadi 4 months ago +2

    Very informative video 🎉

  • @mateuszkopij4120
    @mateuszkopij4120 2 years ago +5

    As always, great tips, thanks!

  • @diegoferreirati
    @diegoferreirati 9 months ago +1

    Amazing talk! Keep up

  • @HenrykSzlangbaum
    @HenrykSzlangbaum 2 years ago +2

    Great discussion

  • @mathieugauthron3744
    @mathieugauthron3744 4 months ago

    Kris, you're a star. Great video.

  • @debabhishek
    @debabhishek 29 days ago

    All the points are interesting. I had been wondering whether, after the consumer fetch, a thread pool could be used to speed up processing, and this was good validation. Another interesting point is over-committing by consumers. Does that mean I don't need to commit (or ack) every record? Suppose my consumer is reading from topics A and B (each with two partitions): is it enough to commit just the last offset for A1, A2, B1, and B2? Even though I am processing many records from those topic partitions, I would only commit (ack) the latest offset for each partition. @confluent please correct me if I am wrong.
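    For reference, Kafka offset commits are per-partition and cumulative, so committing only the latest processed offset (plus one) for each partition covers every earlier record in that partition. A minimal sketch with the Java consumer, using hypothetical topic and group names:

```java
import java.time.Duration;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.common.serialization.StringDeserializer;

public class PerPartitionCommit {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // assumption: local broker
        props.put("group.id", "example-group");           // assumption
        props.put("enable.auto.commit", "false");         // commit manually
        props.put("key.deserializer", StringDeserializer.class.getName());
        props.put("value.deserializer", StringDeserializer.class.getName());

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("A", "B"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                Map<TopicPartition, OffsetAndMetadata> latest = new HashMap<>();
                for (ConsumerRecord<String, String> record : records) {
                    process(record);
                    // Track only the highest offset seen per partition; the committed
                    // offset is the *next* record to read, hence +1.
                    latest.put(new TopicPartition(record.topic(), record.partition()),
                               new OffsetAndMetadata(record.offset() + 1));
                }
                if (!latest.isEmpty()) {
                    consumer.commitSync(latest); // one commit per partition, not per record
                }
            }
        }
    }

    private static void process(ConsumerRecord<String, String> record) {
        // placeholder for application logic
    }
}
```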

  • @debabhishek
    @debabhishek 29 days ago

    One small detail I am still searching for is how fetch / consumer poll works. Say the consumer is subscribed to one or more topics with more than one partition, and the partition leaders sit on different brokers (I don't know whether you read from leaders or from the ISR list, but even if you read from ISRs, they may fall on different brokers). What does the consumer do in that case: send multiple requests to different brokers, collate the results, and present them to the client? And what if one broker responds slowly, or not at all? If a slow broker is simply ignored, it may keep responding slowly and be silently ignored. Could you please write a line or two about how the consumer fetch works?

  • @HenrykSzlangbaum
    @HenrykSzlangbaum 2 years ago +3

    Ya, I'm sure people only start caring about batch sizes after buying millions in hardware.

  • @abhinee
    @abhinee 2 years ago

    What a great discussion

  • @anandperi7060
    @anandperi7060 7 months ago

    I believe compression has to happen at the individual message/event level, otherwise records can't be written to the correct partition. I'm not sure compression happens at the batch level, as the talk leads us to believe.

    • @krisajenkins
      @krisajenkins 7 months ago +1

      No, that's not correct. The producer allocates a batch per partition for exactly this reason. Compression happens at the batch level, before the batch is sent to its allocated partition.
      Trust me, Nikoleta knows this stuff inside out. 🙂

  • @lexluther-1169
    @lexluther-1169 1 year ago +3

    The more I learn, the less I know about Kafka.

  • @anandperi7060
    @anandperi7060 7 months ago

    I'm new to Kafka, but autoscaling is such a basic concept nowadays... why can't you add more brokers and disk when load increases, based on some metric, and scale back later? Agreed, some rebalancing of partitions etc. needs to happen, so scaling down may not be as simple, but that is because Kafka seems to have coupled compute with storage in its architecture. Having a side cluster and everything else I hear about seems ugly, IMO.