System Design: Why is Kafka fast?

  • Published on 28 Jun 2022
  • Weekly system design newsletter: bit.ly/3tfAlYD
    Check out our bestselling System Design Interview books:
    Volume 1: amzn.to/3Ou7gkd
    Volume 2: amzn.to/3HqGozy
    Other things we made:
    Digital version of System Design Interview books: bit.ly/3mlDSk9
    Twitter: bit.ly/3HqEz5G
    LinkedIn: bit.ly/39h22JK
    ABOUT US:
    Covering topics and trends in large-scale system design, from the authors of the best-selling System Design Interview series.

Comments • 460

  • @ByteByteGo
    @ByteByteGo a year ago +225

    Subscribe and Kafka will say thank you :)

    • @tubenzr
      @tubenzr a year ago

      ok, it's done Sir

    • @DrRishabhGarg
      @DrRishabhGarg a year ago +3

      What software do you use to create these awesome motion graphics?

    • @rpidugu99
      @rpidugu99 a year ago +4

      May I know what tool you guys use to make these animated videos? Just curious..!!

    • @ropro9817
      @ropro9817 a year ago +1

      I just discovered this video in my feed. _Sometimes_ the YouTube algorithm actually works! 🤠 Great video! I just subscribed to your channel!

    • @colossus95
      @colossus95 a year ago

      I wish you were my professor in college.

  • @jay6645
    @jay6645 a year ago +1732

    The absence of any background music makes this video great.

  • @kurtmueller2089
    @kurtmueller2089 a year ago +498

    What an amazing tutorial: Just the necessities, no annoying background music, no annoying calls to "subscribe and like".
    If all youtube channels were like that, we could heal the world.
    Also, I checked your channel page and was shocked to find that this was only your 3rd video.
    Keep being awesome!

    • @martinmusli3044
      @martinmusli3044 a year ago +10

      This tutorial is insanely "Zen", but he said "please subscribe" right at the end :P

  • @ridealongreactions2601
    @ridealongreactions2601 a year ago +40

    I 100% believe you should make a whole series on Kafka, your way of simplifying the subject is legendary.

  • @Spiritualleace
    @Spiritualleace a year ago +31

    How can one keep things so deep and yet so stunningly simple? Hats off!

  • @nemeziz_prime
    @nemeziz_prime a year ago +240

    These videos are amazingly simple and clear. The animations are spot on!! Too good xD I hope this channel never stops uploading new content.

  • @ervamate
    @ervamate a year ago +15

    Mentioned a lot in the comments, but I have to say as well: what a great explanation, straight to the point, no bs and gives enough info without overwhelming with details. Thank you!

  • @Athmarr
    @Athmarr a year ago +5

    I have used Kafka before but never had to think about why it is actually fast. This was very informative. I like the format of the video as well.

  • @dishantchauhan4775
    @dishantchauhan4775 a year ago +6

    Seriously, thanks a lot Alex for all the stuff you convey through your LinkedIn network and YouTube videos. I just love the way you distil the topics and make them so easy to understand.

  • @pranavamazon5937
    @pranavamazon5937 a year ago +2

    this guy is so sweet. man! i was struggling on this system design, all his books and posts are too easy to follow and helped me become more confident

  • @MrRunchSlam
    @MrRunchSlam a year ago +8

    You guys are doing amazing work here. I love the aesthetics, pace, explanations, topics, and cadence of it all. Kudos!

  • @severtone263
    @severtone263 a year ago +1

    No frills and thrills, just pure nuggets of value. Exactly what I needed. Thank you. You earned my sub.

  • @fripickbot4043
    @fripickbot4043 a year ago +1

    Man this is gold. Saying thank you does not feel enough. Please keep it up.

  • @nishantparmar
    @nishantparmar a year ago +41

    Short, high quality, clean and extremely precise content...Many Thanks!

  • @DanteS-119
    @DanteS-119 a year ago

    Lol, I heard you talk for about 10 seconds and subscribed. The tone of voice, the kind of explanation, the details, the video content, all of that quality conveyed in just a few seconds. Excellent content. Great stuff.

  • @SaitamaTheLegend
    @SaitamaTheLegend a year ago +8

    In 5 minutes I learned a lot! Amazing video!
    You are a good teacher!
    Thank you and I hope to see more videos from you!

  • @ducquang980
    @ducquang980 a year ago +4

    Short, concise and concrete. Very easy to understand. Thanks a lot

  • @sami9323
    @sami9323 a year ago +2

    Absolutely fantastic video - went over a lot of concepts like minimizing disk I/O, the engineering constraints of Kafka, and different memory access patterns, with very good diagrams! Thank you :)

  • @Drdemiurge
    @Drdemiurge a year ago +1

    So glad the algorithm found this channel for me, the content is so clear and digestible, thank you please keep up the fantastic work

  • @aryanrahman3212
    @aryanrahman3212 a year ago +4

    Really great presentation! I was scared when I saw Kafka but you explained it really well.

  • @toukaK
    @toukaK a year ago +6

    Excited to see Sahn on YouTube!
    This is by far the best tech video I've watched: concise without losing any depth! Looking forward to more videos like this.
    I've had the good fortune to (indirectly) work with Sahn and review his code. He is one of the few top talents that any company is lucky to have. This video is as high quality as his other work.
    2 questions for Sahn:
    1. There's a small gap between the "sequential I/O throughput vs random I/O throughput" point and the "HDD vs SSD" point. Is there any measured difference in sequential I/O throughput between HDD and SSD?
    2. Is there any measured difference (ops per sec or latency) between zero-copy and traditional buffer copies?
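
The first question above can be explored empirically. The following is a minimal Python sketch, not a rigorous benchmark: it times the same set of block writes issued in sequential order and in shuffled order. The names `timed_writes` and `compare`, the 4 KiB block size, and the block count are all assumptions chosen for illustration. Results depend heavily on the device (HDD vs SSD), the filesystem, and the page cache, and the kernel may reorder or merge writes before they reach the disk, so no figures are claimed here. It relies on `os.pwrite`, which is available on Unix-like systems.

```python
import os
import random
import tempfile
import time

BLOCK = 4096   # bytes per write (assumed block size)
COUNT = 2048   # number of blocks, 8 MiB in total


def timed_writes(path, offsets):
    """Write one BLOCK at each offset, fsync once, and return elapsed seconds."""
    payload = b"x" * BLOCK
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        start = time.perf_counter()
        for off in offsets:
            os.pwrite(fd, payload, off)
        os.fsync(fd)  # flush dirty pages to the device before stopping the clock
        return time.perf_counter() - start
    finally:
        os.close(fd)


def compare(directory):
    """Time sequential vs shuffled writes of the same blocks in `directory`."""
    sequential = [i * BLOCK for i in range(COUNT)]
    shuffled = sequential[:]
    random.Random(0).shuffle(shuffled)  # fixed seed so runs are repeatable
    seq_path = os.path.join(directory, "seq.bin")
    rnd_path = os.path.join(directory, "rnd.bin")
    return {
        "sequential_s": timed_writes(seq_path, sequential),
        "random_s": timed_writes(rnd_path, shuffled),
        "seq_size": os.path.getsize(seq_path),
        "rnd_size": os.path.getsize(rnd_path),
    }


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        print(compare(d))
```

To see a meaningful gap, the file would need to be much larger than memory or the test would need to bypass the page cache; this sketch only shows the shape of such a measurement.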

  • @fahuwayne8067
    @fahuwayne8067 a year ago +1

    You have an extremely clear and nice way to talk and explain! Please make more videos like that. Awesome work!

  • @riadhgharbi7985
    @riadhgharbi7985 a year ago +1

    Very simple and efficient execution, talking about both the video and Kafka. Really good material mate, keep up the good work

  • @mwaikul
    @mwaikul a year ago +1

    Amazing! Love the quality and getting straight to the point. Not a second wasted.

  • @jiajunc-yw3rn
    @jiajunc-yw3rn 2 months ago

    You made me realize the importance of expressing thought in a clear and concise way. Thank you

  • @jigneshnakhva1546
    @jigneshnakhva1546 a year ago +1

    I love all the System-design Content posted by you!
    Thanks for sharing your knowledge! 🙏

  • @_Documentation
    @_Documentation a year ago +1

    Succinct.
    Precise.
    Educative.
    Excellent animation.
    Simply the best 💯

  • @sherhy3689
    @sherhy3689 a year ago +1

    i wanted to comment that i appreciate the level of detail in the explanations in the video.
    looking forward to more useful content!

  • @ChandraShekhar-by3cd
    @ChandraShekhar-by3cd a year ago +3

    Loved the animation and explanation. Keep enlightening us all!

  • @143Support
    @143Support 7 months ago +1

    This is not the same Kafka I was expecting, but happy to learn. thanks for sharing!

  • @gopalkrushnapattanaik3232
    @gopalkrushnapattanaik3232 a year ago +1

    Short, crisp and to-the-point content. Great work!!

  • @GiacomoPetronio
    @GiacomoPetronio a year ago +1

    5 minutes of high quality content, thanks!

  • @RunOfTheTrill
    @RunOfTheTrill a year ago +3

    A truly educational and concise video.
    Thank you.

  • @tomislavkristianoliveirabi9873
    @tomislavkristianoliveirabi9873 a year ago +2

    Exactly my kind of content. Interesting, insightful and to the point.

  • @andyserrato
    @andyserrato a year ago +1

    So simple yet so powerful explanation, thanks

  • @constantfear
    @constantfear a year ago +1

    Thanks, brilliant tutorial. My company is currently gearing up to adopt a data mesh architecture and it's gonna be fun moving from batch to this CDC stream methodology.

  • @Youvko
    @Youvko a year ago

    Wow, this one is super cool. No background music, cool minimalistic diagrams, calm voice!

  • @vikingthedude
    @vikingthedude a year ago +1

    I love the format of these videos. Looking forward to more and to the newsletters too!

  • @dansokolsky3963
    @dansokolsky3963 a year ago +2

    We need so much more of this.

  • @adamyatripathi2743
    @adamyatripathi2743 a year ago +1

    My head exploded with the DMA. I had no idea! Great learning! :)

  • @parthsarthisharma4163
    @parthsarthisharma4163 a year ago +1

    Crisp yet complete info. Good content. Thank You.

  • @codygaurav6384
    @codygaurav6384 a year ago +1

    concise and crisp clear... Thanks for making such amazing and valuable videos.

  • @StephenGillie
    @StephenGillie a year ago +1

    This helps to explain why the sequential read speed of HDDs is on the AWS Cloud Solutions Architect study guides.

  • @JisKriker
    @JisKriker a year ago +1

    wow. No BS, only content! Thank you!

  • @John-jd2tu
    @John-jd2tu a year ago +1

    Very simple and clear! Thank you!

  • @suman14san
    @suman14san a year ago

    Stunning. It's not about any particular topic in computer science or tech: if anyone teaches me anything like this, I will skip everything else and learn. Thank you for changing people's lives.

  • @thalathotitharunprabhakar3390
    @thalathotitharunprabhakar3390 a year ago

    Thank you for the wonderful explanation of Kafka's abilities.

  • @dowlathbashag65
    @dowlathbashag65 a year ago +1

    Awesome, the explanation of Kafka is amazing... Thank you, Alex

  • @vikram_saha7
    @vikram_saha7 a year ago +1

    Wow!! This channel is a goldmine for backend engineers.

  • @AnkitMalhotra
    @AnkitMalhotra a year ago +1

    Nice, I definitely learned something new about the Kafka internals today!

  • @lifessummerleaves
    @lifessummerleaves a year ago +2

    Very deep insight! Looking forward to your next videos, please keep going

  • @playniuniu
    @playniuniu a year ago +1

    Great video, it explains Kafka's design so clearly. Thanks very much.

  • @Metruzanca
    @Metruzanca a year ago +1

    This is explained so well. I'd love to hear you speak more about Kafka.
    EDIT: 100% adding that newsletter to my RSS.

  • @lcch12
    @lcch12 a year ago

    Amazing work guys! I'm subscribed to any newsletter and video you make, and it's worth it. Congratulations team 👏👏👏

  • @amaelftah
    @amaelftah a year ago +1

    These really are high-quality videos with lovely animations... thanks a lot for simplifying why Kafka is fast.

  • @akbarsha03
    @akbarsha03 a year ago +1

    Great work! Easy to understand the concept. Thank you

  • @weiguo6805
    @weiguo6805 a year ago

    Greatest video series, with fluent + clear + intuitive illustration (master quality). Cannot thank you enough!

  • @antirus5481
    @antirus5481 a year ago +1

    Simple and very insightful, I like the lack of music and the use of motion graphics, helps me focus.

  • @NuncNuncNuncNunc
    @NuncNuncNuncNunc a year ago +1

    Very clear explanation. Thank You!

  • @tubenzr
    @tubenzr a year ago +1

    your video is very clear and on-point Sir, thanks a lot 👍👍

  • @siruitao
    @siruitao a year ago +1

    Thanks for the useful instruction!

  • @TBadalov
    @TBadalov 8 months ago

    Thank you! Such a great delivery and explanation. Particularly, great choice of aspects to share.

  • @NaqushabNeyazee
    @NaqushabNeyazee a year ago

    Short and Sweet! Excellent video.

  • @gui1221000
    @gui1221000 a year ago +1

    This is so amazing! Straight to the point!

  • @_rd_kocaman
    @_rd_kocaman a year ago

    Those minimalistic graphics make complicated topics easy to ingest. Subscribed!

  • @smoideen
    @smoideen 8 months ago

    This was a clear and concise presentation. Thank you so much 👍

  • @DevNarayan
    @DevNarayan a year ago +1

    Amazing details about frequently used software. Lucky to bump into this page. Thanks

  • @sakthikumar4721
    @sakthikumar4721 a year ago +2

    I really appreciate your work. Excellent video. Superbly Articulated. Easy to grab the concepts. Great work. 😍

  • @yaramvenkateswarluchowdary1020
    @yaramvenkateswarluchowdary1020 a year ago

    The content is simple and crisp... thanks for bringing this to us...

  • @ANSURAJKHADANGA
    @ANSURAJKHADANGA a year ago

    After going through the video and your explanation, I have decided to take a paid subscription to ByteByteGo! Your explanations are to the point and succinct, which makes a topic easy to understand! Thank you for the video.

  • @AungBaw
    @AungBaw a year ago

    Short & sweet. Thank you.

  • @TheAceEditor
    @TheAceEditor a year ago

    Essential collection of videos in this channel for a software developer

  • @abdulelahaljeffery6234
    @abdulelahaljeffery6234 a year ago

    WOW, amazing stuff

  • @mirzasohailhussain
    @mirzasohailhussain 11 months ago

    Thank you so much!!! I had this question in my mind and you explained it in a very easy way!!!

  • @nicklaspillay7923
    @nicklaspillay7923 a year ago +1

    This is an amazing video.
    Actually putting it out there - I LIKED AND SUBBED!
    Well deserved for great content 💯

  • @prathibavijayasekaran4173
    @prathibavijayasekaran4173 a year ago

    Very simple with good animation to explain things clearly. Keep publishing these kinds of useful videos.

  • @patrickdee7365
    @patrickdee7365 a year ago +1

    Very cool channel you keep the most important stuff compact, not everyone can do that.

  • @tomok284
    @tomok284 a year ago +1

    Such good content in just 5 minutes!

  • @vaibhavtyagi3805
    @vaibhavtyagi3805 a year ago

    No doubt this will be trending among the top YouTube channels on system design worldwide. Great start.

  • @aayushgupta1186
    @aayushgupta1186 a year ago

    Amazing content! Keep posting such videos, it's a great help!!!

  • @jamess5330
    @jamess5330 a year ago

    Thank you for putting up this tutorial! Studying videos like this and then practicing at Meetapro with mock interviews will help you land multiple offers.

  • @DotDager
    @DotDager 9 months ago

    First time I actually WANT to subscribe to a newsletter.

  • @betims
    @betims a year ago +1

    Amazing explanation. Thank you sir.

  • @fahmidamiah
    @fahmidamiah a year ago +1

    Really loved this. Thank you.

  • @gopalsv5230
    @gopalsv5230 a year ago

    Nice intro about Kafka, learned quickly, now you got a new subscriber 👍

  • @joross8
    @joross8 a year ago +1

    Awesome video. Looking forward to the next one.

  • @distrologic2925
    @distrologic2925 a year ago

    The "2" on cue was amazing

  • @safiuzkhan5463
    @safiuzkhan5463 a year ago +1

    Very beautifully explained 👌

  • @GughaGSrinivasan
    @GughaGSrinivasan a year ago +5

    ASMR experience :)
    i have subscribed...
    Neat explanations...
    I am not curious about Kafka, but curious about the optimization techniques and strategies they have accomplished which I would like to learn...
    Please do more!

  • @AmanSingh-em7qc
    @AmanSingh-em7qc 6 months ago

    Short clear and concise

  • @fokerfakerfuker
    @fokerfakerfuker a year ago

    wow the comments are right. simple and clear... subscribed

  •  a year ago +13

    Great technical explanation. I just want to add that Kafka can be used for much more than just data ingestion sending data from a data source to a data sink. The Apache Kafka open source project also includes Kafka Connect for data integration and Kafka Streams for data processing. Therefore, you can leverage the characteristics explained in this video to build a modern data flow with a single (scalable and reliable) real-time infrastructure instead of combining several different components (like Apache Kafka for ingestion, Apache Camel for data integration, and another stream processing framework like Apache Flink for real-time analytics).

    • @EverydayRoadster
      @EverydayRoadster a year ago +2

      The reliability of Kafka has yet to be proven. Every so often it does not meet core data-integration requirements on reliability, especially around disruption and recovery, where it quickly says goodbye to “at-most-once” semantics. Don’t get me wrong, Kafka is really great for what it is designed for: efficient streaming in a big data architecture. But that architecture tolerates a certain fuzziness of data, which a pure data integration architecture would not allow.

  • @lytung1532
    @lytung1532 a year ago +2

    The tutorial is useful. Thanks for sharing. Could you explain further how Kafka enforces sequentiality on the disk? Do we need specialized disks or dedicated settings? As far as I know, a file can be stored in fragments on the disk.

  • @amigochan
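    @amigochan a year ago +9

    The video explains two characteristics that let Apache Kafka move large volumes of records at high throughput:
    1. Sequential I/O
    In C, for example, opening a file in append mode with fopen() places the file pointer at the end of the file, ready to add new data, which is faster than moving the pointer to a specific position before every write. Thinking of it in terms of sequential versus random reads and writes on a hard disk makes this easier to understand.
    File-based databases such as dBASE, COBOL + ISAM, and Paradox also write new records directly at the end of the file. You can confirm this by opening the file in PC-Tools and inspecting the hex dump. The risk is that if the EOL marker is not written in time and the file is not closed cleanly, the file is corrupted and data is lost.
    Deleting a record only marks it as deleted; it is not actually removed until a compact-database operation runs. So when a design requires customers' personal data to be truly erased, I overwrite it with a meaningless string, because a plain delete is only a marker and the data is still there.
    2. [Zero Copy](en.wikipedia.org/wiki/Zero-copy) avoids copying the same data again between different memory regions before moving it, which shortens the transfer path. For example, where DMA is available, a system call can place data that has already been read into the memory buffer directly into the NIC buffer for transmission, skipping the socket buffer.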
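
The append-mode idea in point 1 of the comment above can be sketched in a few lines of Python. This is an illustrative toy, not Kafka's on-disk format: the 4-byte length prefix and the names `append_record` and `read_records` are assumptions made for the example. New records always go to the end of the file, and the reader stops at a torn (partially written) record, which is one simple way to handle the interrupted-write risk the comment mentions.

```python
import os
import struct
import tempfile

HEADER = struct.Struct(">I")  # 4-byte big-endian length prefix (assumed record format)


def append_record(path, payload: bytes) -> int:
    """Append one length-prefixed record and return the offset it starts at."""
    with open(path, "ab") as f:          # append mode: writes always land at the end
        offset = f.seek(0, os.SEEK_END)  # current end of file = start of the new record
        f.write(HEADER.pack(len(payload)))
        f.write(payload)
        f.flush()
        os.fsync(f.fileno())             # ask the OS to persist the record
        return offset


def read_records(path):
    """Scan the file front to back, yielding (offset, payload) pairs."""
    with open(path, "rb") as f:
        while True:
            offset = f.tell()
            header = f.read(HEADER.size)
            if len(header) < HEADER.size:
                return  # clean end of file, or a torn header from an interrupted write
            (length,) = HEADER.unpack(header)
            payload = f.read(length)
            if len(payload) < length:
                return  # torn record: stop rather than yield partial data
            yield offset, payload


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        log = os.path.join(d, "log.bin")
        append_record(log, b"first")
        append_record(log, b"second")
        for off, data in read_records(log):
            print(off, data)
```

Calling `fsync` on every record is the safest and slowest choice; a real system would typically batch writes before flushing.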

  • @kailashkolluru2398
    @kailashkolluru2398 a year ago

    Love this explanation!

  • @geehaf
    @geehaf a year ago

    This is excellent. Thank you. Loved the Redis video too.

  • @jakedeng2288
    @jakedeng2288 a year ago

    amazingly done!

  • @tercioae
    @tercioae a year ago

    Great content! Thank you

  • @DominikRoszkowski
    @DominikRoszkowski a year ago +1

    Nice, that was really clear explanation, thanks a lot!

  • @llambduh
    @llambduh 10 months ago +2

    While sequential access can be efficient for certain tasks, it also has several downsides:
    - Slow Access for Individual Records: If you need to access a specific record in the middle or at the end of a sequentially accessed file or data structure, you would have to traverse through all preceding records. This can be very inefficient and time-consuming, particularly for large datasets.
    - Inefficient Updates and Deletions: If a record in a sequentially accessed file needs to be updated or deleted, you often have to rewrite the entire file, or at least all the data following that record, which can be very slow and inefficient.
    - Inefficient for Concurrent Access: In situations where multiple users or processes need to access data concurrently, sequential access can be very inefficient and may even lead to data corruption if not handled correctly.
    - Lack of Flexibility: Sequential access doesn't allow for as much flexibility in terms of data access patterns. You are essentially restricted to accessing data in the order it was written.
    - Space Inefficiency: Sequential files can become space inefficient over time. If records are deleted, the space they occupied often cannot be reused, leading to wasted space.
    - Data Structure Overhead: In certain data structures optimized for sequential access, such as linked lists, there can be significant overhead in terms of additional pointers or other structural information that needs to be stored along with the actual data.

    Sequential access is particularly useful and efficient in certain scenarios, including:
    - Data Streaming: When data is being streamed from one point to another, such as in audio or video streaming services, sequential access is ideal. Data is read in the order it arrives, and there's usually no need to skip forward or backward.
    - Log Files: Log files are typically written and read in a sequential manner. The most recent events are appended to the end of the log, and when reviewing the logs, it's often most useful to read events in the order they occurred.
    - Backup and Restore Operations: When performing backup operations or restoring data from backups, the data can be processed sequentially. The backup process involves reading all data from a source and writing it to a backup medium, while restore operations read the data from the backup medium and write it back to the source or a new location.
    - Batch Processing: In scenarios where large volumes of data need to be processed in one go, such as overnight processing of transactions, sequential access can be used efficiently.
    - Data Warehousing and Data Mining: In data warehousing and mining operations where huge volumes of data are processed, sequential access is often used.
    - Sequential Read/Write Media: For certain types of media, such as magnetic tapes, sequential access is the only viable method. You read from or write to the tape in a linear fashion, from one end to the other.

    Zero copy is a technique that reduces CPU usage and increases data processing speed by eliminating unnecessary data copying between user space and kernel space during network communication or file I/O operations. The data to be sent over the network goes directly from the disk buffer cache to the network buffer without being copied through user space.

    Pros:
    - Increased Efficiency: Zero-copy can significantly speed up data transfer rates because it removes the overhead of copying data between user and kernel space.
    - Reduced CPU Usage: As there's no need to copy data, zero-copy methods can reduce CPU usage, freeing up resources for other tasks.
    - Reduced Memory Usage: Zero-copy techniques can lead to less memory usage because they avoid creating extra copies of data in memory.
    - Lower Latency: By avoiding the overhead of data copying, zero-copy can lead to lower latency in network communication or file I/O operations.

    Cons:
    - Complexity: Implementing zero-copy can be complex and may require a deep understanding of the operating system and network interfaces. This can increase development time and potentially introduce more bugs.
    - Data Security: With zero-copy schemes that share or map buffers between the kernel and user space, data in kernel buffers can be directly visible to user code. This could potentially lead to security vulnerabilities if not managed correctly.
    - Buffer Availability: Zero-copy can lead to buffers being locked for longer periods, as the same buffer is used for reading data from the disk and sending it over the network. This could potentially impact other tasks that need to use these buffers.
    - Non-Contiguous Memory Issues: If data is stored non-contiguously in memory, zero-copy can be challenging to implement effectively.

    The decision to use zero-copy would largely depend on the specific needs of the system and whether the benefits of increased data transfer speed, reduced CPU usage, and lower memory footprint outweigh the increased complexity and potential risks.
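
To make the contrast in the comment above concrete, here is a minimal Python sketch of the two send paths. It is an illustration under stated assumptions, not Kafka's code: Kafka runs on the JVM, and its documentation describes using `sendfile` through Java's `FileChannel.transferTo`. The helper names `send_with_copies`, `send_zero_copy`, and `demo` are hypothetical. Python's `socket.sendfile()` uses `os.sendfile()` where the platform supports it and otherwise falls back to ordinary `send()` calls, so whether a given run is truly zero-copy depends on the operating system.

```python
import os
import socket
import tempfile
import threading


def send_with_copies(sock, path, chunk=64 * 1024):
    """Traditional path: read() copies data into user space, sendall() copies it back into the kernel."""
    sent = 0
    with open(path, "rb") as f:
        while True:
            data = f.read(chunk)
            if not data:
                return sent
            sock.sendall(data)
            sent += len(data)


def send_zero_copy(sock, path):
    """Hand the file to the kernel; with os.sendfile() the data need not enter user space."""
    with open(path, "rb") as f:
        return sock.sendfile(f)


def demo(sender, size=256 * 1024):
    """Send a temporary file of `size` bytes over a local socket pair; return (sent, received)."""
    left, right = socket.socketpair()
    received = bytearray()

    def drain():
        # Read until the sending side closes its end of the pair.
        while True:
            chunk = right.recv(65536)
            if not chunk:
                break
            received.extend(chunk)

    reader = threading.Thread(target=drain)
    reader.start()
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(os.urandom(size))
        path = tmp.name
    try:
        sent = sender(left, path)
    finally:
        left.close()
        reader.join()
        right.close()
        os.unlink(path)
    return sent, len(received)


if __name__ == "__main__":
    print(demo(send_with_copies))
    print(demo(send_zero_copy))
```

Both functions deliver the same bytes; the difference is only in how many times the data is copied on the way, which is what the pros and cons above are weighing.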