Ask Databricks about Spark Structured Streaming with Simon Whiteley and Ray Zhu.

  • Published on 7 Jan 2025

Comments

  • @shikokas 1 year ago +9

    I think these sessions could be much more beneficial if, instead of saying "makes sense" to everything, you asked the DBX representatives some hard questions. You also tend to agree with them on using Auto Loader and DLT, but you don't mention that these are proprietary solutions that cost more and lock you in to Databricks, which goes against their own open Lakehouse paradigm (help push them to open-source them). Other hard questions that were not asked: 1. DLT does not support managed schemas/catalogs/tables in UC; did you know that? 2. Have you seen the endless list of limitations for DLT when working with UC? 3. Did you know that on any schema change in Kafka you need to restart the Spark stream for the change to take effect? 4. You haven't addressed the fact that a change in a streaming pipeline often causes all downstream streams to start failing (and how to address that).

    • @eldardragomir6705 1 year ago

      I agree. I expected more from this interview; instead, the talk was just about selling UC/DLT to those who haven't tried it yet.

  • @ByteNinja-YT 1 year ago

    Hi, I still have a question, if possible. What is the best way to compute lagged values when using Structured Streaming? Because of how streaming works, you will get null values in each new batch.
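
To illustrate the null problem raised in the question above: a lag computed inside a single micro-batch cannot see rows from earlier batches, so the first row per key comes back null. One common way around this is to keep the last value per key as state that survives between batches. The following is a minimal plain-Python sketch of that idea only, not Spark code; the `(key, event_time, value)` row shape and the function name `lag_across_batches` are hypothetical, and it assumes events arrive in order across batches (no late data).

```python
from typing import Dict, Iterable, List, Optional, Tuple

# Hypothetical row shape for illustration: (key, event_time, value)
Row = Tuple[str, int, float]
LaggedRow = Tuple[str, int, float, Optional[float]]


def lag_across_batches(batches: Iterable[List[Row]]) -> List[LaggedRow]:
    """Attach the previous value per key, carrying state between micro-batches."""
    last_seen: Dict[str, float] = {}  # per-key state that outlives each batch
    out: List[LaggedRow] = []
    for batch in batches:
        # Sort within the batch so "previous" means previous in event time.
        for key, ts, value in sorted(batch, key=lambda r: (r[0], r[1])):
            # None only when the key has never been seen in any batch so far.
            out.append((key, ts, value, last_seen.get(key)))
            last_seen[key] = value
    return out


# Two simulated micro-batches; key "a" spans both of them.
batches = [
    [("a", 1, 10.0), ("a", 2, 11.0)],
    [("a", 3, 12.0), ("b", 1, 5.0)],
]
result = lag_across_batches(batches)
for row in result:
    print(row)
```

In Spark itself, the analogous approach would be arbitrary stateful processing (for example `flatMapGroupsWithState` in Scala), where the engine stores the per-key state; whether that is the best option for a given pipeline depends on ordering and late-data requirements that this sketch ignores.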