The new modern data stack Airbyte Airflow DBT

  • Published on 3 Jan 2025

Comments • 14

  • @memento9979 2 years ago +1

    10:25 Any chance we can get a link to that GitHub repo? I'd like to test out this stack with DBT. Thanks in advance ;)

  • @navinsai5726 2 years ago +7

    It would have been better if the demo had included how to move data pipelines between environments such as dev, QA, and production. Most videos just show a sample product demo, but in real life there are a lot of things to consider: CI/CD between environments, managing data pipelines, setting up different environments, project structure and folder structure, etc.

    • @Rene-tu3fc 2 years ago +1

      Agree. So many data talks are essentially "Hello, World!" examples that don't fully show the actual problems being solved. And 18 minutes is too short for anything more than that, sadly.

  • @ZachRenwickData 3 years ago +1

    Very cool! Can't wait to try it out on a project.

  • @ariyokabir2285 3 years ago +2

    A nice tool to start with, but there are a lot of modifications that need to be done:
    1. The changing of date columns to a string data type at the destination end.
    2. Handling of updated columns.
    3. Naming each connection.

  • @RealityIsNot 3 years ago

    Loving this already!!

  • @amjds1341 2 years ago +3

    Link to the GitHub repo for Airbyte?

  • @emrea6799 2 years ago +2

    The "behind the scenes" Airflow configuration would have been the most interesting part of the talk. But thanks for sharing.

  • @datasleek7950 2 years ago

    Bravo Michel

  • @fredt3727 2 years ago +6

    In reality, over the last 20 years, ETL wasn't really ETL anymore.
    This is because many data & BI teams were extracting their data into a transactional ODS first, and subsequently aggregating & transforming that into datamarts.
    So in effect, many 'traditional' data pipelines of the last 20 years or so were already ELT, whereby the biggest chunk of the transformations was performed downstream of the ODS, not upstream of it (in situations where the ODS was a physical representation of the source, which is the easiest method to implement).
    Granted, with cloud-based data warehouses, and the need to integrate on-premise data with cloud-based SaaS application data, decoupling the extractions from the transformations has become inevitable due to the explosion of data and the fact that it is often based in different data centres and sometimes regions.
    However, pragmatic design practices that followed an ODS-first design pattern were often already advocating that.
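
The ODS-first pattern described in this comment can be illustrated with a minimal sketch. It assumes an in-memory SQLite database standing in for the warehouse; the table names (`ods_orders`, `mart_daily_sales`) and the sample rows are hypothetical and do not come from the video. The point is only that the load step copies source rows unchanged, and the aggregation happens afterwards, inside the database.

```python
import sqlite3

# Hypothetical source rows standing in for an operational system's orders table.
SOURCE_ORDERS = [
    (1, "alice", "2024-01-05", 120.0),
    (2, "bob", "2024-01-05", 80.0),
    (3, "alice", "2024-01-06", 50.0),
]


def load_ods(conn, rows):
    """Extract + Load: copy the source rows into the ODS unchanged."""
    conn.execute(
        "CREATE TABLE ods_orders ("
        "order_id INTEGER PRIMARY KEY, customer TEXT, order_date TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO ods_orders VALUES (?, ?, ?, ?)", rows)


def build_datamart(conn):
    """Transform: aggregate downstream of the ODS, using SQL in the database."""
    conn.execute(
        """
        CREATE TABLE mart_daily_sales AS
        SELECT order_date, COUNT(*) AS order_count, SUM(amount) AS total_amount
        FROM ods_orders
        GROUP BY order_date
        """
    )


def run_pipeline(rows):
    """Run load then transform, and return the datamart rows ordered by date."""
    conn = sqlite3.connect(":memory:")
    try:
        load_ods(conn, rows)
        build_datamart(conn)
        return conn.execute(
            "SELECT order_date, order_count, total_amount "
            "FROM mart_daily_sales ORDER BY order_date"
        ).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in run_pipeline(SOURCE_ORDERS):
        print(row)
```

Whether the load step is performed by a classic ETL tool or by a dedicated extract-and-load tool does not change the shape of this sketch, which is the commenter's argument.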

  • @logulokesh2651 1 year ago

    Why do we need Airflow in your demo when scheduling can be done in Airbyte itself?

  • @miguelsanto1663 2 years ago

    Hello, I tried to replicate the example, but I can't.
    I use S3 as the data source and S3 as the destination, but it only lets me replicate the data, not transform and normalize it.
    Can you tell me the reason for that?
    Thank you.

  • @fredt3727 2 years ago +3

    How is extracting data into a bunch of JSON blob objects actually helping anyone? Especially when you have to "normalise" it on top of that to be able to use more SQL-native reporting tools? Or even transform it further with DBT?
    I am not seeing any "paradigm" shift here compared to traditional ETL tools and the use of a "staging" area in most data warehouses, where the raw extracted data is stored prior to normalisation into an ODS layer.
    Not saying the Airbyte technology doesn't have its merit: it is open source and looks very cloud-aware.
    Just pointing out that the so-called paradigm shift between ELT and ETL doesn't stand up to scrutiny.
    Most data warehouses over the last 20 years were actually already built using the same ELT design pattern that is described here, and most were built with... ETL tools.
    The tool doesn't force you to follow a certain pattern; you can do ELT with... an ETL tool.
    "ETL" is just an acronym and a label: those traditional ETL tools are capable of anything really (which is often their pitfall, too many ways to skin the data-cat).

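The staging-then-normalise flow that this comment compares to a traditional staging area can be sketched as follows. This is only an illustration under stated assumptions: an in-memory SQLite database stands in for the warehouse, and the table names (`staging_raw_users`, `users`), the field names, and the sample blobs are hypothetical. It is not Airbyte's actual normalisation logic.

```python
import json
import sqlite3

# Hypothetical raw records as an extract step might land them: one JSON blob per row.
RAW_BLOBS = [
    '{"id": 1, "name": "alice", "signup_date": "2024-01-05"}',
    '{"id": 2, "name": "bob", "signup_date": "2024-02-10"}',
]


def land_raw(conn, blobs):
    """Staging: store each extracted record as an untyped JSON text blob."""
    conn.execute("CREATE TABLE staging_raw_users (data TEXT)")
    conn.executemany(
        "INSERT INTO staging_raw_users VALUES (?)", [(blob,) for blob in blobs]
    )


def normalise(conn):
    """Normalisation: unpack each blob into typed columns that SQL tools can query."""
    conn.execute(
        "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, signup_date TEXT)"
    )
    rows = conn.execute("SELECT data FROM staging_raw_users").fetchall()
    records = [json.loads(data) for (data,) in rows]
    conn.executemany("INSERT INTO users VALUES (:id, :name, :signup_date)", records)


def run_normalisation(blobs):
    """Land the blobs, normalise them, and return the typed rows ordered by id."""
    conn = sqlite3.connect(":memory:")
    try:
        land_raw(conn, blobs)
        normalise(conn)
        return conn.execute(
            "SELECT id, name, signup_date FROM users ORDER BY id"
        ).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in run_normalisation(RAW_BLOBS):
        print(row)
```

The raw table here plays the role of the "staging" area the comment mentions, and the typed table plays the role of the normalised layer that reporting tools or further DBT models would read from.
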
  • @hasancemreok2597 5 months ago

    You really don't want us to use custom transformations on Airbyte. You put DBT in the video's title and gave it a separate slide, but you spend just one short sentence on it in the whole video.
    There's nothing about what you transformed or how you transformed it. Interesting.
    Btw, the video might be 2 years old, but my feelings are quite new.