Big data in Airflow? Discover the XCom Backends with AWS S3!

  • Published on Sep 5, 2024
  • Big data in Airflow? Discover the XCom Backends!
    👍 Smash the like button to become an Airflow Super Hero!
    ❤️ Subscribe to my channel to become a master of Airflow
    🏆 Take my course : www.udemy.com/... to join the legends of Airflow
    🚨 My Patreon: / marclamberti to support my work and be friends for life
    The materials:
    www.notion.so/...
    How to run Airflow locally with Docker
    • Running Airflow 2.0 wi...
    All you need about XComs:
    marclamberti.c...
    The TaskFlow API or the NEW WAY of creating DAGs:
    • TaskFlow API in Airflo...
    When you begin with Apache Airflow, one of the first questions you ask yourself is:
    How to share data between tasks in Airflow?
    And for that, there is a mechanism called XComs.
    Sharing data with XComs is simple. You just have to call two methods, xcom_push and xcom_pull. The data you want to share will be encapsulated in an XCom object. But there is a problem...
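As a rough illustration of the two calls named above, here is a minimal sketch. The task functions `extract` and `process`, the key `row_count`, and the `FakeTaskInstance` class are all hypothetical: the fake only mimics the `xcom_push` / `xcom_pull` methods of Airflow's task instance so the snippet runs without an Airflow installation. In a real DAG, Airflow passes the task instance (`ti`) to your callable.

```python
# Hypothetical task callables. In a real DAG each one would be passed to a
# PythonOperator as python_callable, and Airflow would supply `ti`.

def extract(ti):
    # Push a small value under an explicit key.
    ti.xcom_push(key="row_count", value=42)


def process(ti):
    # Pull the value pushed by the task whose task_id is "extract".
    row_count = ti.xcom_pull(task_ids="extract", key="row_count")
    return row_count * 2


class FakeTaskInstance:
    """In-memory stand-in for Airflow's task instance (assumption, local runs only)."""

    _store = {}  # shared by all instances, playing the role of the metadata database

    def __init__(self, task_id):
        self.task_id = task_id

    def xcom_push(self, key, value):
        FakeTaskInstance._store[(self.task_id, key)] = value

    def xcom_pull(self, task_ids, key="return_value"):
        return FakeTaskInstance._store.get((task_ids, key))


if __name__ == "__main__":
    extract(FakeTaskInstance("extract"))
    print(process(FakeTaskInstance("process")))
```

The value shared here is deliberately tiny, which is the kind of payload XComs handle comfortably.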
    XComs are limited in size. The maximum size depends on the metadata database you use, so an XCom that fits in one database may not fit in another.
    To solve this, Airflow 2.0 introduced a new feature...
    The XCom Backends!
    Apache Airflow allows you to define your own XCom backend, which means that instead of storing your XComs in the metadata database of Airflow, you can store them in any storage system you want.
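The idea can be sketched as follows. This is not necessarily the code from the video: it is a simplified illustration in which an in-memory `FakeBucket` (hypothetical) stands in for an S3 bucket, and the `xcom_s3://` prefix and key naming are assumptions. In a real deployment, the two functions would be the `serialize_value` and `deserialize_value` static methods of a class extending `airflow.models.xcom.BaseXCom`, and the uploads and downloads would go through an S3 client.

```python
import json
import uuid

PREFIX = "xcom_s3://"  # hypothetical marker identifying a reference to external storage


class FakeBucket:
    """In-memory stand-in for an S3 bucket (assumption, so the sketch runs anywhere)."""

    def __init__(self):
        self._objects = {}

    def put(self, key, body):
        self._objects[key] = body

    def get(self, key):
        return self._objects[key]


BUCKET = FakeBucket()


def serialize_value(value):
    """Write the payload to external storage and return a short reference.

    Only this reference string would end up in the metadata database.
    """
    key = f"data_{uuid.uuid4()}.json"
    BUCKET.put(key, json.dumps(value).encode("utf-8"))
    return PREFIX + key


def deserialize_value(reference):
    """Resolve a reference produced by serialize_value back into the payload."""
    key = reference[len(PREFIX):]
    return json.loads(BUCKET.get(key).decode("utf-8"))


if __name__ == "__main__":
    rows = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]
    ref = serialize_value(rows)
    print(ref.startswith(PREFIX), deserialize_value(ref) == rows)
```

Airflow picks up a custom backend class through the `xcom_backend` option in the `[core]` section of its configuration, which can also be set with the `AIRFLOW__CORE__XCOM_BACKEND` environment variable pointing at the import path of your class.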
    In this video, you are going to discover how to set up your own XCom Backend with AWS S3!
    Enjoy!

Comments • 16

  • @darsh_shukla 3 years ago +4

    Marc you always surprise me. 🙏

  • @MrIgorbpf 3 years ago +1

    Great video man!!! Thanks a lot! I've been struggling to deploy airflow 2 on AWS ECS. Can you make a video of it? I believe other people will enjoy because deploying airflow is tough!

  • @manqobadlamini2207 1 year ago

    Brilliant video as always! Thanks.

  • @AbdurrahmanKocukcu 3 years ago +3

    Awesome :)

  • @candyskullxoxo4660 3 months ago

    Hi, I love your videos. Can you show how to integrate a MinIO XCom backend in Airflow running on Kubernetes? Do I need to modify the Pod executor?

  • @MrQsam 3 years ago

    very useful! ty!

  • @pavelpetkun5269 2 years ago

    Thank you very much!

  • @sunilpotu6028 2 years ago +1

    How can we use XCom between two SparkOperators? My Spark job generates a value in a Python file, and I want to use that value in the second SparkOperator. How can I push a value to XCom from spark-submit?

  • @luiztauffer8513 2 years ago

    thanks for the great content!
    wouldn't it be safer to store the secrets as ENV variables and retrieve them when needed?

  • @fahadshoaib8735 1 year ago

    Hey man, nice video, but I have one question: when the extract task shares the DataFrame with the process task, the DataFrame becomes None and we are not able to perform any pandas operations on it. What's the use of sharing info between tasks if it is not usable?

  • @chetansurwade 2 years ago

    Low or apparently no documentation for Azure Blob and GCS custom xcoms.

  • @shubham_chourasia 2 years ago

    How to achieve this in AWS MWAA environment?

  • @murugangan6817 2 years ago

    Xcom with mssql

  • @Arnob_111 1 year ago

    You skip some things in your video. For example, while creating an IAM user, when you needed to create a new policy, you opened a new browser tab for IAM, created the policy, then came back to your user and attached the policy. These little things may look unimportant, but skipping them makes it hard to follow you.

  • @SoumilShah 3 years ago

    I sent you messages on LinkedIn through email
    I tried contacting you but got no reply. Sad to say, all you are doing is just for views.