Redfin Analytics|python ETL pipeline with airflow|Data Engineering Project|Snowpipe|Snowflake|Part 2

  • Published on 28 Sep 2024
  • This is part 2 of the Redfin Real Estate Data Analytics Python ETL data engineering project using Apache Airflow, Snowpipe, Snowflake and AWS services.
    In this project, you will learn how to connect to the Redfin data center data source and extract real estate data using Python. We then transform the data using pandas and load it into an Amazon S3 bucket. The raw data is also loaded into an Amazon S3 bucket.
    As soon as the transformed data lands in the S3 bucket, Snowpipe is triggered and automatically runs a COPY command to load the transformed data into a Snowflake data warehouse table. We then connect Power BI to the Snowflake data warehouse to visualize the data and obtain insights.
    Apache Airflow is used to orchestrate and automate this process.
    Apache Airflow is an open-source platform for orchestrating and scheduling workflows and data pipelines. We install Apache Airflow on our EC2 instance to orchestrate the pipeline.
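The extract and transform steps described above can be sketched roughly as follows. This is a minimal illustration, not the code from the video: the sample rows, the column names (`period_begin`, `city`, `median_sale_price`, `homes_sold`) and the function names are all hypothetical, and it uses Python's built-in `csv` module in place of pandas so that it runs without any installation.

```python
import csv
import io
from datetime import datetime

# Hypothetical sample standing in for the Redfin download (tab-separated).
RAW_TSV = (
    "period_begin\tcity\tmedian_sale_price\thomes_sold\n"
    "2023-01-01\tAustin\t450000\t120\n"
    "2023-01-01\tDallas\t\t95\n"
    "2023-02-01\tAustin\t455000\t130\n"
)


def extract(raw_text):
    """Parse the tab-separated download into a list of row dicts."""
    return list(csv.DictReader(io.StringIO(raw_text), delimiter="\t"))


def transform(rows):
    """Drop incomplete rows, cast numeric fields, and split the date."""
    cleaned = []
    for row in rows:
        if any(not value for value in row.values()):
            continue  # skip rows with a missing value
        period = datetime.strptime(row["period_begin"], "%Y-%m-%d")
        cleaned.append({
            "city": row["city"],
            "period_year": period.year,
            "period_month": period.month,
            "median_sale_price": float(row["median_sale_price"]),
            "homes_sold": int(row["homes_sold"]),
        })
    return cleaned


def to_csv(rows):
    """Serialize cleaned rows to CSV text, ready to be uploaded to S3."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


transformed = transform(extract(RAW_TSV))
csv_body = to_csv(transformed)
```

In a pipeline like the one described, each function would typically become its own Airflow task, and `csv_body` would be written to the transformed-data S3 bucket, where Snowpipe picks it up. The raw text would be stored separately and unchanged.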
    Remember the best way to learn data engineering is by doing data engineering - Get your hands dirty!
    If you have any questions or comments, please leave them in the comment section below.
    Please don’t forget to LIKE, SHARE, COMMENT and SUBSCRIBE to our channel for more AWESOME videos.
    Part 1: • Redfin Analytics|pytho...
    *Books I recommend*
    1. Grit: The Power of Passion and Perseverance amzn.to/3EZKSgb
    2. Think and Grow Rich!: The Original Version, Restored and Revised: amzn.to/3Q2K68s
    3. The Book on Rental Property Investing: How to Create Wealth With Intelligent Buy and Hold Real Estate Investing: amzn.to/3LLpXRy
    4. How to Invest in Real Estate: The Ultimate Beginner's Guide to Getting Started: amzn.to/48RbuOb
    5. Introducing Python: Modern Computing in Simple Packages amzn.to/3Q4driR
    6. Python for Data Analysis: Data Wrangling with pandas, NumPy, and Jupyter 3rd Edition: amzn.to/3rGF73G
    **************** Commands used in this video ****************
    Check out my github Repo github.com/Yem...
    **************** USEFUL LINKS ****************
    1. Zillow Data Analytics (RapidAPI) | End-To-End Python ETL Pipeline | Data Engineering Project |Part 1 • Zillow Data Analytics ...
    2. www.redfin.com...
    3. How to Build and Automate loading data from S3 to Snowflake with email notification using airflow • How to Build and Autom...
    4. How to remotely SSH (connect) Visual Studio Code to AWS EC2 • How to remotely SSH (c...
    5. Monitor workflow with slack alert upon DAG failure | Airflow Tutorial • Monitor workflow with ...
    6. How to send out email alert ON RETRY and ON FAILURE in Apache airflow | Airflow Tutorial • How to send out email ...
    7. How to build and automate a python ETL pipeline with airflow on AWS EC2 | Data Engineering Project • How to build and autom...
    8. docs.snowflake...
    9. docs.snowflake...
    10. docs.snowflake...
    11. docs.snowflake...
    12. docs.snowflake...
    13. airflow.apache...
    14. Customer Churn Data Analytics|Data Pipeline using Apache Airflow, Glue, S3, Redshift, PowerBI | Part 3 • Customer Churn Data An...
    15. How to build a pipeline to create table and insert records on snowflake with airflow on AWS EC2 • How to build a pipelin...
    16. docs.snowflake...
    17. PostgreSQL Playlist: • Tutorial 1 - What is D...
    18. Apache Airflow Playlist • How to build and autom...
    19. Download PowerBI www.microsoft....
    DISCLAIMER: This video and description contain affiliate links. This means that when you buy through one of these links, we receive a small commission at no cost to you. This helps support us in continuing to make awesome and valuable content for you.

Comments • 33

  • @JayRavalani
    @JayRavalani 6 months ago

    Hi, absolutely loved this project, will definitely try to use another dataset and replicate this on my own. I have a question though. Which books/resources would you recommend for learning more about Airflow and Snowflake? I would really like to improve my grasp of these two domains.

  • @nitindatta6225
    @nitindatta6225 11 months ago +2

    Just binge-watched both parts. Very well explained.
    Could you please make a video on doing some machine learning on these big datasets?
    Thanks

    • @tuplespectra
      @tuplespectra  11 months ago

      Great suggestion! And thanks for your comment.

  • @Gowthahiradyakil
    @Gowthahiradyakil 11 months ago

    I have one request: can you please make videos using large files?

  • @DavidDalenzCárcamo-f5g
    @DavidDalenzCárcamo-f5g 8 months ago +1

    Thank you for these videos, bro!!

    • @tuplespectra
      @tuplespectra  8 months ago

      My pleasure!

  • @Gowthahiradyakil
    @Gowthahiradyakil 11 months ago

    Do you have any idea how to unzip a file in S3 using Airflow? With many easy use cases, please, please, please, brother.

    • @tuplespectra
      @tuplespectra  11 months ago

      You can create a task in Airflow using the BashOperator, where you write an AWS CLI command to download the file and then unzip it locally. But make sure you have awscli installed and configured with the correct credentials. A quick ask from ChatGPT gave this command: "aws s3 cp s3://your-bucket/path/to/your/file.zip ./file.zip && unzip file.zip -d ./output_directory". You can give it a try.

    • @Gowthahiradyakil
      @Gowthahiradyakil 11 months ago

      Yes bro, I tried this but it's not working.

    • @Gowthahiradyakil
      @Gowthahiradyakil 11 months ago

      I have installed the AWS CLI and it is working for the copy file command.

    • @Gowthahiradyakil
      @Gowthahiradyakil 11 months ago

      But unzip is not working.

    • @akj3344
      @akj3344 11 months ago

      @Gowthahiradyakil What do you mean by not working? What happens when you try the command in his comment? Also, this is not messaging. Stop posting multiple comments for a single query and form a coherent sentence.
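An alternative to shelling out to the AWS CLI is to do the unzip in Python, for example from an Airflow Python task. The following is a minimal sketch under stated assumptions: the function name `unzip_s3_object` and its parameters are hypothetical, the client is expected to behave like a boto3 S3 client (only `get_object` and `put_object` are used), and the whole archive is held in memory, so it suits small files only.

```python
import io
import zipfile


def unzip_s3_object(s3_client, bucket, zip_key, dest_prefix):
    """Download a zip from S3, extract it in memory, upload each member.

    Returns the list of S3 keys that were written.
    """
    response = s3_client.get_object(Bucket=bucket, Key=zip_key)
    archive_bytes = response["Body"].read()

    uploaded_keys = []
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        for member in archive.infolist():
            if member.is_dir():
                continue  # directory entries carry no data
            dest_key = f"{dest_prefix.rstrip('/')}/{member.filename}"
            s3_client.put_object(
                Bucket=bucket, Key=dest_key, Body=archive.read(member)
            )
            uploaded_keys.append(dest_key)
    return uploaded_keys
```

With boto3 installed and credentials configured, the client would typically be created with `boto3.client("s3")` and passed in as `s3_client`. For large archives, downloading to local disk and extracting there is likely the safer choice.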

  • @luiscamilofranco8238
    @luiscamilofranco8238 11 months ago

    Wow! Thank you very much! Very nice channel. Well explained with amazing projects.

    • @tuplespectra
      @tuplespectra  11 months ago

      Glad you like them! Thanks so much.

  • @aydemir4988
    @aydemir4988 11 months ago

    Another comment. I'm seeing a lot of job postings for Microsoft Azure. Do you plan to make some projects on that side of data engineering as well? And can you explain, in a separate video, the differences between AWS, GCP and Azure and where Snowflake sits among them? Thanks

    • @tuplespectra
      @tuplespectra  11 months ago +1

      Yes, we will be exploring Microsoft Azure very soon. Stay tuned! Thanks.

  • @aydemir4988
    @aydemir4988 11 months ago

    Another great video, thanks!! I have a request: could you build a data validation project on data from APIs or a database, deploy it on AWS, and visualize/report the results? Thanks

    • @tuplespectra
      @tuplespectra  11 months ago +1

      Thanks so much for the comment. Request noted.

  • @errrbrrr3821
    @errrbrrr3821 11 months ago +1

    If you create courses related to data engineering topics like data warehousing and big data, I would like to join.

    • @tuplespectra
      @tuplespectra  11 months ago +1

      Thanks so much for the comment. This means a lot to me. We will work on that. Thanks!

  • @nnnn56166
    @nnnn56166 11 months ago

    Thank you for the interesting project 😁
    PS. No problem connecting to Power BI, haha 😂

    • @tuplespectra
      @tuplespectra  11 months ago

      Awesome, no issue with PBI this time. Thanks for your comment.

  • @errrbrrr3821
    @errrbrrr3821 11 months ago +2

    Very informative and useful project!

  • @Gowthahiradyakil
    @Gowthahiradyakil 11 months ago +1

    You are a legend to Indians. Your project is very informative.

    • @tuplespectra
      @tuplespectra  11 months ago

      Thanks for your comment!

  • @kehindefagbohungbe
    @kehindefagbohungbe 3 months ago

    Very detailed!

  • @lesa7p2lmansion
    @lesa7p2lmansion 10 months ago

    Great showcase :))) Can you include dbt as a transformation tool in your future projects?

    • @tuplespectra
      @tuplespectra  10 months ago

      Thanks so much. Yes, sure. Thanks for your comment.

  • @oyekanemmanuel5636
    @oyekanemmanuel5636 11 months ago

    Seconded. I would be very interested in your data warehousing course, sir.

    • @tuplespectra
      @tuplespectra  11 months ago

      We will work on that. Thanks so much.