How to build and automate a Python ETL pipeline with Airflow on AWS EC2 | Data Engineering Project

  • Published on 2 Jun 2024
  • In this data engineering project, we will learn how to build and automate an ETL process that extracts current weather data from the OpenWeatherMap API, transforms the data, and loads it into an S3 bucket using Apache Airflow. Apache Airflow is an open-source platform for orchestrating and scheduling workflows of tasks and data pipelines. This project is carried out entirely on the AWS cloud platform.
    We will cover the fundamental concepts of Apache Airflow, such as DAGs and operators, and I will show you how to install Apache Airflow from scratch and schedule your ETL pipeline. I will also show you how to use a sensor in your ETL pipeline.
    As this is a hands-on project, I highly encourage you to first watch the video in its entirety without following along, so that you can better understand the concepts and the workflow. After that, either try to replicate the example on your own, consulting the video only when you are stuck, or watch the video a second time in its entirety while following along.
    Remember, the best way to learn is by doing it yourself. Get your hands dirty!
    If you have any questions or comments, feel free to leave them in the comment section below.
    Books I recommend
    1. Grit: The Power of Passion and Perseverance amzn.to/3EZKSgb
    2. Think and Grow Rich!: The Original Version, Restored and Revised: amzn.to/3Q2K68s
    3. The Book on Rental Property Investing: How to Create Wealth With Intelligent Buy and Hold Real Estate Investing: amzn.to/3LLpXRy
    4. How to Invest in Real Estate: The Ultimate Beginner's Guide to Getting Started: amzn.to/48RbuOb
    5. Introducing Python: Modern Computing in Simple Packages amzn.to/3Q4driR
    6. Python for Data Analysis: Data Wrangling with pandas, NumPy, and Jupyter 3rd Edition: amzn.to/3rGF73G
    **************** Commands used in this video ****************
    sudo apt update
    sudo apt install python3-pip
    sudo apt install python3.10-venv
    python3 -m venv airflow_venv
    sudo pip install pandas
    sudo pip install s3fs
    sudo pip install apache-airflow
    airflow standalone
    sudo apt install awscli
    aws configure
    aws sts get-session-token
    **************** USEFUL LINKS ****************
    Extract current weather data from Open Weather Map API using python on AWS EC2: • Extract current weathe...
    How to remotely SSH (connect) Visual Studio Code to AWS EC2: • How to remotely SSH (c...
    PostgreSQL Playlist: • Tutorial 1 - What is D...
    Weather Map API: openweathermap.org/api
    Github Repo: github.com/YemiOla/data_engin...
    Please don’t forget to LIKE, SHARE, COMMENT and SUBSCRIBE to our channel for more AWESOME videos.
    DISCLAIMER: This video and description have affiliate links. This means that when you buy through one of these links, we receive a small commission at no cost to you. This helps support us to continue making awesome and valuable content for you.
  • Science & Technology
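
The transform step of the ETL flow described above can be sketched roughly as follows. This is only an illustration, not the code from the video: the function names, the sample payload, and the choice of Fahrenheit as the output unit are assumptions, and the field names follow the shape of an OpenWeatherMap current-weather response (which reports temperature in Kelvin by default).

```python
from datetime import datetime, timezone


def kelvin_to_fahrenheit(temp_k: float) -> float:
    """Convert a temperature from Kelvin to degrees Fahrenheit."""
    return (temp_k - 273.15) * 9 / 5 + 32


def transform_weather(raw: dict) -> dict:
    """Flatten one current-weather response into a single CSV-friendly row."""
    return {
        "city": raw["name"],
        "description": raw["weather"][0]["description"],
        "temperature_f": round(kelvin_to_fahrenheit(raw["main"]["temp"]), 2),
        "humidity": raw["main"]["humidity"],
        "time_of_record": datetime.fromtimestamp(raw["dt"], tz=timezone.utc).isoformat(),
    }


# Hypothetical sample payload, trimmed to the fields used above.
sample = {
    "name": "Portland",
    "weather": [{"description": "light rain"}],
    "main": {"temp": 300.15, "humidity": 70},
    "dt": 1700000000,
}

row = transform_weather(sample)
print(row)
```

In a pipeline like the one described, a row of this kind would then be written out as a CSV file to the S3 bucket by the load step.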

Comments • 241

  • @rex2758 11 months ago +3

    Thanks for taking the time to talk this video out!

  • @sachasmart7139 11 months ago +4

    Really good tutorial. Nicely done. Looking forward to part 2!

    • @tuplespectra 11 months ago +1

      Thanks! I'm glad you like it!

  • @mdobaidullahal-faruk3457 3 days ago

    Thank you very much for making the concepts so easy to understand👌

  • @gyungyoonpark 4 months ago

    this video is so clear and helpful. there are many airflow courses, but this video goes beyond and helps you "practice" airflow. hats off to the master and look forward to more awesome videos!!!

  • @murilloalves14 10 months ago +7

    This was just what I was looking for! Now it's time to apply it to my own projects. Keep up the good work! Big thank you from Brazil!

    • @tuplespectra 10 months ago

      Thank you! I'm glad you like the video. It's always a good idea to apply what you learn in another project. Good luck!

  • @hellowillow_YT 5 months ago +1

    I've been looking for ETL project videos that I can follow to learn basic data engineering stuff, and finally I found your video! Thank you for this!

    • @tuplespectra 5 months ago

      Awesome, thank you! I'm glad it was helpful. Please go ahead to explore other videos to take your skill to the next level.

  • @Zelinity 4 months ago +7

    This video has the signature of a master teacher. You introduced key concepts in a way that is simple to understand. Thank you for starting from level 1 without any assumptions of what we viewers/learners bring to the subject.

    • @tuplespectra 4 months ago

      Thanks so much for this comment. It really means a lot to me. I'm glad you found it valuable.

    • @prabhatgupta6415 4 months ago

      @tuplespectra Please bring more end-to-end content on Databricks as well.

  • @shwetabhat9981 11 months ago

    Amazing !! Looking forward to many more

    • @tuplespectra 11 months ago

      I have just released a second part to this where I showed tasks running in parallel. See link here: th-cam.com/video/DKsf88oCPWA/w-d-xo.html

  • @ahmedayodele3299 11 months ago

    This tutorial is great. Looking forward to more videos. You got a follower.

    • @tuplespectra 11 months ago +1

      I'm glad you love it! Thank you! And thanks for subscribing. You can also explore our playlist of PostgreSQL videos.

  • @ahm_mask5161 9 months ago

    Your channel is pure gold

    • @tuplespectra 9 months ago

      Thanks so much. I'm glad you find our videos valuable. We hope to continue to provide you with more awesome instructional videos.

  • @vaibhavpawar3403 5 months ago +4

    A very detailed, basic tutorial with actual recorded hands-on work. No PowerPoint slides, simply basic teaching, which is very helpful for data engineers.

    • @tuplespectra 5 months ago

      Glad it was helpful! Thanks so much for your comment.

  • @nnamdinwafor 3 months ago

    After watching this video, I knew I had to thank you for this truly awesome video. I have learnt more from this video than from many others out there. You are amazing.

    • @tuplespectra 3 months ago

      Great to hear! Thanks so much for this comment. It means a lot to me.

  • @mahmoudferhat8031 2 months ago

    simple and clearly explained, THANKS !!

    • @tuplespectra 2 months ago +1

      Thanks for your comment. I'm glad you found the video helpful.

  • @user-mr6gg5gv3o 11 months ago +2

    Congratulations, good job. Don't forget part 2.

    • @tuplespectra 11 months ago

      Second part coming out soon.

  • @hq1n 3 months ago +3

    idk why you are not at least as hype as Zach Wilson. thank you very much for giving out high quality content for free!

    • @tuplespectra 3 months ago

      Thanks so much for your comment. I'm glad you found the video helpful.

  • @ndrimenan624 4 months ago +1

    Thanks for this wonderful tuto. It’s time for me to practice now 🙏🏽

    • @tuplespectra 4 months ago

      You’re welcome 😊 . Yes! Practice!

  • @flaskwater44 7 months ago

    Great tutorial! Thank You.

    • @tuplespectra 7 months ago

      Glad it was helpful! Thanks for your comment.

  • @nehalverma1444 11 months ago +1

    This is awesome! Thanks a lot

  • @seth_king_codes 9 months ago +2

    amazing man...would love more airflow/dags/python tutorials and also maybe how i can use with scraping data..cheers!

    • @tuplespectra 9 months ago +1

      Thanks! We have more videos on airflow you can explore. We will look into your request as well. Thanks!

  • @Chandu_Art 6 months ago

    Being an aspiring data engineer, this project is really helpful . Thank you so much for this content :)

    • @tuplespectra 5 months ago

      You're very welcome! Thanks for your comment.

    • @sumitrawall 4 months ago

      can it run in free tier EC2 instance?

  • @BambooAlina 3 months ago

    Thank you, this was super helpful!

    • @tuplespectra 2 months ago

      You're so welcome!

  • @SanjeevKumar-dr6qj 11 months ago

    thank you for this tutorial

  • @user-hi3dj6wh7d 3 months ago

    i was initially puzzled or worried if i can grasp all but thanks for this video. this helps to dive in to code with airflow

    • @tuplespectra 3 months ago

      Glad it helped!

  • @kerryw6361 6 months ago +1

    This is really good info! Thank you! One possible area to further advance this video is to upgrade the final task by loading data to an actual database (PostgreSQL for example).

    • @tuplespectra 6 months ago

      Thanks so much for your comment. It means a lot to me. I agree that we can add a final task to load the data into a database. I have done that in other videos: in one we loaded to PostgreSQL, in another to AWS Redshift, and in yet another to Snowflake. Please see my Airflow playlist for several Airflow projects to explore. Thanks so much. th-cam.com/video/DKsf88oCPWA/w-d-xo.html

  • @peterkatongole5984 11 months ago

    Awesome project. Well explained

    • @tuplespectra 11 months ago

      Thank you!

    • @peterkatongole5984 11 months ago

      If all content creators use this method, showcasing their skills by creating projects, they would inspire so many. Thank you tuplespectra once again.

    • @tuplespectra 11 months ago

      @peterkatongole5984 Thanks for the comment.

  • @narasa12 15 days ago +1

    Excellent information, thank you so much for posting this video here

    • @tuplespectra 11 days ago

      Glad it was helpful!

  • @pspointssara8472 4 months ago

    Excellent presentation. Even though I'm an experienced person still I need to learn a lot from your videos. This reminds me to watch more of your other videos in future. Good work and keep it up.

    • @tuplespectra 4 months ago +1

      Glad it was helpful! Thanks for your comment.

  • @user-cg3vz2ug7w 11 months ago +1

    Thank you for this amazing, well-explained project on Apache Airflow. I hope you'll make a tuto on Apache Superset too.

    • @tuplespectra 11 months ago

      I'm glad you love it! Thank you!

  • @onuhmichael4312 5 months ago +1

    Helped me to upskill my knowledge, Awesome tutorial keep it up!!!

    • @tuplespectra 5 months ago

      Glad it helped!

    • @todayspecial9705 3 months ago

      so how do we make an instance. do we have to pay for this?

  • @ezechukwufidelis5978 10 months ago

    you're good. I am looking forward to the next video.

    • @tuplespectra 9 months ago

      Thank you so much. I'm glad you find our videos valuable.

  • @user-dz8ro7xy9h 4 months ago

    Simply… you're the best

    • @tuplespectra 4 months ago

      Thanks so much for your comment. It made my day.

  • @SonuKumar-fn1gn 10 months ago

    Thank you so much for the video. ❤😊

  • @stenardt 9 months ago

    Excellent tutorial

  • @user-ci5fy3vr7d 10 months ago

    Awesome explanation

  • @guidysoll 7 months ago

    Thank you very much for the amazing content. Already subscribed to your channel.

    • @tuplespectra 7 months ago

      You are welcome. Awesome! Thank you for the sub!

  • @abhijeetsuryawanshi4064 5 months ago +1

    Great video 👍👍👍👍👍

  • @pavanparvathanenii4471 months ago

    Amazing video❤

    • @tuplespectra months ago

      Glad you liked it!!

  • @Oikawa_13 11 months ago

    Thank you so much

  • @devangpatel1341 11 months ago

    You are clear!!

  • @AlmirTvojDrug 7 months ago

    Thank you for this good end-to-end example. However, there is some overhead: for example, you don't need AWS STS tokens, since the AWS CLI setup is enough. Also, there is no need for an IAM role for EC2. Greetings from Croatia!

  • @Dataengineer-ci8fe 7 months ago

    Thank you so much for the video. I just have a question: how can we keep the Airflow scheduler running after exiting SSH? I noticed that the Airflow scheduler stops running after a while, or when you close the connection to the EC2 instance.
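
One possible way to approach the question above, sketched in Python under stated assumptions: start the process in its own session so it is not tied to the SSH terminal, and send its output to a log file. The helper name `start_detached` and the log file name are hypothetical, and `start_new_session=True` only applies on POSIX systems such as the Ubuntu instance used here. Tools like `nohup`, `tmux`, or a systemd service are common alternatives.

```python
import subprocess


def start_detached(command, log_path):
    """Start `command` in a new session, writing its output to `log_path`.

    Returns the process ID of the started process.
    """
    log_file = open(log_path, "ab")
    process = subprocess.Popen(
        command,
        stdin=subprocess.DEVNULL,
        stdout=log_file,
        stderr=subprocess.STDOUT,
        start_new_session=True,  # POSIX only: detach from the controlling terminal
    )
    log_file.close()  # the child process keeps its own copy of the file handle
    return process.pid


# Example call (not executed here), assuming the airflow command is on PATH:
# pid = start_detached(["airflow", "standalone"], "airflow_standalone.log")
```

The returned PID can be used later to stop the process, for example with `kill <pid>` on the instance.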

  • @shanevarnum 11 months ago

    Good tutorial

  • @mohamedelbrawy1222 5 months ago +1

    I am very grateful to see this kind of teaching. I searched a lot for a way to connect Airflow to the free instance, and I found a blog post on Medium that demonstrates it: "How to Install Apache Airflow on AWS EC2 Instance?"

    • @tuplespectra 5 months ago +1

      Glad it was helpful!

  • @jerbear97 8 months ago

    excellent vibe, i love it. other tutorial make me sleep

    • @tuplespectra 8 months ago

      Thanks for your comment. It really means a lot to me.

  • @obinnaasiegbu2052 11 months ago

    This is the bomb! Ose!

  • @AL-kn9wd 9 months ago +2

    It will work without an AWS access key too; I think you just had to wait a few seconds longer. I liked this project and hope to see more!

    • @tuplespectra 9 months ago

      Thank you! I'm glad you find the video valuable.

    • @nitropan 5 days ago

      can you tell us how? I also created the role but it does not seem to allot the ObjectPut access.. even with S3FullAccess

  • @MrBenStringer 2 months ago +1

    Been following along with this but used GCP instead as it's got $300 free credits for sign up. Might be worth doing that in future?
    Thanks for the great content. Teaching style is brilliant. I'll definitely be checking out your other videos. All the best man.

  • @BI-Rahul 11 months ago +1

    This video is an absolute game-changer for anyone looking to build and automate Python ETL pipelines with Airflow on AWS EC2!
    Waiting for part 2!

    • @tuplespectra 11 months ago

      Second part coming out soon.

  • @adeyinkaadegbenro9645 9 months ago

    Ola, you dey try, nice one

  • @Vikasptl07 months ago

    Good beginner's video. I am personally not a fan of running your processing code and orchestration on the same instance. How do you ensure package dependencies, manage virtual environments, and allocate resources for different workloads?

  • @user-cp5wf1tl7w 10 months ago

    Thank you so much, my father

  • @ajtam05 2 months ago +1

    When choosing the OS for the instance...does it matter what the local machine is? I'm having error issues with trying to call Airflow "airflow standalone" command. There are multiple errors so have to look at each one, but I wanted to see first if selecting the same OS "ubuntu" and other criteria might be the issue?

    • @ajtam05 2 months ago

      Issue was the storage size. “ec2 triggerer_job_runner.py:576} info - triggerer's async thread was blocked.....". Thought it was a different issue because there were many error msgs (like Kubernetes not installed). But after googling the issue...finally found out it was storage size of the VM. That resolved it.

  • @BI-Rahul 11 months ago +8

    I followed the exact same steps this weekend, but this time I am encountering an error when I run airflow standalone. The exact same steps I followed last weekend worked flawlessly. Below is the error I am getting this time. Any help would be appreciated. Error:
    pydantic.errors.PydanticUserError: A non-annotated attribute was detected: `dag_id = `. All model fields require a type annotation; if `dag_id` is not meant to be a field, you may be able to resolve this error by annotating it as a `ClassVar` or updating `model_config['ignored_types']`.

    • @totnguyen3308 11 months ago +2

      I also had the same error as you.

    • @raghav01211 11 months ago +2

      Same for me....Same issue

    • @bukolasalami9021 11 months ago +1

      I had the same error

    • @bukolasalami9021 11 months ago +2

      I was able to get past the error. All I did was downgrade from pydantic 2.0 to 1.10.10. I hope it works for you.

    • @totnguyen3308 11 months ago

      @bukolasalami9021 How do I downgrade from pydantic 2.0 to 1.10.1?
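
A minimal sketch of how the downgrade mentioned in this thread could be checked and prepared, assuming pip is available for the interpreter that runs Airflow. The helper names are hypothetical, and the pinned version 1.10.10 is simply the one reported as working above. Airflow's installation documentation also recommends installing with a constraints file, which is meant to avoid this kind of dependency mismatch.

```python
import sys
from importlib import metadata


def installed_major_version(package: str):
    """Return the major version of an installed package, or None if it is absent."""
    try:
        version = metadata.version(package)
    except metadata.PackageNotFoundError:
        return None
    return int(version.split(".")[0])


def downgrade_command(package: str, version: str) -> list:
    """Build a pip command that pins `package` for the current interpreter."""
    return [sys.executable, "-m", "pip", "install", f"{package}=={version}"]


major = installed_major_version("pydantic")
if major is not None and major >= 2:
    # Only print the command; running it is left to the reader.
    print("Consider running:", " ".join(downgrade_command("pydantic", "1.10.10")))
else:
    print("pydantic major version:", major)
```

Using `sys.executable -m pip` rather than a bare `pip` keeps the install tied to the same Python environment that produced the error.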

  • @Fordalo 7 months ago +7

    my guy is 100% nigerian

  • @chiragmadhukar9504 4 months ago

    I was able to open Airflow on the EC2 instance and establish an SSH connection from my Visual Studio. However, my DAG isn't being created in Airflow. Can you please help?

  • @manaschauhan2418 7 months ago

    Amazing explanation. I tried this with an Azure VM; if anyone else is trying that, the installation steps for Airflow will be a little different.

  • @JuanCruz-nu4mg 11 months ago +3

    I was able to do this just fine. I had 2 issues, though: the t2.small didn't have enough memory and it kept freezing up, so I had to use a t2.medium instance. Then it worked perfectly!

    • @tuplespectra 11 months ago

      I'm glad all went well for you. Thanks for watching the video.

    • @swarupsaha9451 10 months ago

      Is medium available in the free tier?

    • @JuanCruz-nu4mg 10 months ago

      @swarupsaha9451 Unfortunately, no.

    • @tuplespectra 10 months ago +1

      @swarupsaha9451 No, you will have to pay for a medium instance. However, it is not expensive as long as you don't leave it running for extended periods.

    • @ritvikrajsingh6228 6 months ago +2

      t2.small has too little RAM to run Airflow. One alternative solution is to use swap space.

  • @zahraelhaddi6980 5 months ago

    thank you so much. i did this on azure but the connection part is different than aws and it didnt work

  • @ritesh_ojha 2 months ago

    @tuplespectra When I try to log in to Airflow using the instance's public IP and port 8080, I am unable to connect. I have tried everything related to the security settings.

  • @biloloonguesamuel1940 11 months ago

    Hi, can I install the pandas package in the Airflow container for Docker?

  • @ahmeddadjio2003 6 months ago +1

    Great video! Can you please help me with the installation of Airflow? I am having a few problems.

    • @tuplespectra 6 months ago

      Thanks! What's the problem you got?

  • @seth_king_codes 9 months ago

    At 58:41 you say to save the Python file and it should sync to Airflow. I've waited several minutes but don't see the DAG appearing.
    How do I ensure the file is saved? Or can I see where it failed to be uploaded to Airflow?

    • @tuplespectra 9 months ago

      I suggest you check your code to make sure there are no mistakes or typos, and then restart your Airflow server. Let me know if this fixes your issue. Thanks for reaching out.

    • @seth_king_codes 9 months ago

      @tuplespectra OK, I didn't restart my Airflow server. Sounds like that's where I may have gone wrong.
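
As a complement to the advice in this thread, a DAG file can be checked for syntax mistakes before waiting for it to appear in the UI. The sketch below uses only the standard library; the function name `find_syntax_error` and the example path are hypothetical. It catches syntax errors only, not missing modules or other import-time failures, which Airflow reports as a "Broken DAG" message like the ones quoted elsewhere in these comments.

```python
import ast
from pathlib import Path


def find_syntax_error(dag_path: str):
    """Return a readable message if the file has a syntax error, else None."""
    source = Path(dag_path).read_text(encoding="utf-8")
    try:
        ast.parse(source, filename=dag_path)
    except SyntaxError as exc:
        return f"{dag_path}:{exc.lineno}: {exc.msg}"
    return None


# Example call (not executed here), using a DAG path of the kind seen in this thread:
# print(find_syntax_error("/home/ubuntu/airflow/dags/weather_dag.py"))
```

If the function returns None and the DAG still does not appear, the file location and the Airflow logs are the next places to look.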

  • @DEEKSHITH-bw6bo 8 months ago

    I am getting the error at the last step, i.e to store the csv file in the s3 bucket. can you please help me with that
    the log file shows like-- ERROR - Failed to execute job 42 for task transform_load_weather_data ([Errno 22] The provided token is malformed or otherwise invalid.; 54783)

  • @ThanmayiR 9 months ago

    Access key for openweather is not working when integrated with airflow, but it is working fine if used without airflow. Any reason why this could be happening?

    • @tuplespectra 8 months ago

      I'm guessing you are probably making some mistakes. Could you watch the video again and follow it step-by-step? That way you might be able to see what the issue is.

  • @vaibhavverma1340 10 months ago

    It was an excellent session, no doubt. Can you please tell us how to do this for multiple countries, such as India, America, France, etc.?

    • @tuplespectra 10 months ago +1

      Check out this video where I showed how to extract for multiple cities th-cam.com/video/ocFzNmgYW9o/w-d-xo.html

    • @vaibhavverma1340 10 months ago

      @tuplespectra Thank you so much for addressing my concern :)
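
The linked video covers the author's own approach to multiple cities. Independently of that, one simple way to think about it is to build one request URL per city and loop over them. This is a rough sketch: the function name and the placeholder API key are assumptions, and the endpoint is the OpenWeatherMap current-weather endpoint with its `q` and `appid` query parameters.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.openweathermap.org/data/2.5/weather"


def build_weather_urls(cities, api_key):
    """Return a mapping of city name to its current-weather request URL."""
    return {
        city: f"{BASE_URL}?{urlencode({'q': city, 'appid': api_key})}"
        for city in cities
    }


# "YOUR_API_KEY" is a placeholder, not a real key.
urls = build_weather_urls(["Paris", "New York", "Mumbai"], "YOUR_API_KEY")
for city, url in urls.items():
    print(city, "->", url)
```

Each URL could then be fetched and transformed in turn, with the resulting rows combined into one dataset before loading.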

  • @jyothikasiri4008 5 months ago

    When I import DAG from airflow, it shows that the module airflow was not found. Can you please guide me?

  • @ikennaochei3027 11 months ago

    Great tutorial. However can you show how not to hard code the critical information like key id, secret key and the rest?

    • @tuplespectra 10 months ago

      Thank you! We will explore this in another video. Noted!
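
Until that follow-up video, one common pattern for avoiding hard-coded secrets is to read them from environment variables. The sketch below is an assumption about how this could look, not the author's method: the function name is hypothetical, the variable names are the standard AWS ones, and the returned keys (`key`, `secret`, `token`) match the argument names that s3fs accepts. Airflow Variables and Connections are another option.

```python
import os

REQUIRED_VARS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")


def load_aws_credentials(env=os.environ) -> dict:
    """Read AWS credentials from environment variables instead of source code."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    credentials = {
        "key": env["AWS_ACCESS_KEY_ID"],
        "secret": env["AWS_SECRET_ACCESS_KEY"],
    }
    # The session token is optional; it is only present for temporary credentials.
    token = env.get("AWS_SESSION_TOKEN")
    if token:
        credentials["token"] = token
    return credentials
```

A dictionary like this can be passed as `storage_options` when pandas writes a CSV to an `s3://` path, so the keys never appear in the DAG file itself.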

  • @Mehtre108 3 months ago

    Hello sir,
    Do you have any end-to-end AWS data engineering projects that we could talk about in interviews as 2 YOE?

    • @tuplespectra 3 months ago

      You can work on my projects, understand them and discuss them during interviews.

  • @user-ue8ut8uu2g 8 months ago

    MySQL vs PostgreSQL: which one should I go for as a beginner?

    • @tuplespectra 8 months ago

      Actually, you can go for either one as a beginner. But since you are already getting the foundations of SQL from my channel using PostgreSQL, I would say go with PostgreSQL to master your SQL skills.

  • @gyungyoonpark 4 months ago

    I want to make my airflow run every hour, but it seems my token is expired (the one that I get with aws sts get-session-token). how can i get aws credentials that has no expiration?
    for example, each time i run, how can i run aws sts get-session-token and make the result into python variable?
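
A sketch of the second part of this question, assuming the AWS CLI is installed and configured on the instance: run `aws sts get-session-token` from Python and parse its JSON output into a dictionary. The function names are hypothetical. Session tokens expire by design, so for an hourly schedule other commenters' suggestion of relying on the AWS CLI configuration or an IAM role attached to the EC2 instance may be the simpler route.

```python
import json
import subprocess


def parse_session_token(cli_output: str) -> dict:
    """Extract the credential fields from `aws sts get-session-token` JSON output."""
    creds = json.loads(cli_output)["Credentials"]
    return {
        "key": creds["AccessKeyId"],
        "secret": creds["SecretAccessKey"],
        "token": creds["SessionToken"],
        "expiration": creds["Expiration"],
    }


def fetch_session_token() -> dict:
    """Call the AWS CLI and return fresh temporary credentials."""
    result = subprocess.run(
        ["aws", "sts", "get-session-token", "--output", "json"],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the CLI call fails
    )
    return parse_session_token(result.stdout)
```

Calling `fetch_session_token()` at the start of the load task would give that run its own fresh token instead of one pasted into the code.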

  • @Rohan-pl1rz 2 months ago

    Hi,
    I am getting an error when I copy the running instance's Public IPv4 DNS and open it on port 8080.
    Error: connection timed out
    Please help. Thanks.
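
A quick way to narrow down a timeout like this is to test whether the port is reachable at all from your own machine. The sketch below uses only the standard library; the function name and the example host are hypothetical. If the port is not reachable, likely causes (assumptions, not confirmed for this case) include a security group that does not allow inbound traffic on 8080 or an Airflow webserver that is not running.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Example call (not executed here); replace the placeholder with your Public IPv4 DNS:
# print(is_port_open("ec2-xx-xx-xx-xx.compute-1.amazonaws.com", 8080))
```

If the port is open but the page still does not load, another commenter in this section reports that appending `/login` after `:8080` brought up the Airflow login page.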

  • @brayanrobertosanchezfigueroa 8 months ago

    If we are working in a venv, we can't use sudo.

  • @Vilayat_Khan 28 days ago

    yay, i managed to finish it! and i have the csv file in s3. thx, u deserve the like lol.

    • @tuplespectra 20 days ago

      Nice work! You did it! Keep learning! Keep growing!

  • @ndthdproduction2900 4 months ago

    Hello bro can you explain why airflow throw DAG import error: Broken DAG: [/home/ubuntu/airflow/dags/weather_dag.py] Traceback (most recent call last):
    File "", line 241, in _call_with_frames_removed
    File "/home/ubuntu/airflow/dags/weather_dag.py", line 8, in
    import pandas as pd
    ModuleNotFoundError: No module named 'pandas'
    I installed pandas with sudo pip install pandas but it still there. Could you explain this. Thank you so much for your work !
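
One plausible explanation for this error, offered as an assumption rather than a diagnosis: `sudo pip install` installs into the system Python, while Airflow may be running from a different interpreter such as the `airflow_venv` virtual environment. The sketch below reports which interpreter is in use and whether the modules this project needs can be found by it; the function name is hypothetical.

```python
import importlib.util
import sys


def module_report(module_names):
    """Map each module name to True if the current interpreter can find it."""
    return {name: importlib.util.find_spec(name) is not None for name in module_names}


print("Interpreter:", sys.executable)
print("In a virtual environment:", sys.prefix != sys.base_prefix)
for name, found in module_report(["pandas", "s3fs", "airflow"]).items():
    print(f"{name}: {'found' if found else 'MISSING'}")
```

Run it with the same Python that starts Airflow. If pandas shows as missing there, installing it with that interpreter (`python -m pip install pandas`, without sudo, inside the activated environment) is worth trying.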

  • @yashmatha6180 4 months ago +1

    I am getting Permission Denied: No AccessKey Presented when i try to run my DAG.
    I have created access keys from my AWS, and have also pasted them in aws_credentials variable. Please Help!!

    • @JayRavalani 3 months ago

      did you find a workaround for this error? i am also facing the same issue.

  • @joshuaroberts3987 6 months ago

    when i connected my vscode to my EC2 instance i didnt get all those files in my vscode. just the .py file. i didnt get airflow file or any other files? please help!

  • @godswillomonkhodion6393 10 months ago

    Each time I trigger my DAG, it does not seem to respond. The border around each task is always white. What might be the problem?
    I also noticed that auto-refresh on the graph page does not stay active like yours. It is deactivated even when I try to activate it.

    • @tuplespectra 10 months ago

      What EC2 instance did you use? micro, small, medium?

    • @godswillomonkhodion6393 10 months ago

      T3.small 2gig

    • @tuplespectra 10 months ago +1

      @godswillomonkhodion6393 Can you try recreating my project using a medium instance? Maybe your EC2 instance was freezing. Try that and let me know what you find. Thanks!

    • @godswillomonkhodion6393 10 months ago

      @tuplespectra Alright. I will give feedback afterwards. Thanks.

    • @godswillomonkhodion6393 10 months ago

      @tuplespectra Yes, this fixed the problem.

  • @mihirgharat7585 months ago

    my ec2 instance is not loading using a t2 micro instance. could that be the only reason?

    • @tuplespectra months ago

      That's one possible reason. It has happened to me before.

    • @mihirgharat7585 months ago

      It worked now. But now I’m facing a new issue. My airflow dag is not reflecting on airflow

  • @RaghulS-nl6wx 8 days ago

    My load_data task failed. I configured everything correctly, but the last task still fails and I couldn't figure out why. Has anyone with the same scenario found a solution?

  • @kristiandaclan9236 16 days ago

    If you have problems installing dependencies, it is because of sudo apt install python3.10-venv. Replace it with sudo apt install python3-venv to get the latest version. Currently, it's at 3.12.

    • @JayPatel-yf7ot 3 days ago

      Invalid operation upon using that.

  • @khaliltrabelsi5831 4 months ago

    I got an error after launching airflow standalone. Can you help me, please?
    SqlAlchemySessionInterface.__init__() missing 6 required positional arguments: 'sequence', 'schema', 'bind_key', 'use_signer', 'permanent', and 'sid_length'
    I tried pip install Flask-Session, but nothing changed. I had the same error in my WSL and solved it with Flask-Session.

    • @krisminvekariya8341 3 months ago

      how did you solve it can you please tell me ?

  • @JayPatel-yf7ot 3 days ago

    I am getting following error while creating a virtual environment.
    E: Unable to locate package python3.11.9-venv
    E: Couldn't find any package by glob 'python3.11.9-venv'
    If anyone can help then it would be great!!
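
A note on the package name in this error: Ubuntu's venv packages are named after the major and minor Python version only (as in the `python3.10-venv` command from the description), so a name that includes the patch level, such as `python3.11.9-venv`, will not be found. The small sketch below derives the expected name from the running interpreter; the function name is hypothetical, and whether that package exists depends on the Ubuntu release, so the generic `python3-venv` package is the fallback.

```python
import sys


def venv_package_name(version_info=sys.version_info) -> str:
    """Return the apt package name for venv support, using major.minor only."""
    return f"python{version_info[0]}.{version_info[1]}-venv"


print("Try: sudo apt install", venv_package_name())
print("Or the distribution default: sudo apt install python3-venv")
```

Running this with `python3` on the instance prints the package name that matches the Python version actually installed there.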

  • @ishanpatil2017 10 months ago

    Can I add this project to my resume?

    • @tuplespectra 10 months ago

      Yes absolutely! But make sure you complete the project first and ensure you understand the concepts explained.

  • @user-ey7dp7sh5o 7 months ago

    Hi,
    My DAG breaks because of "import pandas as pd". It's giving me the error below in Airflow:
    Broken DAG: [/home/ubuntu/airflow/dags/weather_api_etl_asp.py] Traceback (most recent call last):
    File "", line 241, in _call_with_frames_removed
    File "/home/ubuntu/airflow/dags/weather_api_etl_asp.py", line 7, in
    import pandas as pd
    ModuleNotFoundError: No module named 'pandas'
    I've pandas installed on my EC2 instance.

    • @tuplespectra 7 months ago

      Try sudo pip install pandas if you haven't tried it.

    • @user-ey7dp7sh5o 7 months ago

      @tuplespectra I tried that too. I also tried uninstalling and reinstalling, specifically a version compatible with my Python version. Lately I tried some Python coding in JupyterLab/Notebook and found the same import error. Has anyone else faced this weird import error, not just in this project but when running any .py file? Please shed some light on this.

  • @mudasurbasha5742 6 months ago

    I'm stuck at the airflow standalone command. After running it, it continuously keeps printing what looks like a conversation between the webserver and the triggerer. Please help me out.

    • @tuplespectra 5 months ago

      I believe it is working fine. You just need to go grab the user name and pw that airflow created for you and enter them on the airflow UI to sign in.

    • @tuplespectra 5 months ago

      Let me know what you get.

    • @mudasurbasha5742 5 months ago

      @tuplespectra If you could give me your Gmail address, I can send you some screenshots and explain clearly in the email where I am having the problem.

    • @mudasurbasha5742 5 months ago

      @tuplespectra To enter the username and password I need to start Airflow first, right, by running the airflow standalone command? After running airflow standalone, the Airflow UI does not come up.

    • @tuplespectra 5 months ago

      @mudasurbasha5742 tuplespectra@gmail.com

  • @vrushalip1110 8 months ago

    I am trying to do this project, but whenever I hit the run button in Airflow for the first DAG (is_weather_api_ready), my SSH session in VS Code starts reconnecting and Airflow just keeps loading without giving any output. If anyone can help, it would be really appreciated.

    • @tuplespectra 8 months ago

      What EC2 instance are you using?

    • @vrushalip1110 8 months ago

      @tuplespectra It finally worked out!
      Thanks

    • @vrushalip1110 8 months ago

      This was my first project in Airflow and your explanation was amazing. Good work.

    • @tuplespectra 8 months ago

      @vrushalip1110 Awesome. I'm glad it worked out for you.

    • @tuplespectra 8 months ago

      @vrushalip1110 Great! Good job completing the project.

  • @user-sf4mm8dp3s
    @user-sf4mm8dp3s 5 months ago

    airflow standalone gives an error saying 'No response from Gunicorn master within 120 seconds'

  • @ashraf950901
    @ashraf950901 9 months ago

    Help. I can't connect VS Code to the remote host. It says "Could not establish connection".

    • @tuplespectra
      @tuplespectra 9 months ago

      I have a video that explains how to do that from scratch. Here is the link; let me know if it works for you: th-cam.com/video/sQQjMnEkGjs/w-d-xo.html

    • @ashraf950901
      @ashraf950901 9 months ago

      @tuplespectra I followed your instructions. When I connect to the host, while the remote window is opening, it gives me: Could not establish connection to "airflow_project": The operation timed out. I'm using a Mac, by the way.

    • @tuplespectra
      @tuplespectra 9 months ago +1

      @ashraf950901 I'm not sure that using a Mac is the issue. Can you make sure that the settings in your SSH config file are correct? Make sure the IdentityFile points to the absolute path of the .pem file. Also make sure that the HostName is the current public IPv4 address of your running EC2 instance. Let me know what you find.
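
A timeout (as opposed to an authentication error) often means the port never answered, for example because the instance's public IP changed after a restart or the security group does not allow the port. As a quick check before editing the SSH config, here is a minimal sketch that tests whether a TCP port is reachable. The function name `is_port_open` and the example address are hypothetical; substitute your instance's current public IPv4 address.

```python
import socket


def is_port_open(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections, and unresolvable host names.
        return False


# Example usage (hypothetical address; replace with your instance's public IPv4):
#   is_port_open("203.0.113.10", 22)    # SSH, used by VS Code Remote
#   is_port_open("203.0.113.10", 8080)  # Airflow web UI
```

If port 22 is unreachable from your laptop, no SSH config change will help until the address or the security group rule is corrected.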

  • @syedhashir5014
    @syedhashir5014 11 months ago

    Great project. Please share your LinkedIn profile.

    • @tuplespectra
      @tuplespectra 10 months ago

      Thank you! www.linkedin.com/in/opeyemi-olanipekun-ph-d-pmp-certified-six-sigma-black-belt-02735133/

  • @ajtam05
    @ajtam05 2 months ago

    Is anyone else unable to access the "Public IPv4 DNS" address (using the custom port 8080 created earlier)? Mine times out and says "This site can't be reached."

    • @ajtam05
      @ajtam05 2 months ago +1

      Never mind. I figured it out. I had to add "/login" at the end of the path (after :8080) for the Airflow login page to display.

    • @tuplespectra
      @tuplespectra 2 months ago +1

      Good job figuring it out.

    • @ajtam05
      @ajtam05 2 months ago

      @tuplespectra Thanks! I'm pretty sure I didn't miss this part, but the "transform_load_data" function refers to an S3 bucket, and I don't believe that was covered (unless it was in a different video). In any case, I'm getting PutObject Access Denied errors with it. The security setup is a bit confusing even after looking it up. Is there a video where you went over the S3 bucket, or an easy fix for the access issue? Thanks.

    • @ajtam05
      @ajtam05 2 months ago

      @tuplespectra I guess I was able to figure this one out as well. It looks like I had to change the "Bucket policy" under the S3 bucket's "Permissions". PutObject access was being denied. I could have just put "Action": "s3:PutObject", but I left it wide open with "Action": "s3:*".

    • @ajtam05
      @ajtam05 2 months ago

      @tuplespectra Oh, I didn't realize you went over the S3 bucket AFTER going over the Python script. Haha, I guess it works either way.
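
For readers who would rather not leave the bucket wide open with `s3:*`, here is a minimal sketch of a narrower bucket policy that grants only `s3:PutObject` to a single IAM role. The bucket name, account ID, role name, and the helper `put_only_bucket_policy` are all hypothetical placeholders, and this is one possible policy under those assumptions, not the setup shown in the video.

```python
import json


def put_only_bucket_policy(bucket_name, role_arn):
    """Build a bucket policy that lets one IAM role write objects and nothing more."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowAirflowRolePutObject",
                "Effect": "Allow",
                "Principal": {"AWS": role_arn},
                "Action": "s3:PutObject",
                # PutObject acts on objects, so the resource ends in /*
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
            }
        ],
    }


if __name__ == "__main__":
    policy = put_only_bucket_policy(
        "my-weather-bucket",  # hypothetical bucket name
        "arn:aws:iam::123456789012:role/my-ec2-airflow-role",  # hypothetical role ARN
    )
    # Paste the printed JSON into the bucket's Permissions > Bucket policy editor.
    print(json.dumps(policy, indent=2))
```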

  • @syedhashir5014
    @syedhashir5014 11 months ago

    I want data for every city in a particular country. How can I do that?

    • @tuplespectra
      @tuplespectra 10 months ago

      Check out this video, where I show how to extract data for multiple cities: th-cam.com/video/ocFzNmgYW9o/w-d-xo.html

    • @syedhashir5014
      @syedhashir5014 10 months ago

      @tuplespectra Thanks, man.
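
As a rough illustration of the multi-city idea (not necessarily the approach taken in the linked video), here is a minimal sketch that builds one OpenWeatherMap current-weather request URL per city. It assumes you supply your own list of city names for the country; the helper name `build_weather_urls` and the placeholder API key are hypothetical.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.openweathermap.org/data/2.5/weather"


def build_weather_urls(cities, country_code, api_key):
    """Return a dict mapping each city to its current-weather request URL."""
    urls = {}
    for city in cities:
        # The q parameter takes "city name,country code".
        query = urlencode({"q": f"{city},{country_code}", "appid": api_key})
        urls[city] = f"{BASE_URL}?{query}"
    return urls


if __name__ == "__main__":
    # Hypothetical inputs; replace with your own city list and API key.
    for city, url in build_weather_urls(["Portland", "Seattle"], "US", "YOUR_API_KEY").items():
        print(city, url)
```

Each URL can then be fetched in a loop (or in one Airflow task per city) and the results combined before loading to S3.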

  • @krisminvekariya8341
    @krisminvekariya8341 3 months ago

    The airflow standalone command gives the error below: TypeError: SqlAlchemySessionInterface.__init__() missing 6 required positional arguments: 'sequence', 'schema', 'bind_key', 'use_signer', 'permanent', and 'sid_length'

    • @tuplespectra
      @tuplespectra 3 months ago

      At what point in the video are you getting this error?

    • @chandananavuluri694
      @chandananavuluri694 3 months ago

      I'm also facing the same problem when I run the airflow standalone command.

  • @swarupsaha9451
    @swarupsaha9451 10 months ago

    It's not taking me to the Airflow login page on port 8080.

    • @tuplespectra
      @tuplespectra 10 months ago

      Did you start the airflow server (airflow standalone) before trying to load the airflow login page?

  • @sumitrawall
    @sumitrawall 4 months ago

    After running 'airflow standalone' on my EC2 instance, I am getting this error: TypeError: SqlAlchemySessionInterface.__init__() missing 6 required positional arguments: 'sequence', 'schema', 'bind_key', 'use_signer', 'permanent', and 'sid_length'. Can anyone help me resolve this?

    • @sumitrawall
      @sumitrawall 4 months ago +1

      Traceback (most recent call last):
      File "/usr/local/bin/airflow", line 8, in <module>
      sys.exit(main())
      File "/usr/local/lib/python3.10/dist-packages/airflow/__main__.py", line 57, in main
      args.func(args)
      File "/usr/local/lib/python3.10/dist-packages/airflow/cli/cli_config.py", line 49, in command
      return func(*args, **kwargs)
      File "/usr/local/lib/python3.10/dist-packages/airflow/cli/commands/standalone_command.py", line 53, in entrypoint
      StandaloneCommand().run()
      File "/usr/local/lib/python3.10/dist-packages/airflow/utils/providers_configuration_loader.py", line 55, in wrapped_function
      return func(*args, **kwargs)
      File "/usr/local/lib/python3.10/dist-packages/airflow/cli/commands/standalone_command.py", line 69, in run
      self.initialize_database()
      File "/usr/local/lib/python3.10/dist-packages/airflow/cli/commands/standalone_command.py", line 179, in initialize_database
      db.initdb()
      File "/usr/local/lib/python3.10/dist-packages/airflow/utils/session.py", line 79, in wrapper
      return func(*args, session=session, **kwargs)
      File "/usr/local/lib/python3.10/dist-packages/airflow/utils/db.py", line 733, in initdb
      _create_db_from_orm(session=session)
      File "/usr/local/lib/python3.10/dist-packages/airflow/utils/db.py", line 718, in _create_db_from_orm
      _create_flask_session_tbl(engine.url)
      File "/usr/local/lib/python3.10/dist-packages/airflow/utils/db.py", line 711, in _create_flask_session_tbl
      db = _get_flask_db(sql_database_uri)
      File "/usr/local/lib/python3.10/dist-packages/airflow/utils/db.py", line 700, in _get_flask_db
      AirflowDatabaseSessionInterface(app=flask_app, db=db, table="session", key_prefix="")
      TypeError: SqlAlchemySessionInterface.__init__() missing 6 required positional arguments: 'sequence', 'schema', 'bind_key', 'use_signer', 'permanent', and 'sid_length'

    • @gyungyoonpark
      @gyungyoonpark 4 months ago

      @sumitrawall I am having the same problem. Have you managed to solve it?

    • @sumitrawall
      @sumitrawall 4 months ago

      @gyungyoonpark No, brother 😔

    • @josephostrow4876
      @josephostrow4876 4 months ago

      Having the same issue.

    • @khaliltrabelsi5831
      @khaliltrabelsi5831 4 months ago

      Same error, @tuplespectra.
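
This thread never reaches a confirmed fix. One commonly reported cause of this exact TypeError is an unpinned `pip install apache-airflow` pulling in a Flask-Session release newer than the one that Airflow version was built against; I am assuming, without confirmation from the thread, that this is what happened here. Airflow's documented installation method avoids such mismatches by pinning dependencies with a constraints file. Below is a minimal sketch that builds that install command; the helper name `airflow_install_command` and the Airflow version `2.8.1` are illustrative assumptions, while Python 3.10 is taken from the traceback above.

```python
import sys


def airflow_install_command(airflow_version, python_version=None):
    """Return a pip command that installs Airflow against its constraints file."""
    if python_version is None:
        # Default to the major.minor version of the running interpreter.
        python_version = f"{sys.version_info.major}.{sys.version_info.minor}"
    constraints_url = (
        "https://raw.githubusercontent.com/apache/airflow/"
        f"constraints-{airflow_version}/constraints-{python_version}.txt"
    )
    return (
        f'pip install "apache-airflow=={airflow_version}" '
        f'--constraint "{constraints_url}"'
    )


if __name__ == "__main__":
    # 2.8.1 is only an example version; use the release you actually want.
    print(airflow_install_command("2.8.1", "3.10"))
```

Reinstalling with the printed command, ideally in a fresh virtual environment, pins every dependency to a combination tested for that Airflow release.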

  • @Vilayat_Khan
    @Vilayat_Khan 29 days ago

    Don't ask for likes. I will only like this if I can finish the project and add it to my resume.

    • @vinodhm3369
      @vinodhm3369 28 days ago

      Very strict haha

  • @ajtam05
    @ajtam05 2 months ago +1

    I swear I always have issues with every. single. project. on YouTube. Nobody in the comments has issues with any project but me, lol. So nobody is having issues with installing Airflow? Really? SMH.

    • @user-ut8ln6ct4k
      @user-ut8ln6ct4k 2 months ago +1

      Bro, I feel the same. I guess no one is actually trying.

    • @antoniusdaivap7759
      @antoniusdaivap7759 1 month ago

      @user-ut8ln6ct4k Issues are a big part of IT projects. What matters is how you overcome them.

  • @AladinAiWisdom
    @AladinAiWisdom 2 months ago +1

    Thanks for the tutorial. Just one remark: there is no need to hard-code credentials, since you gave the IAM role full access and the EC2 instance assumes it. You can write directly to S3 from EC2.

    • @tuplespectra
      @tuplespectra 1 month ago

      You're welcome! Thanks for the remark.
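
To illustrate the remark, here is a minimal sketch of uploading a CSV from the EC2 instance without any access keys in the code. It assumes boto3 is installed and that the instance has an IAM role with write access to the bucket attached; in that case boto3 finds the role's temporary credentials on its own. The function names, the object key pattern, and the bucket name are hypothetical and are not taken from the video's script.

```python
from datetime import datetime


def build_object_key(city, now):
    """Build a timestamped object key such as current_weather_data_portland_02062024103000.csv."""
    return f"current_weather_data_{city.lower()}_{now:%d%m%Y%H%M%S}.csv"


def upload_csv(csv_text, bucket, key):
    """Upload CSV text to S3 using whatever credentials boto3 discovers."""
    import boto3  # imported lazily so build_object_key works without boto3 installed

    # No access key or secret is passed: on EC2, boto3 falls back to the
    # credentials of the IAM role attached to the instance.
    s3 = boto3.client("s3")
    s3.put_object(Bucket=bucket, Key=key, Body=csv_text.encode("utf-8"))


# Example usage on the instance (hypothetical bucket name):
#   key = build_object_key("Portland", datetime.now())
#   upload_csv("city,temp\nPortland,15.2\n", "my-weather-bucket", key)
```

With this pattern there is nothing to paste into the DAG file and nothing to leak if the script ends up in a repository.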

  • @Vilayat_Khan
    @Vilayat_Khan 28 days ago

    1:38:20 - I don't get why I need to run aws configure INSIDE my EC2 instance. Why do I need an access key when I'm already inside my EC2 instance?

  • @sivaji47
    @sivaji47 2 months ago +1

    Thank you so much

    • @tuplespectra
      @tuplespectra 2 months ago

      You're most welcome