Can you share the complete code via a GitHub link?
Hello Ravulapalli Venkata Gurnadham, the GitHub links are shared in the description box...
Apart from PuTTY and WinSCP, can we perform this on a Mac machine using the same approach, or..
Thanks. The video was useful. I have a doubt: instead of deploying the code on an EC2 instance, can we use Python shell jobs in AWS Glue? Glue jobs can be scheduled to run daily automatically.
Yes Bhujith Madav, you can use Python shell jobs in AWS Glue to process your data instead of deploying the code in an EC2 instance. AWS Glue is a fully managed ETL (Extract, Transform, and Load) service that allows you to easily and efficiently process data at scale.
With AWS Glue, you can create and schedule Python shell jobs that can be automatically executed on a daily basis. This allows you to automate your data processing and reduce the overhead of managing your own infrastructure. Additionally, AWS Glue provides many built-in connectors to popular data sources like Amazon S3, Amazon RDS, and Amazon Redshift, making it easy to extract and load data into your desired data store.
In summary, AWS Glue is a great option for running Python shell jobs and automating your data processing workflows.
Or you can put the Python script in the crontab on the EC2 instance and set a daily schedule using a cron expression.
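For anyone following along, the crontab approach could look like the sketch below. The script and log paths are hypothetical placeholders; the line that actually installs the entry is commented out so you can review it first:

```shell
# Hypothetical paths -- adjust the script and log locations for your instance.
# Standard 5-field cron: minute hour day-of-month month day-of-week.
CRON_LINE='0 6 * * * /usr/bin/python3 /home/ec2-user/fetch_news.py >> /home/ec2-user/fetch_news.log 2>&1'

# Uncomment to append it to the current user's crontab on the EC2 instance:
# ( crontab -l 2>/dev/null; echo "$CRON_LINE" ) | crontab -

echo "$CRON_LINE"
```

The trade-off versus Glue or Airflow is that the EC2 instance must stay running for cron to fire, and you get no retry or dependency handling out of the box.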
@@abhisekchowdhury8584 Thanks for your suggestion! While using cron expressions on an EC2 instance is definitely an option, I wanted to showcase how Airflow can simplify the scheduling and management of complex data pipelines. Plus, Airflow offers a lot of features such as task dependencies, error handling, and a user-friendly UI. Nonetheless, I appreciate your input and hope you find the video helpful😊!
@@KnowledgeAmplifier1 Thank you!
@@abhisekchowdhury8584 Thank you!
Great share, keep sharing keep shining 🌟.
Please make a video on how to capture CDC in Postgres and push it to S3.
Thank you Navej Pathan for the positive feedback! I'm glad you found the share useful. I will definitely consider creating a video on how to capture CDC in Postgres and push it to S3. Stay tuned for more content!
CDC in Postgres and also in MSSQL, please.
Instead of using Snowflake, can you please use Spark for processing? We need more videos involving Spark.
Hello tanakam venkata naresh, I've just made a new video on "Building a Batch Data Pipeline using Airflow, Spark, EMR & Snowflake" which I think you'll find really useful--th-cam.com/video/hK4kPvJawv8/w-d-xo.html
Check it out and let me know what you think!
The News API is down, unfortunately. It won't let me make an account.