Exactly what I was looking for. Crisp, clear and to the point!
Thank you, IamDocxy 😊 Happy Learning :-)
@KnowledgeAmplifier1 Hi sir, can you please create a video on a Glue job that reads data from S3 and loads it into a Snowflake table?
Sir, thank you for this video. It helped me a lot and your explanation is awesome. Please keep doing this; we will definitely support you.
Thanks a lot bro..lot of use cases for snowflake and aws learners…
You are simply awesome, Thank you for the knowledge share!!
Thank you for your kind words, Yadi! Happy Learning
4:10 says the Glue job is executed by Lambda, but there was no Lambda setup in the video. Do we need to use Lambda to call the Glue job?
Hello MJ Lee, I was explaining that the Glue job can be triggered from Lambda when a certain event occurs, if required. If you want to run a Glue job from a Lambda trigger, you can check this video:
th-cam.com/video/1tIM1jBmwD4/w-d-xo.html
Hope this will be helpful! Happy Learning :-)
daarun very good explanation.. one video full clarity
Thank you for your encouraging comment, Desi Bhasa Main 😊 Happy Learning ✌
You are a wonder and this is what I was looking for...thanks much
Glad to know the video was helpful to you praveen yadam! Happy Learning :-)
Very well presented and nice job
Thank You Keshava Mugulur Srinivas Iyengar! Happy Learning :-)
That was awesome ! Precise !
Thank You Mahendra Singh! Happy Learning :-)
crystal clear explanation thank you bro
You are welcome!
Sirji, in the initial architecture you said Glue will read data from S3, apply some transformation and write it to Snowflake, but later in the video you pulled data from Snowflake and wrote it back to Snowflake and S3.
Hello Amit Prasad, at 4:15-4:37 I mentioned that this video focuses on the integration between AWS Glue (or PySpark) and Snowflake, because the S3 to Lambda and Lambda to Glue part is already covered in a separate video. Since the primary focus here is Glue and Snowflake, I explained the possible scenarios around it: pulling data from Snowflake and writing it back to Snowflake, and pulling data from Snowflake and writing it to S3. If you want to explore S3 to Lambda and then Lambda to Glue, you can refer to this video: th-cam.com/video/1tIM1jBmwD4/w-d-xo.htmlsi=dYoD7GHeG3hhWAei Hope this answers your doubt. If you have any other doubt, please feel free to comment and I will try to help as much as possible.
Nice video. Please share the same for EMR without Airflow.
Hello Ali Mir faisal, you can refer to this video: th-cam.com/video/oJ6TvZu6DqQ/w-d-xo.html Happy Learning
Can we do the same to read CSV data from S3 and write it as a table into Snowflake?
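For the S3 CSV to Snowflake case asked about here, a minimal PySpark sketch could look like the following. It assumes the Snowflake Spark connector and Snowflake JDBC jars are already attached to the job, as in the video. The function name, the S3 path and the table name are hypothetical placeholders, not taken from the video.

```python
SNOWFLAKE_SOURCE = "net.snowflake.spark.snowflake"  # Snowflake Spark connector format name


def load_csv_to_snowflake(spark, s3_path, sf_options, target_table, mode="append"):
    """Read a CSV file from S3 and write it to a Snowflake table.

    sf_options is a dict with the connector keys (sfURL, sfUser, sfPassword,
    sfDatabase, sfSchema, sfWarehouse).
    """
    df = (
        spark.read
        .option("header", "true")       # assumption: first line holds column names
        .option("inferSchema", "true")  # let Spark guess the column types
        .csv(s3_path)
    )
    (
        df.write
        .format(SNOWFLAKE_SOURCE)
        .options(**sf_options)
        .option("dbtable", target_table)
        .mode(mode)                     # "append" keeps existing rows in the table
        .save()
    )
    return df
```

Inside a Glue job this would be called with the job's own SparkSession, for example `load_csv_to_snowflake(spark, "s3://my-bucket/input/data.csv", sf_options, "MY_TABLE")`, where the bucket and table are made-up names.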
Hi, how do I do this for EMR on EKS? How do I add the jar files in that case?
Hello bhaiya, I followed each step but I am still getting this error: "py4j.protocol.py4jjavaerror: an error occurred while calling o90.load snowflake". Please help me out.
Is it mandatory to have Spark to connect to Snowflake? Can’t we directly access data in Snowflake tables using SQL in AWS Glue’s python program? The reason I am asking this question is Spark is a big data analytics tool and not every application is meant for data analytics. Most business applications are Insert, Update, Select, Delete type SQL based programs. So can I embed these SQLs in AWS Glue’s Python scripts without using Spark in the code?
Hello Later Lname, you asked a very good question. I created this separate video to answer it:
th-cam.com/video/OJM2IkcIW_o/w-d-xo.html
Hope this will be helpful! Happy Learning :-)
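As a rough illustration of the "plain SQL without Spark" idea in this thread, the sketch below runs a list of SQL statements through any Python DB-API connection. The assumption is that the Snowflake Python connector (`snowflake-connector-python`) is installed in the job environment. `run_statements` and `connect_snowflake` are hypothetical helper names, and `sqlite3` is used only as a local stand-in so the example can run without a Snowflake account.

```python
import sqlite3


def run_statements(conn, statements):
    """Execute SQL statements in order; return the rows of the last statement."""
    cur = conn.cursor()
    try:
        rows = []
        for sql in statements:
            cur.execute(sql)
            # description is None when the statement produced no result set
            rows = cur.fetchall() if cur.description is not None else []
        conn.commit()
        return rows
    finally:
        cur.close()


def connect_snowflake(**params):
    """Open a Snowflake connection (assumes snowflake-connector-python is installed)."""
    import snowflake.connector  # imported lazily so the demo below runs without it
    # Typical params: user, password, account, warehouse, database, schema
    return snowflake.connector.connect(**params)


if __name__ == "__main__":
    # Local stand-in for a Snowflake connection, for demonstration only
    demo = sqlite3.connect(":memory:")
    print(run_statements(demo, [
        "CREATE TABLE emp (id INTEGER, name TEXT)",
        "INSERT INTO emp VALUES (1, 'a')",
        "UPDATE emp SET name = 'b' WHERE id = 1",
        "SELECT id, name FROM emp",
    ]))
```

With a real connection, the same call would be `run_statements(connect_snowflake(...), [...])`, with no Spark session involved.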
Thanks a lot Bro your video is awesome
Most welcome!
How can we find the compatible version for the jar files with the current spark version? Please reply.
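One way to approach the jar compatibility question is to read the versions out of the connector jar's file name and compare them with the running Spark and Scala versions. The sketch below assumes the jar follows the naming pattern `spark-snowflake_<scala>-<connector>-spark_<spark>.jar` (for example `spark-snowflake_2.12-2.9.0-spark_3.1.jar`); other releases may be named differently, and the helper names are hypothetical.

```python
import re

# Assumed naming pattern: spark-snowflake_<scala>-<connector>[-spark_<spark>].jar
_JAR_RE = re.compile(
    r"spark-snowflake_(?P<scala>\d+\.\d+)"
    r"-(?P<connector>\d+(?:\.\d+)+)"
    r"(?:-spark_(?P<spark>\d+\.\d+))?\.jar$"
)


def _major_minor(version):
    """Reduce '3.1.1' to '3.1' so patch levels do not affect the comparison."""
    return ".".join(version.split(".")[:2])


def parse_connector_jar(jar_name):
    """Return the scala, connector and spark versions encoded in the jar name."""
    match = _JAR_RE.search(jar_name)
    if match is None:
        raise ValueError(f"unrecognised connector jar name: {jar_name}")
    return match.groupdict()


def is_compatible(jar_name, spark_version, scala_version):
    """True when the jar's Scala (and Spark, if present) version matches the runtime."""
    info = parse_connector_jar(jar_name)
    if info["scala"] != _major_minor(scala_version):
        return False
    return info["spark"] is None or info["spark"] == _major_minor(spark_version)
```

In a PySpark session the runtime values can usually be read with `spark.version` and `spark.sparkContext._jvm.scala.util.Properties.versionNumberString()`. A `scala/Product$class` error on load is typically a sign that a jar built for Scala 2.11 is running on a Scala 2.12 runtime.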
Can we create a reverse integration, i.e. fetch huge data (80 million rows) from Snowflake to S3 without using a stage? We have only "read only" access to Snowflake.
Is AWS Glue mandatory for running Spark jobs on Snowflake?
Hello Swarnadeep Chowdhury, no, it's not mandatory. You can use other services where Spark can run, such as EMR. Here is a reference video: th-cam.com/video/oJ6TvZu6DqQ/w-d-xo.html
Happy Learning
Can you share the video link for the S3 and Lambda trigger?
Hello Vikram, if you want to trigger an AWS Glue job whenever a file lands in S3 (S3 to Lambda and then Lambda to the AWS Glue job), you can refer to this video:
th-cam.com/video/1tIM1jBmwD4/w-d-xo.html
Hope this will be helpful! Happy Learning :-)
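The S3 to Lambda to Glue flow described in this reply can be sketched roughly as below. This is only an illustration under assumptions: the Glue job name and the `--s3_bucket` / `--s3_key` argument names are hypothetical, and the optional `glue_client` parameter exists just so the handler can be exercised without AWS access.

```python
import urllib.parse

GLUE_JOB_NAME = "snowflake-load-job"  # hypothetical Glue job name


def lambda_handler(event, context, glue_client=None):
    """Start one Glue job run for every object reported in an S3 event."""
    if glue_client is None:
        import boto3  # available by default in the AWS Lambda Python runtime
        glue_client = boto3.client("glue")

    run_ids = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 event notifications
        key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])
        response = glue_client.start_job_run(
            JobName=GLUE_JOB_NAME,
            # Hypothetical argument names; the Glue script would read them as job arguments
            Arguments={"--s3_bucket": bucket, "--s3_key": key},
        )
        run_ids.append(response["JobRunId"])
    return {"job_run_ids": run_ids}
```

The Lambda execution role would also need permission to start the Glue job (`glue:StartJobRun`).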
Hello Sir,
I am trying to perform many Spark operations after reading the table (not just group by). I used the same jars but I am getting the following error: "An error occurred while calling o94.load. scala/Product$class". Do you know which jar will solve this issue? Thanks in advance.
One doubt, can you please answer: when should we go for Snowpipe and when for Glue?
Hello Vikram, Snowpipe is used for real-time data ingestion from a data lake into Snowflake, driven by services such as SQS or SNS. For any batch processing purpose, whether batch ingestion or transforming your data, you can use AWS Glue / EMR.
@KnowledgeAmplifier1 thanks for the quick reply
@vikinist no problem .. Happy Learning
Hello, I am getting a connection refused error. Any idea what could be the reason?
Please make a video to explain what you are doing in Snowflake.
Can you post one video with an S3 -> Glue -> RS pipeline (not using PySpark)?
May I ask what exactly the username for Snowflake is here? I don't know where to find it.
Hello Anh Do, the username is what you use to log in to the Snowflake web console. You might have set it up during sign-up, or your admin team can confirm it. If you are using OAuth, there will mostly be a dedicated user for connecting from Python, PySpark, etc.; the admin team in your project can confirm that as well.
Thanks a lot brother, very helpful, very good, easy and clear explanation. If I need to join 2 tables, can I specify the table names comma separated in "source_table_name" and perform the join in ".option("query","********")"? Please suggest. Thanks.
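On the join question, the Snowflake Spark connector accepts either a `dbtable` option (a single table) or a `query` option (a SQL statement), so the usual approach is to put the whole join in `query` rather than listing tables comma separated. A minimal sketch, assuming the connector jars are attached as in the video; the table names, the join key and the helper names are hypothetical.

```python
SNOWFLAKE_SOURCE = "net.snowflake.spark.snowflake"  # Snowflake Spark connector format name


def build_join_query(left, right, key, columns="*"):
    """Compose a simple inner join between two Snowflake tables on one key."""
    return f"SELECT {columns} FROM {left} a JOIN {right} b ON a.{key} = b.{key}"


def read_query(spark, sf_options, query):
    """Run the query in Snowflake and load the result as a Spark DataFrame."""
    return (
        spark.read
        .format(SNOWFLAKE_SOURCE)
        .options(**sf_options)   # sfURL, sfUser, sfPassword, sfDatabase, ...
        .option("query", query)  # used instead of "dbtable", not together with it
        .load()
    )
```

For example, `read_query(spark, sf_options, build_join_query("ORDERS", "CUSTOMERS", "CUSTOMER_ID"))` would return the joined rows, with the join itself executed inside Snowflake.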
Thanks a lot bro. Can you also please share the video on loading data from S3 to Snowflake using Lambda and Glue?
Hello Adithya, here I have explained the AWS Glue and Snowflake integration, and in the video below I have explained the S3, Lambda and Glue integration. You can club these together and customize them as per your requirements:
th-cam.com/video/1tIM1jBmwD4/w-d-xo.html
Happy Learning :-)
Awesome. Can you make a video on how to connect to Redshift using PySpark in a similar way?
Hello, this approach does not seem so useful. If I am right, here we are reading the Snowflake table, processing it in Spark and storing the data in Snowflake again. For that we can use Snowflake itself; AWS Glue is extra cost 😅
Hello PACHAPPAGARI MOHAN VAMSI, yes, you are right that this transformation can be done using the compute power of Snowflake alone. This video fundamentally explains how to integrate Snowflake with Spark on the AWS Glue platform, and to explain that I took a dummy transformation. The same concept can be used for other workloads that Snowflake cannot handle on its own. For example, if the data is available in MySQL RDS (source), we can use Spark to read the data from MySQL and then write it to Snowflake (destination). If we want to use AWS Glue as the execution environment in that case, the concepts in this video can be useful.
@KnowledgeAmplifier1 👍
Hi Friend, how can I read data from RDS and ingest it into Snowflake using Glue? Do you have any example for that? It would be really helpful for me. Thanks.
Hi,
I have exactly the same requirement. Could you please help with the process if you have achieved it?
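For the RDS to Snowflake requirement raised here, one possible shape of the Glue (PySpark) code is sketched below, using Spark's generic JDBC reader for a MySQL RDS source. It assumes the MySQL Connector/J jar and the Snowflake connector jars are attached to the job; the helper names, host and table names are hypothetical.

```python
def mysql_jdbc_options(host, port, database, table, user, password):
    """Build the options for Spark's generic JDBC reader against MySQL."""
    return {
        "url": f"jdbc:mysql://{host}:{port}/{database}",
        "dbtable": table,
        "user": user,
        "password": password,
        "driver": "com.mysql.cj.jdbc.Driver",  # driver class of MySQL Connector/J 8.x
    }


def copy_mysql_table_to_snowflake(spark, jdbc_options, sf_options, target_table,
                                  mode="overwrite"):
    """Read one MySQL table over JDBC and write it to a Snowflake table."""
    df = spark.read.format("jdbc").options(**jdbc_options).load()
    (
        df.write
        .format("net.snowflake.spark.snowflake")
        .options(**sf_options)   # sfURL, sfUser, sfPassword, sfDatabase, ...
        .option("dbtable", target_table)
        .mode(mode)              # "overwrite" replaces the existing target table
        .save()
    )
    return df
```

Any transformation would go between the read and the write, on `df`. The Glue job also needs network access to the RDS instance, for example by running it in the same VPC.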
Thank you very much for this video
Could you please do an example with Oracle and Python?
👌👌👌👌👌👌👌👌
Thank You 😄