How to get the requirements file for Python version 3.10?
@prakashmudliyar4834, you can refer to this link to get the requirements file for different Python versions -- github.com/snowflakedb/snowflake-connector-python/tree/main/tested_requirements
Can you do it for Dynatrace?
It's very helpful. Is there any video which covers Glue ETL to Snowflake?
Hello Ranadeep, sorry for the late reply. You can refer to this video if you want to work with Spark & Snowflake in AWS Glue:
th-cam.com/video/7c6kcRKDxgQ/w-d-xo.html
And for connecting Snowflake with Python Shell Jobs in AWS Glue, you can refer to this --
th-cam.com/video/OJM2IkcIW_o/w-d-xo.html
Hope this will be helpful :-)
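In case it helps, here is a rough sketch of how a Glue Spark job can read a Snowflake table with the Spark-Snowflake connector. This is only an outline, not the exact code from the video -- it assumes the Snowflake Spark connector and JDBC driver JARs are attached to the Glue job, and the account/user/password values are placeholders (ideally you would pull them from Secrets Manager):

from pyspark.context import SparkContext
from awsglue.context import GlueContext

glue_context = GlueContext(SparkContext.getOrCreate())
spark = glue_context.spark_session

# Connection options for the Spark-Snowflake connector (placeholder values)
sf_options = {
    "sfURL": "<your_account>.snowflakecomputing.com",
    "sfUser": "<user>",
    "sfPassword": "<password>",
    "sfDatabase": "DEMO",
    "sfSchema": "PUBLIC",
    "sfWarehouse": "COMPUTE_WH",
}

# Read a table through the connector and take a quick look at it
df = (spark.read
      .format("net.snowflake.spark.snowflake")
      .options(**sf_options)
      .option("dbtable", "HEALTHCARE_CSV")
      .load())
df.show(10)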
Do you have references, or have you worked on such an activity?
Any tutorial or code snippet for -
Reading data through a REST API, flattening the JSON and then loading it into Snowflake.
I will be using AWS Lambda and AWS Secrets Manager to store the key and password.
This side of AWS is new to me. Any help will really help me.
Hello abhishek kumar gupta, yes, I have uploaded some videos on the scenario which you explained ... there are 2 videos which you can go through, and then you can build the pipeline --
Step 1:
------------
Reading data through a REST API -- for this you will be using AWS Lambda and AWS Secrets Manager to store the key and password. Reference video (I used a weather API to pull weather data and stored the API key in Secrets Manager, so from this video you will have a clear idea of the REST API call, Secrets Manager integration with AWS Lambda etc.; a small sketch is also below) -- th-cam.com/video/xa2D4Hgjd9g/w-d-xo.html&feature=shares
Step 2:
------------
Flattening the JSON and then loading into Snowflake -- you can do this in Lambda code and write into a table using the Snowflake connector ... but we can leverage the power of Snowflake's schema-on-read and use Snowflake for the flattening as well, you can check this -- th-cam.com/video/ON-PU_buvFU/w-d-xo.html&feature=shares
Hope this will be helpful.. Happy Learning!
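Just to make Step 1 a bit more concrete, a minimal sketch of the Lambda side could look like the below -- the secret name, API URL and landing bucket are hypothetical placeholders, not the exact ones used in the video:

import json
import urllib.request

import boto3

def lambda_handler(event, context):
    # Pull the API key stored in Secrets Manager (secret name is a placeholder)
    secret = boto3.client("secretsmanager").get_secret_value(SecretId="weather_api_key")
    api_key = json.loads(secret["SecretString"])["api_key"]

    # Call the REST API (URL and parameters are placeholders for whichever API you use)
    url = "https://api.example.com/weather?q=London&appid=" + api_key
    with urllib.request.urlopen(url) as resp:
        payload = resp.read()

    # Land the raw JSON in S3 as-is; Snowflake can flatten it later using schema-on-read
    boto3.client("s3").put_object(
        Bucket="my-raw-landing-bucket",
        Key="weather/raw_weather.json",
        Body=payload,
    )
    return {"statusCode": 200}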
Tried a lot but continuously getting this error --> [ERROR] Runtime.ImportModuleError: Unable to import module 'lambda_function': /lib64/libc.so.6: version `GLIBC_2.28' not found (required by /var/task/cryptography/hazmat/bindings/_rust.abi3.so) Traceback (most recent call last):
Hey I am getting this same exact error. Were you able to figure it out?
@@gtrace1910 No, I guess we have to use Snowpipe now... this video is outdated
@@prakashmudliyar4834 the other option is the REST API
good one
Thank You Keshava Mugulur Srinivas Iyengar! Happy Learning :-)
Will this work for billions of rows (20 GB)? How to handle the Lambda timeout (15 min) and the S3 5 GB limitation?
Hello Shubhi, actually a good question ...
Regarding the Lambda timeout (15 min) --
see, the copy into command is a powerful one, it can load a good volume too... but there is Lambda's time constraint; for that, you can try the below permutations and combinations --
copy into command in parallel fashion -- interworks.com/blog/2020/03/04/zero-to-snowflake-multi-threaded-bulk-loading-with-python/
Second option -- AWS Glue; all you need to do is create a trigger from Lambda to run Glue (a small sketch of the Lambda-to-Glue trigger is below). How to connect Glue and Snowflake you can get from here -- th-cam.com/video/7c6kcRKDxgQ/w-d-xo.html
Third option -- if you want to process very big data based on this kind of Lambda trigger, you can go with a transient EMR cluster -- th-cam.com/video/ETO_FFhzNic/w-d-xo.html
And for the S3 GB limitation --
you can apply partitioning and split your data into multiple S3 files and try 😊
Hope this will be helpful!
Happy Learning :-)
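For the second option, the Lambda itself stays tiny -- it only starts the Glue job and passes the file details, so the 15-minute limit is never a problem. A minimal sketch (the Glue job name and argument keys here are hypothetical):

import boto3

def lambda_handler(event, context):
    # Pick up which file landed in S3 from the trigger event
    bucket = event['Records'][0]['s3']['bucket']['name']
    key = event['Records'][0]['s3']['object']['key']

    # Hand the heavy loading over to a Glue job instead of doing it inside Lambda
    glue = boto3.client("glue")
    response = glue.start_job_run(
        JobName="snowflake_load_job",
        Arguments={"--s3_bucket": bucket, "--s3_key": key},
    )
    return {"JobRunId": response["JobRunId"]}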
@@KnowledgeAmplifier1 thank you so much for responding so fast. Will try this definitely and let you know
It was very helpful. Thanks a lot.
Glad to hear that it was helpful Rupam Pathak! Happy Learning :-)
@@KnowledgeAmplifier1 hi sir, how are you passing 1.csv and 2.csv from the code? Like, we need to mention them in the event, right? But you didn't show that part. Can you please share that?
Hi team, I'm getting this error: {
"errorMessage": "'Records'",
"errorType": "KeyError",
"stackTrace": [
" File \"/var/task/lambda_function.py\", line 33, in lambda_handler
for record in event['Records']:
"
]
}
can you fix it?
Hi Team,
What is the purpose of the below line? Please advise.
s3_file_key = event['Records'][0]['s3']['object']['key'];
Hello Gershom NC, that code basically tells you due to which file the event was created or Lambda got triggered ... currently in this pipeline, that is not required .. you can remove that
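For anyone else wondering, this is roughly the shape of the S3 event that line is digging into (bucket and file names below are just example values):

# Trimmed-down example of the event Lambda receives from an S3 trigger
sample_event = {
    "Records": [
        {
            "s3": {
                "bucket": {"name": "my-landing-bucket"},
                "object": {"key": "healthcare/1.csv"},
            }
        }
    ]
}

s3_file_key = sample_event['Records'][0]['s3']['object']['key']
print(s3_file_key)  # -> healthcare/1.csv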
Hi
I am unable to implement auto-ingestion through the Lambda function.
Below are the errors.
[ERROR] Runtime.ImportModuleError: Unable to import module 'lambda_function': No module named '_cffi_backend'
Traceback (most recent call last):
[ERROR] Runtime.ImportModuleError: Unable to import module 'lambda_function': No module named 'lambda_function'
Traceback (most recent call last):
I followed the steps as shown above. My Python version is 3.10.
Below are the steps I followed:
Step1:
import snowflake.connector as sf

def run_query(conn, query):
    cursor = conn.cursor()
    cursor.execute(query)
    cursor.close()

def lambda_handler(event, context):
    s3_file_key = event['Records'][0]['s3']['object']['key']
    user = "aditya4u"
    password = ""
    account = ""
    database = "DEMO"
    warehouse = "COMPUTE_WH"
    schema = "PUBLIC"
    role = "ACCOUNTADMIN"
    conn = sf.connect(user=user, password=password, account=account)
    statement_1 = 'use warehouse ' + warehouse
    statement3 = "use database " + database
    statement4 = "use role " + role
    run_query(conn, statement_1)
    run_query(conn, statement3)
    run_query(conn, statement4)
    sql_query = "copy into demo.PUBLIC.HEALTHCARE_CSV from @demo.PUBLIC.snow_simple FILE_FORMAT=(FORMAT_NAME=my_csv_format)"
    run_query(conn, sql_query)
Step2:
I saved the file as "lambda_function.py"
Step3: I copied the file into the Deployment_zip folder and zipped the files.
Step4: I uploaded the zip file to Lambda from the S3 bucket.
It is only compatible with Python version 3.8; make sure you install the correct version of the Snowflake connector for your Python version. Also, a better option would be to create a layer with those dependencies and keep the code in the Code tab on AWS Lambda.
what is the other guy in the background doing?
Hello eedris, different Python modules are specific to particular Python versions, so if you use the same versions of the Snowflake and other dependent packages mentioned in the video in an environment with a different Python version, the code will not work. That might be the reason for your error.