This is exactly what I was looking for to complete an assignment. Thank you for the good work. #Stayblessed
Glad it was helpful Observatoire Libre des Banques Africaines! Happy Learning :-)
Very nice. Good that you did not give up and finally got it to work.
Yeah thanks paracha3! Happy Learning :-)
Very nice demo. Keep making good and informative videos.
Thank you nabarun chakraborti for your kind words ! Happy Learning :-)
Why do we use boto3.client in the beginning, and boto3.resource for the second part? Could you please clarify? Thanks.
Excellent Explanation
Thank you nadia nizam! Happy Learning
Great video, but I am having trouble using WSL on my PC. Is there a way to create the zip file with the Lambda Python code and all its dependencies without WSL?
Hello H ghar, you can create a Lambda layer using EC2 if you want an alternative to WSL for building the deployment zip; for that, you can refer to this video --
th-cam.com/video/0Q4yV7Hb7Vs/w-d-xo.html
Hope this will be helpful!
Happy Learning :-)
@@KnowledgeAmplifier1 I did this task using Lambda layers, but I am getting this error when I test my function:
Test Event Name
lamdatestevent
Response
{
"errorMessage": "'Records'",
"errorType": "KeyError",
"stackTrace": [
" File \"/var/task/lambda_function.py\", line 6, in lambda_handler
s3_file_key = event['Records'][0]['s3']['object']['key'];
"
]
}
What am I doing wrong?
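For context, that KeyError usually means the function was invoked with an event that does not have the S3 notification shape -- for example the default console test event. A minimal sketch of an S3-style test event (bucket and key names are placeholders, not from the video) covering just the fields the handler reads:

s3_test_event = {
    "Records": [
        {
            "s3": {
                "bucket": {"name": "my-source-bucket"},   # placeholder bucket name
                "object": {"key": "input/sample.csv"}     # placeholder object key
            }
        }
    ]
}

Real S3 notifications carry more fields, but this is the subset the handler accesses; uploading a file to the source bucket triggers the function with the full payload.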
Could you please help me with how to split a large file (6 GB) from one S3 bucket into multiple files while transferring it to another S3 bucket, and then move it to Snowflake?
S3 (source with large file) -> Lambda function (split and move) -> S3 (destination) -> Snowpipe -> Snowflake
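In case it helps, here is a minimal sketch of the split step, assuming a line-based file such as CSV and made-up bucket names; it streams the object from the source bucket and re-uploads it as smaller chunks that Snowpipe can ingest. For a 6 GB file you would also need to keep the chunk size and the Lambda timeout and memory limits in mind.

import boto3

s3 = boto3.client('s3')

SOURCE_BUCKET = 'my-source-bucket'       # assumption: replace with your source bucket
TARGET_BUCKET = 'my-destination-bucket'  # assumption: replace with your destination bucket
LINES_PER_FILE = 500_000                 # tune so each chunk fits comfortably in memory

def split_and_copy(source_key):
    # stream the large object instead of loading all 6 GB at once
    body = s3.get_object(Bucket=SOURCE_BUCKET, Key=source_key)['Body']
    buffer, part = [], 0
    for line in body.iter_lines():
        buffer.append(line)
        if len(buffer) >= LINES_PER_FILE:
            part += 1
            flush(buffer, source_key, part)
            buffer = []
    if buffer:                           # write the last partial chunk
        part += 1
        flush(buffer, source_key, part)

def flush(lines, source_key, part):
    # each chunk lands in the destination bucket, where Snowpipe picks it up
    s3.put_object(
        Bucket=TARGET_BUCKET,
        Key=f'{source_key}.part{part:04d}.csv',
        Body=b'\n'.join(lines) + b'\n',
    )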
Love you man❤
Could you please explain the Snowflake architecture with Lambda functions, for interview purposes?
Thanks for the tutorial. Do you know how to handle updates and deletes in Snowflake tables? I know streams are used to do it, but I can't find an example from a real application, like loading from a database or AWS S3 into Snowflake, that handles updates and deletes.
I am trying to push application data from Postgres (where data can be inserted, updated and deleted) into Snowflake.
Here are the steps. You need Snowpipe, a stream and a task to set it up.
1. Consider a table t1_raw. The table will be loaded using Snowpipe.
2. Create a stream t1_stream on top of table t1_raw. What it does is: as soon as there are new records in t1_raw, the stream t1_stream will have those records. For example, if t1_raw has 1000 records as of today and you load 20 new records into t1_raw, those 20 new records will also appear in t1_stream.
3. Now you have another table t2_modelled. Use the values from t1_stream to update the table t2_modelled. And in case you want to automate the process, use a task to update the records in t2_modelled. Pseudocode is below; note the SYSTEM$STREAM_HAS_DATA function. (A MERGE sketch that also covers updates and deletes follows the task.)
CREATE TASK mytask1
WAREHOUSE = mywh
SCHEDULE = '5 minute'
WHEN
SYSTEM$STREAM_HAS_DATA('t1_stream')
AS
INSERT INTO t2_modelled(id,name) SELECT id, name FROM t1_stream WHERE METADATA$ACTION = 'INSERT';
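The task above only covers inserts. A sketch of a MERGE that also applies the updates and deletes captured by the stream (table and column names follow the example above; adjust to your schema):

MERGE INTO t2_modelled AS tgt
USING (
    -- drop the DELETE half of each update so every id appears only once
    SELECT * FROM t1_stream
    WHERE NOT (METADATA$ACTION = 'DELETE' AND METADATA$ISUPDATE = TRUE)
) AS src
ON tgt.id = src.id
WHEN MATCHED AND src.METADATA$ACTION = 'DELETE' THEN
    DELETE
WHEN MATCHED AND src.METADATA$ACTION = 'INSERT' AND src.METADATA$ISUPDATE = TRUE THEN
    UPDATE SET tgt.name = src.name
WHEN NOT MATCHED AND src.METADATA$ACTION = 'INSERT' THEN
    INSERT (id, name) VALUES (src.id, src.name);

The same statement can be placed in the task body in place of the plain INSERT.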
What's the difference between the S3 client and resource?
Hello Ravi K Reddy, you can refer to this -- stackoverflow.com/questions/42809096/difference-in-boto3-between-resource-client-and-session
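In short, the client is the low-level interface that mirrors the S3 API calls and returns plain dicts, while the resource is a higher-level, object-oriented wrapper over the same operations. A small illustration (bucket and key names are placeholders):

import boto3

s3_client = boto3.client('s3')
s3_resource = boto3.resource('s3')

# client style: explicit API call, response is a plain dict
response = s3_client.get_object(Bucket='my-bucket', Key='data/file.csv')
data = response['Body'].read()

# resource style: work with Bucket/Object objects instead of raw calls
obj = s3_resource.Object('my-bucket', 'data/file.csv')
data = obj.get()['Body'].read()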
@@KnowledgeAmplifier1 Thank you so much !! This helps
You always rock! Proud of you.
Thank You Sir :-)
Thanks a lot, bro. Your content really has a lot of good stuff and many people benefit from it. May I know the playlist name for the AWS and Snowflake related content?
Hello Sukumar , you can check this playlist for Data Engineering with AWS & Snowflake -- th-cam.com/play/PLjfRmoYoxpNopPjdACgS5XTfdjyBcuGku.html
Hope this will be helpful! Happy Learning :-)
Hi,
I tried the below steps to copy a file from one S3 bucket to another, but when I finally upload the file to the source bucket, it is not getting copied to the destination bucket.
Step1:
Created two S3 buckets.
Source_bucket
Target_bucket.
Step2:
Created a role (Lambda-based) with "S3 full access"
Step3:
Created lambda function with below parameters
Runtime
Python 3.8
Execution Role
lamdbas3role (which I newly created)
Step4:
I created a trigger with the below parameters
Location: s3
Bucket Name: source_bucket
Event Type: All objects event type
And finally enabled "Recursive Invocation" and then clicked on "Add"
Step 5:
I clicked on Code and entered the below code.
import json
import boto3
import urllib.parse

print('Loading function')

s3 = boto3.client('s3')

def lambda_handler(event, context):
    # TODO implement
    bucket = event['Records'][0]['s3']['bucket']['name']
    key = urllib.parse.unquote_plus(event['Records'][0]['s3']['object']['key'], encoding='utf-8')
    target_bucket = 'lambdatargetbucket2022'
    copy_source = {'Bucket': source_bucket, 'Key': object_key}
    print ("Source bucket : ", bucket)
    print ("Target bucket : ", target_bucket)
    print ("Log Stream name: ", context.log_stream_name)
    print ("Log Group name: ", context.log_group_name)
    print ("Request ID: ", context.aws_request_id)
    print ("Mem. limits(MB): ", context.memory_limit_in_mb)
    try:
        print ("Using waiter to waiting for object to persist through s3 service")
        waiter = s3.get_waiter('object_exists')
        waiter.wait(Bucket=source_bucket, Key=object_key)
        s3.copy_object(Bucket=target_bucket, Key=object_key, CopySource=copy_source)
        return response['ContentType']
    except Exception as err:
        print ("Error -"+str(err))
        return e
Step 6:
When I finally save and test the code, I am getting the below error.
{
"errorMessage": "'Records'",
"errorType": "KeyError",
"stackTrace": [
" File \"/var/task/lambda_function.py\", line 11, in lambda_handler
bucket = event['Records'][0]['s3']['bucket']['name']
"
]
}
Function Logs
START RequestId: 0e41efd6-9a71-4348-9d09-f63d6fc5723e Version: $LATEST
[ERROR] KeyError: 'Records'
Traceback (most recent call last):
File "/var/task/lambda_function.py", line 11, in lambda_handler
bucket = event['Records'][0]['s3']['bucket']['name']
END RequestId: 0e41efd6-9a71-4348-9d09-f63d6fc5723e
REPORT RequestId: 0e41efd6-9a71-4348-9d09-f63d6fc5723e Duration: 1.97 ms Billed Duration: 2 ms Memory Size: 128 MB Max Memory Used: 68 MB Init Duration: 405.42 ms
And also the file is not getting copied to the other S3 bucket (target).
Can you please help me find where I made the mistake?
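For comparison, here is a sketch of the handler with the undefined names resolved (source_bucket, object_key, response and e are never defined in the code above, while bucket and key from the first two lines are never used). The KeyError itself comes from testing with the default console event, which has no 'Records' key; uploading a file to the source bucket, or using an S3 put test event, invokes the function with the expected payload:

import urllib.parse
import boto3

s3 = boto3.client('s3')

def lambda_handler(event, context):
    # the S3 trigger delivers the bucket and key inside event['Records']
    source_bucket = event['Records'][0]['s3']['bucket']['name']
    object_key = urllib.parse.unquote_plus(
        event['Records'][0]['s3']['object']['key'], encoding='utf-8')
    target_bucket = 'lambdatargetbucket2022'
    copy_source = {'Bucket': source_bucket, 'Key': object_key}
    try:
        # wait until the object is visible, then copy it to the target bucket
        waiter = s3.get_waiter('object_exists')
        waiter.wait(Bucket=source_bucket, Key=object_key)
        s3.copy_object(Bucket=target_bucket, Key=object_key, CopySource=copy_source)
        return {'status': 'copied', 'key': object_key}
    except Exception as err:
        print('Error - ' + str(err))
        raise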
Thanks a lot !
You are welcome Ferruccio Guicciardi! Happy Learning :-)
Thanks for the amazing tutorials
Glad you like them BRIGHT SPARK! Happy Learning :-)
Sir, is there any way we can ask you for help on a different concept? Will you please provide your email ID for the same? Thank you.
Sir, a separate playlist for Snowflake would help us.
Hello Raddy , please check this below link --
doc.clickup.com/37466271/d/h/13qc4z-104/d4346819bd8d510
Thank you so much, sir.
Hardcoding the access_key_id and secret_access_key is a very bad practice. I know this is just a demo, and it is OK for a demo, but in an actual project this should be avoided. Instead, use an AWS role and grant the necessary permissions.
Hello Subhamay, very valid point. I only did it this way for the demo; it is always better to use AWS Secrets Manager or KMS for storing secret credentials, or to use IAM for creating the external stage. I showed it like this because too many concepts might confuse and deviate from the original topic. If you want to build a secure system with KMS, you can refer to this video --
th-cam.com/video/mBoxHTa8x-w/w-d-xo.html
and for Snowflake stage creation using an AWS assume role, you can refer to this video --
th-cam.com/video/Mje7AEpxsLA/w-d-xo.html
Happy Learning :-)
Can't stop laughing when you said "I'm showing this and will delete it later" 🤣🤣🤣
😄
The topics are advanced, but the explanation is idiot-proof.