ETL From AWS S3 to Amazon Redshift with AWS Lambda dynamically.

  • Published on 14 Dec 2024

Comments • 110

  • @NoahZhang-x9i
    @NoahZhang-x9i 7 months ago +1

    Is there a Redshift Serverless version of this tutorial?

    • @cloudquicklabs
      @cloudquicklabs 7 months ago

      Thank you for watching my videos.
      Please check my latest video on Redshift, where I have used Serverless.

  • @nerellakoumudi1293
    @nerellakoumudi1293 1 year ago +2

    Hello, how do I get that S3 key? Please clarify, I am a little bit confused.

    • @cloudquicklabs
      @cloudquicklabs 1 year ago +1

      Thank you for watching my videos.
      Those are the IAM user access key and secret; please check the description for the shared files.

  • @sindhurraju7733
    @sindhurraju7733 1 year ago +1

    Thank you for the video.
    I am facing one issue.
    I am unable to see host, db-name, pwd, and table-name in the Lambda environment variables (28:44).
    Did I miss anything? Please help.

    • @cloudquicklabs
      @cloudquicklabs 1 year ago

      Thank you for watching my videos.
      Did you check the code file that I shared in the description? It has logic to read the environment variables, and you need to set those variables as well.
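
A minimal sketch of how a handler could read its settings from Lambda environment variables and fail early when one is unset. The variable names below are hypothetical; use whatever names the shared code file actually reads, and set the same names in the function's environment variable configuration.

```python
import os

# Hypothetical names: replace with the keys the shared code file reads.
SETTING_NAMES = ("REDSHIFT_HOST", "REDSHIFT_DB", "REDSHIFT_USER",
                 "REDSHIFT_PASSWORD", "REDSHIFT_TABLE")


def read_settings(environ=os.environ):
    """Return the settings, raising a clear error if any variable is unset."""
    missing = [name for name in SETTING_NAMES if not environ.get(name)]
    if missing:
        raise RuntimeError("Unset environment variables: " + ", ".join(missing))
    return {name: environ[name] for name in SETTING_NAMES}
```

Calling `read_settings()` at the top of the handler turns a missing variable into an explicit message in the logs instead of a later connection failure.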

  • @himanshushukla6520
    @himanshushukla6520 11 months ago +1

    The Lambda is in a VPC and S3 is outside the VPC, so my code is giving a timeout error. What should I do?

    • @cloudquicklabs
      @cloudquicklabs 11 months ago

      Thank you for watching my videos.
      You may need to check whether the VPC endpoint for the S3 bucket is enabled here, and also make sure that the AWS Lambda function is integrated with the VPC.

  • @AmolModhekar
    @AmolModhekar 9 months ago +1

    How did you generate the AWS secret and access key, and how do you use them in the Lambda function with a role?

    • @cloudquicklabs
      @cloudquicklabs 9 months ago

      Thank you for watching my videos.

  • @AmolModhekar
    @AmolModhekar 9 months ago +1

    How did you create the role, the access key, the secret key, and all?

    • @cloudquicklabs
      @cloudquicklabs 9 months ago

      Thank you for watching my videos.

  • @njt3773
    @njt3773 1 year ago +1

    Is the CSV file not uploaded to the GitHub repo?

    • @cloudquicklabs
      @cloudquicklabs 1 year ago

      Thank you for watching my videos.
      In this video, CSV files are uploaded directly to the S3 bucket.

    • @njt3773
      @njt3773 1 year ago

      For anyone watching the video, you just have to upload any csv file with 3 columns to the s3 bucket. No special format needed. You're welcome :)
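
As a sketch of the point above, a three-column CSV for testing can be generated with the standard library. The column values are made up for illustration, and no header row is written because a plain load would otherwise treat it as data.

```python
import csv
import io


def sample_csv(rows=3):
    """Return CSV text with three columns per row and no header."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for i in range(1, rows + 1):
        # Made-up values; any three columns matching the table will do.
        writer.writerow([i, f"name{i}", f"city{i}"])
    return buffer.getvalue()
```

Write the returned text to a `.csv` file and upload it to the bucket to trigger the function.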

  • @onkarnarayanpure4824
    @onkarnarayanpure4824 1 year ago +1

    Is this an incremental data load into the table?

    • @cloudquicklabs
      @cloudquicklabs 1 year ago

      Thank you for watching my videos.
      This loads the data from each new file that is generated on the S3 side every time, but it is not an incremental load.

  • @ajaydulange688
    @ajaydulange688 2 years ago +1

    Got confused between IAM role and IAM user... can you please help?

    • @cloudquicklabs
      @cloudquicklabs 2 years ago

      Thank you for watching my videos.
      Here the IAM role is the service role for the AWS Lambda function, which gets invoked when a file is uploaded to the S3 bucket. The IAM user credentials are for the SQL 'COPY' command that copies data from the S3 bucket to the Amazon Redshift cluster table. Watch again; I hope you get this.
      Happy learning.
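
To illustrate the split described above: the Lambda role only lets the function run, while the COPY statement executes inside Redshift and carries its own S3 authorization. The sketch below builds such a statement with either IAM user keys (the approach described in this thread) or an `IAM_ROLE` clause, which Redshift's COPY also accepts. The function name, table, bucket, and ARN are hypothetical, and the values are assumed to come from trusted configuration rather than user input.

```python
def copy_statement(table, s3_uri, iam_role_arn=None,
                   access_key=None, secret_key=None):
    """Build a Redshift COPY statement for a CSV object in S3.

    Pass either iam_role_arn (a role attached to the cluster) or an
    IAM user's access_key and secret_key.
    """
    if iam_role_arn:
        auth = f"IAM_ROLE '{iam_role_arn}'"
    elif access_key and secret_key:
        auth = f"ACCESS_KEY_ID '{access_key}' SECRET_ACCESS_KEY '{secret_key}'"
    else:
        raise ValueError("Provide iam_role_arn, or both access_key and secret_key")
    return f"COPY {table} FROM '{s3_uri}' {auth} CSV;"
```

Either way, the identity named in the statement is the one that needs read access to the bucket, not the Lambda role.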

  • @MrMadmaggot
    @MrMadmaggot 1 year ago +1

    This works with an S3 bucket, but what about an S3 folder? I mean a folder inside the bucket.

    • @cloudquicklabs
      @cloudquicklabs 1 year ago

      Thank you for watching my videos.
      As it is event driven, it should work in that case as well. If there are any errors, they could be handled in code. Let me know if you face any errors in these cases.

    • @MrMadmaggot
      @MrMadmaggot 1 year ago

      @@cloudquicklabs Yes, it does work. Cool ETL. Do you have any videos about harder ETL, like IRL jobs?

  • @xiaoqizhou7740
    @xiaoqizhou7740 1 year ago +1

    I followed the same steps, but it seems that when I upload a CSV to S3, the trigger does not invoke the Lambda function. What could be the reason?

    • @cloudquicklabs
      @cloudquicklabs 1 year ago

      Thank you for watching my videos.
      The following could be the reasons.
      1. Check whether the trigger for S3 bucket object creation has been added.
      2. Check whether the Lambda role has a trust relationship with the AWS Lambda service.

    • @xiaoqizhou7740
      @xiaoqizhou7740 1 year ago +1

      @@cloudquicklabs I will double check. Thanks much. I also got an error saying psycopg2 does not have the attribute connect while testing the Lambda. I am using the Python 3.7 runtime; I am not sure what caused this issue.

    • @cloudquicklabs
      @cloudquicklabs 1 year ago

      You need to create a Lambda layer. I have attached the zip file of the psycopg2 module; please create a Lambda layer from it.

    • @xiaoqizhou7740
      @xiaoqizhou7740 1 year ago +1

      @@cloudquicklabs I did create the lambda layer using Python runtime 3.7

    • @xiaoqizhou7740
      @xiaoqizhou7740 1 year ago

      @@cloudquicklabs Finally I successfully loaded the data from S3 to the Redshift table. However, print(curs.fetchmany(3)) gives [ERROR] ProgrammingError: no results to fetch, even though the data is available in the Redshift table. What could be the reason?
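
One likely explanation, offered as a sketch rather than a diagnosis of the shared code: psycopg2 raises `ProgrammingError: no results to fetch` when the last executed statement produced no result set, and a COPY loads rows without returning any. Running a SELECT after the COPY gives the cursor something to fetch. The helper name and table are hypothetical, and the table name is assumed to come from trusted configuration.

```python
def copy_then_preview(cursor, copy_sql, table, limit=3):
    """Run a COPY, then query the table so there is a result set to fetch."""
    cursor.execute(copy_sql)  # loads rows; produces no result set
    cursor.execute(f"SELECT * FROM {table} LIMIT {int(limit)};")
    return cursor.fetchall()
```

Remember to commit on the connection (or enable autocommit) so the loaded rows persist after the function returns.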

  • @twilightcloudcoderz-tcc3441
    @twilightcloudcoderz-tcc3441 1 year ago +1

    Thanks, could you make a video with AWS Redshift Serverless?

    • @cloudquicklabs
      @cloudquicklabs 1 year ago +1

      Thank you for watching my videos.
      Sure, I shall create a video on it.

    • @twilightcloudcoderz-tcc3441
      @twilightcloudcoderz-tcc3441 1 year ago

      @@cloudquicklabs thanks in advance waiting ✋️ 😊

  • @AmolModhekar
    @AmolModhekar 9 months ago +1

    Please upload a video on your Lambda and Redshift setup.

    • @cloudquicklabs
      @cloudquicklabs 9 months ago

      Thank you for watching my videos. I shall create a video on this.

  • @umangsinghal9320
    @umangsinghal9320 2 years ago +1

    When uploading a new file, the table is not being updated with the values from the new file. Can you please help?

    • @cloudquicklabs
      @cloudquicklabs 2 years ago

      Thank you for watching my videos.
      Can you please check whether the Lambda is getting triggered when you upload the file to the S3 bucket?
      There you could figure out the reason the data is not being loaded into the Redshift cluster. The error log might help me to help you.

  • @sujithashanmugasundaram5063
    @sujithashanmugasundaram5063 2 years ago +4

    Thank you for the video. Can you please let me know how we can change the data type? My table has other data types, whereas my CSV file gives only VARCHAR. It would be helpful if you showed a demo on a table with data types other than VARCHAR.

    • @cloudquicklabs
      @cloudquicklabs 2 years ago +3

      Thank you for watching my videos.
      Sure, I shall create a new video on it ASAP; stay tuned.

  • @vaibhavverma1340
    @vaibhavverma1340 1 year ago +1

    I am getting an error while testing the code in the Lambda function:
    {
    "errorMessage": "'Records'",
    "errorType": "KeyError",
    "stackTrace": [
    " File \"/var/task/lambda_function\", line 7, in lambda_handler
    for record in event['Records']:
    "
    ]
    }

    • @cloudquicklabs
      @cloudquicklabs 1 year ago +1

      Thank you for watching my video.
      It is a dictionary key mismatch issue. First print the event, check the dictionary contents, and adjust the code accordingly.

    • @vaibhavverma1340
      @vaibhavverma1340 1 year ago

      @@cloudquicklabs Print the event:
      import os
      def lambda_handler(event, context):
          print(event)
      When I print the event with the above code, I get this:
      Test Event Name
      test
      Response
      null
      Function Logs
      START RequestId: 753eb791-2312-447a-bf06-3becf8f90926 Version: $LATEST
      {'key1': 'value1', 'key2': 'value2', 'key3': 'value3'}
      END RequestId: 753eb791-2312-447a-bf06-3becf8f90926
      REPORT RequestId: 753eb791-2312-447a-bf06-3becf8f90926 Duration: 1.69 ms Billed Duration: 2 ms Memory Size: 128 MB Max Memory Used: 36 MB Init Duration: 109.30 ms
      Request ID
      753eb791-2312-447a-bf06-3becf8f90926
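
The logged event above (`key1`, `key2`, `key3`) is the console's default hello-world test template, which has no `Records` key, so the `KeyError` comes from the test input rather than from S3. A minimal sketch of an S3-shaped test event and a parser that reports the mismatch clearly follows; the bucket and key values are placeholders, and a real S3 event carries many more fields.

```python
# Trimmed-down S3 "object created" event; values are placeholders.
SAMPLE_S3_EVENT = {
    "Records": [
        {"s3": {"bucket": {"name": "example-bucket"},
                "object": {"key": "data/sample.csv"}}}
    ]
}


def extract_objects(event):
    """Return (bucket, key) pairs, or raise if the event is not S3-shaped."""
    records = event.get("Records")
    if not records:
        raise ValueError("Event has no 'Records'; test with an S3 Put event, "
                         "not the default hello-world template")
    return [(r["s3"]["bucket"]["name"], r["s3"]["object"]["key"])
            for r in records]
```

In the console, choosing an S3 Put template for the test event (or pasting a structure like `SAMPLE_S3_EVENT`) exercises the same code path as a real upload.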

  • @meenakrishnan1251
    @meenakrishnan1251 7 months ago +1

    I don't see Python 3.7, so I selected Python 3.8. I get this error: [ERROR] Runtime.ImportModuleError: Unable to import module 'lambda_function': No module named 'psycopg2._psycopg' Traceback (most recent call last):

    • @cloudquicklabs
      @cloudquicklabs 7 months ago

      Thank you for watching my videos.
      I am creating a new video on this topic where I will use the latest Python version; it will be released next weekend.
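
`No module named 'psycopg2._psycopg'` usually means the `psycopg2` package directory was found but its compiled extension could not be loaded, which commonly happens when the layer was built for a different Python minor version or platform than the function's runtime. The diagnostic below is a sketch for confirming which interpreter is running and whether the import succeeds; the function name is hypothetical.

```python
import importlib
import sys


def psycopg2_status():
    """Report the running Python version and whether psycopg2 imports."""
    info = {"python": f"{sys.version_info.major}.{sys.version_info.minor}"}
    try:
        module = importlib.import_module("psycopg2")
        info["psycopg2"] = getattr(module, "__version__", "unknown")
    except ImportError as exc:
        info["psycopg2"] = f"import failed: {exc}"
    return info
```

Logging `psycopg2_status()` from a bare handler shows whether the runtime version matches the one the layer was built for.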

  • @dikshahirole246
    @dikshahirole246 2 years ago +1

    I have tried to implement the same, but I am getting an error in Lambda. I think my Redshift is not able to connect to S3. How can I figure this out?

    • @cloudquicklabs
      @cloudquicklabs 2 years ago

      Thank you for watching my videos.
      May I know what error you are seeing here?

    • @dikshahirole246
      @dikshahirole246 2 years ago

      @@cloudquicklabs Thanks for the reply. Pasting the error here, which appears after the query:
      [ERROR] InternalError_: S3ServiceException:Access Denied,Status 403,Error AccessDenied,Rid S5RM15AXTZNQ248Z,ExtRid 7K6TdTJUNhNS+fKGY2HgUYjq40HjYsI6Hdes69txVhXESjZkqZM8myV5tQcagVwJh/uvB4564IM=,CanRetry 1
      DETAIL:
      -----------------------------------------------
      error: S3ServiceException:Access Denied,Status 403,Error AccessDenied,Rid S5RM15AXTZNQ248Z,ExtRid 7K6TdTJUNhNS+fKGY2HgUYjq40HjYsI6Hdes69txVhXESjZkqZM8myV5tQcagVwJh/uvB4564IM=,CanRetry 1
      code: 8001
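
The thread does not resolve this error. One common cause of a 403 during COPY is that the identity named in the COPY statement lacks read access to the bucket. As a sketch under that assumption, a read-only policy for that identity could look like the following; the bucket name is a placeholder.

```python
import json


def s3_read_policy(bucket):
    """Return an IAM policy document granting list and read on one bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {"Effect": "Allow",
             "Action": ["s3:ListBucket"],
             "Resource": f"arn:aws:s3:::{bucket}"},
            {"Effect": "Allow",
             "Action": ["s3:GetObject"],
             "Resource": f"arn:aws:s3:::{bucket}/*"},
        ],
    }


if __name__ == "__main__":
    # Print the JSON to paste into the IAM policy editor.
    print(json.dumps(s3_read_policy("example-bucket"), indent=2))
```

Note that `s3:ListBucket` applies to the bucket ARN while `s3:GetObject` applies to the objects under it.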

  • @vishalkaushal4311
    @vishalkaushal4311 2 years ago +1

    What are acess_secrete and acess_key? Are these IAM role creds (if yes, an IAM role for what)?

    • @cloudquicklabs
      @cloudquicklabs 2 years ago

      Thank you for watching my videos.
      Those are the IAM user's acess_secrete and acess_key; the required access is attached to that user.

    • @vishalkaushal4311
      @vishalkaushal4311 2 years ago

      Sorry to sound like a broken record. I am still very confused.
      Basically,
      if I am reading it correctly, we need to create an IAM user with Redshift access and pass the creds for that IAM role to the query string.
      Do we need to supply some special roles to the Lambda itself?
      Does the IAM role need to have some permission over Lambda?

  • @MrChase543
    @MrChase543 2 years ago +1

    How do I provide access_secrets and access_key?

    • @cloudquicklabs
      @cloudquicklabs 2 years ago

      Thank you for watching my videos.
      Here I have set acess_secrete and acess_key as Lambda environment variables.
      You can try following docs.aws.amazon.com/lambda/latest/dg/configuration-envvars.html

  • @jeivah4925
    @jeivah4925 1 year ago

    Unable to import module 'lambda_function': No module named 'psycopg2._psycopg'

    • @jeivah4925
      @jeivah4925 1 year ago +1

      What should we do about it?

    • @cloudquicklabs
      @cloudquicklabs 1 year ago

      Thank you for watching my videos.
      Please check the video description, where I have given the link to the 'psycopg' package for creating the required Lambda layer; this helps to import the module. I encourage you to go through the video again.

  • @dattasaiakash2740
    @dattasaiakash2740 2 years ago +1

    I am getting an error message in the CloudWatch log events after uploading my CSV file to S3, like "[ERROR] InternalError_: The specified S3 prefix 'Weather+readings+Analysis.csv' does not exist".

    • @cloudquicklabs
      @cloudquicklabs 2 years ago

      Please enable a VPC endpoint of type gateway for the S3 service. Then it should work.
      Because the VPC is a private network and S3 is public, they are not able to communicate; I believe adding the endpoint will solve the issue here.
      Do let me know if you are still facing the issue.

    • @dattasaiakash2740
      @dattasaiakash2740 2 years ago +1

      @@cloudquicklabs still getting the same error even after adding endpoints

    • @cloudquicklabs
      @cloudquicklabs 2 years ago

      Okay, it looks like a file name issue. Is it possible to change the file name to 'WeatherReadingAnalysis.csv'? It looks like the spaces in the name in the S3 URL are the issue.

    • @dattasaiakash2740
      @dattasaiakash2740 2 years ago +1

      @@cloudquicklabs Thank you Sir error resolved.

    • @cloudquicklabs
      @cloudquicklabs 2 years ago +1

      Wow.. Happy to know that your issue is resolved.
      Thank you so much for watching my videos.
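
Renaming the file works, and there is also a code-side alternative. S3 event notifications URL-encode the object key, so a file named `Weather readings Analysis.csv` arrives in the event as `Weather+readings+Analysis.csv`. Decoding the key before building the S3 path avoids the "prefix does not exist" error. A minimal sketch, with a hypothetical helper name:

```python
from urllib.parse import unquote_plus


def object_key_from_record(record):
    """Return the decoded object key from one S3 event record."""
    return unquote_plus(record["s3"]["object"]["key"])
```

For example, a record whose key is `Weather+readings+Analysis.csv` decodes to `Weather readings Analysis.csv`, which matches the object actually stored in the bucket.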

  • @poulomisenpolo
    @poulomisenpolo 1 year ago +1

    Thank you for sharing the video, this works. Please let me know how I can ignore the header while using this code, and also how I can load a TSV (tab-separated) file using the code. It would be helpful.

    • @cloudquicklabs
      @cloudquicklabs 1 year ago +1

      Thank you for watching my videos.
      As we are loading the data with Lambda, you can skip the first row in code.
      To load TSV files, first convert them to CSV files, which is easy; then you can follow the same steps mentioned in this video.
      www.google.com/amp/s/www.geeksforgeeks.org/python-convert-tsv-to-csv-file/amp/
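
As an alternative to converting files in code, Redshift's COPY command has options for both cases: `IGNOREHEADER 1` skips the first line, and `DELIMITER '\t'` reads tab-separated input. The sketch below only builds the SQL text; the function name, table, S3 URI, and role ARN are hypothetical, and this is not the approach shown in the video.

```python
def copy_with_options(table, s3_uri, iam_role_arn,
                      tab_separated=False, skip_header=True):
    """Build a COPY statement for CSV or TSV input, optionally skipping a header."""
    options = []
    if tab_separated:
        options.append("DELIMITER '\\t'")  # literal backslash-t in the SQL text
    else:
        options.append("CSV")
    if skip_header:
        options.append("IGNOREHEADER 1")
    return (f"COPY {table} FROM '{s3_uri}' IAM_ROLE '{iam_role_arn}' "
            f"{' '.join(options)};")
```

The returned string can be passed to the cursor's `execute` call in place of the plain CSV statement.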

  • @Mstteluguspiritualaudiobooks
    @Mstteluguspiritualaudiobooks 8 months ago +1

    Thank you so much for sharing this video

    • @cloudquicklabs
      @cloudquicklabs 8 months ago

      Thank you for watching my videos.
      Glad that it helped you.

  • @amanrajput8917
    @amanrajput8917 1 year ago +1

    Hi sir,
    First of all, I would like to say thank you for sharing such a wonderful tutorial video with the code. Great video lecture. Looking forward to seeing some more videos on AWS Glue, EMR, Athena, and EC2 as well.
    I am unable to download the zip file from the git repo. Is there anything you can suggest?

    • @cloudquicklabs
      @cloudquicklabs 1 year ago

      Thank you for watching my videos.
      Indeed, I shall make more videos on the mentioned services.
      You could clone the repo to download the .zip file.

    • @amanrajput8917
      @amanrajput8917 1 year ago

      @@cloudquicklabs thank you sir

  • @shashishekhar8066
    @shashishekhar8066 2 years ago +1

    Could you please provide the sample data CSV file in the GitHub repo?

    • @cloudquicklabs
      @cloudquicklabs 2 years ago

      Thank you for watching my videos.
      Yes, Sharing it soon.
      Please stay tuned.

  • @udaykumar8177
    @udaykumar8177 2 years ago +1

    Thanks for sharing knowledge. Looking forward to see more videos from you. 🙏🙏🙏

    • @cloudquicklabs
      @cloudquicklabs 2 years ago

      Thank you for watching my videos.

  • @HammadKhan-m8u
    @HammadKhan-m8u 1 year ago +1

    Kindly add a Glue job so that the schema can also be figured out automatically.

    • @cloudquicklabs
      @cloudquicklabs 1 year ago

      Thank you for watching my videos.
      I shall create a new video on it soon.

  • @maow35
    @maow35 2 years ago +1

    Thanks for the video. I have the following problem when executing the Lambda function; I get this error: "Unable to import module 'LambdaTest::LambdaTest': No module named 'LambdaTest::LambdaTest'"

    • @cloudquicklabs
      @cloudquicklabs 2 years ago

      Thank you for watching my videos.
      I believe you have created the Lambda as suggested in the video. Please try to recreate the Lambda with the Python 3.7 runtime, and also create the Lambda layer using the same python.zip that I have shared; you must attach it to the Lambda you created.

    • @maow35
      @maow35 2 years ago +1

      @@cloudquicklabs I have done everything as in the video, but it keeps throwing me the same error that the LambdaTest module is missing

    • @cloudquicklabs
      @cloudquicklabs 2 years ago

      Try creating the Lambda first with the 'Author from scratch' option in the AWS console, then copy and paste the code that I have given in the repo link. It should work this way.

    • @maow35
      @maow35 2 years ago +1

      @@cloudquicklabs Found the problem: in the runtime configuration, the handler was wrong. I configured it as in the video and it worked perfectly. I only have the VPC part left. Thank you very much.

    • @cloudquicklabs
      @cloudquicklabs 2 years ago +1

      Happy that it worked for you now.
      Happy learning ahead.
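
For context on the fix above: with a Python runtime, the handler setting takes the form `file_name.function_name`. A value such as `LambdaTest::LambdaTest` follows a different convention, so Python tries to import it as a module name and fails. A minimal sketch of a file that matches the handler value `lambda_function.lambda_handler`; the return value is illustrative only.

```python
# lambda_function.py
# With this file name and function name, the handler setting
# would be "lambda_function.lambda_handler".

def lambda_handler(event, context):
    """Entry point that Lambda calls for each invocation."""
    # Illustrative body: echo the top-level keys of the incoming event.
    return {"received_keys": sorted(event.keys())}
```

Creating the function with 'Author from scratch' sets this handler value by default for Python.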

  • @abhishekchaudhari4070
    @abhishekchaudhari4070 2 years ago +1

    How do I load a JSON file into Redshift using Lambda? Can anyone please tell me?

    • @cloudquicklabs
      @cloudquicklabs 2 years ago

      Thank you for watching my videos.
      In the JSON file case, the Lambda code may have to be changed to download, read, and upload to Redshift.
      May I know what your JSON file size would be?

    • @abhishekchaudhari4070
      @abhishekchaudhari4070 2 years ago

      @@cloudquicklabs Less than 100 MB. I am working on live stream data.

    • @abhishekchaudhari4070
      @abhishekchaudhari4070 2 years ago +1

      If you have code, please share it.

    • @cloudquicklabs
      @cloudquicklabs 2 years ago

      Thank you for coming back on this.
      The code should be very simple in this case. I shall create a new video on this. Please stay tuned.

    • @abhishekchaudhari4070
      @abhishekchaudhari4070 2 years ago

      @@cloudquicklabs Share the link when it's completed.
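
Until that video exists, here is a hedged sketch of one option that avoids parsing rows in Lambda: Redshift's COPY can read JSON directly with `FORMAT AS JSON 'auto'`, matching object keys to column names. COPY loads one row per top-level JSON object, so a file holding a single JSON array may need to be rewritten as one object per line first. The function names, table, S3 URI, and role ARN are hypothetical.

```python
import json


def to_json_lines(text):
    """Convert a JSON array of objects (or one object) to one object per line."""
    items = json.loads(text)
    if isinstance(items, dict):
        items = [items]
    return "\n".join(json.dumps(item) for item in items)


def json_copy_statement(table, s3_uri, iam_role_arn):
    """Build a COPY statement that maps JSON keys to column names."""
    return (f"COPY {table} FROM '{s3_uri}' IAM_ROLE '{iam_role_arn}' "
            f"FORMAT AS JSON 'auto';")
```

With `'auto'`, the JSON keys need to match the table's column names; keys with no matching column are ignored.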

  • @eugeniosp3
    @eugeniosp3 2 years ago +1

    Excellent tutorial.

    • @cloudquicklabs
      @cloudquicklabs 2 years ago

      Thank you for watching my videos.
      Happy learning.

  • @robertomoreiradiniz5480
    @robertomoreiradiniz5480 2 years ago +1

    Thanks!

    • @cloudquicklabs
      @cloudquicklabs 2 years ago

      Thank you for watching my videos.

  • @kush-909
    @kush-909 1 year ago +1

    Nice effort. But very unorganized flow. It would have been at best 20 minute video if you would have spoken properly and in the correct chronology.

    • @cloudquicklabs
      @cloudquicklabs 1 year ago

      Thank you for watching my videos.
      I shall take this valuable feedback and work on it.

    • @kush-909
      @kush-909 1 year ago +1

      @@cloudquicklabs Please don't mind, brother. I know it takes a lot of time to prepare, record and edit. What you are explaining is very resourceful.
      But, I guess, instead of explaining things on the fly, it is better to script so that you can minimize the words and don't jump between topics. This was my experience.
      I wanted to watch and learn but could not maintain that attention span.

    • @cloudquicklabs
      @cloudquicklabs 1 year ago

      Thank you very much for this valuable input. I shall adopt these suggestions in my next videos.

  • @maybenew7293
    @maybenew7293 1 year ago +1

    I CANNOT UNDERSTAND ANYTHING YOU SAY!

    • @cloudquicklabs
      @cloudquicklabs 1 year ago

      Thank you for watching my videos.
      Apologies that you are not able to follow the video. Could you please watch it again, pausing and resuming as needed?

    • @maybenew7293
      @maybenew7293 1 year ago

      @@cloudquicklabs Eh?

  • @keerthanaapajany2120
    @keerthanaapajany2120 2 years ago +1

    While testing the code in Lambda, I am getting an error like:
    Test Event Name
    (unsaved) test event
    Response
    {
    "errorMessage": "could not connect to server: No such file or directory
    \tIs the server running locally and accepting
    \tconnections on Unix domain socket \"/tmp/.s.PGSQL.5439\"?
    ",
    "errorType": "OperationalError",
    "stackTrace": [
    " File \"/var/task/lambda_function.py\", line 25, in lambda_handler
    password = password)
    ",
    " File \"/opt/python/psycopg2/__init__.py\", line 126, in connect
    conn = _connect(dsn, connection_factory=connection_factory, **kwasync)
    "
    ]
    }
    What is the problem?

    • @cloudquicklabs
      @cloudquicklabs 2 years ago

      Thank you for watching my videos.
      Could you please check whether you have enabled a VPC endpoint for the S3 service on the VPC where the Redshift cluster is hosted? And please make sure that you have integrated your Lambda with the same VPC as well.

    • @keerthanaapajany2120
      @keerthanaapajany2120 2 years ago

      @@cloudquicklabs May I know how to check and integrate it?
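
Separately from the VPC settings, the traceback above mentions a Unix domain socket (`/tmp/.s.PGSQL.5439`). The PostgreSQL client library falls back to a local socket when no host is passed, so one likely reading is that the host value reaching `psycopg2.connect` was empty or unset. The sketch below refuses to build connection arguments without a host; the helper name is hypothetical.

```python
def connection_kwargs(host, dbname, user, password, port=5439):
    """Build keyword arguments for psycopg2.connect, refusing an empty host."""
    if not host or not host.strip():
        raise ValueError("Redshift host is empty; without it the driver "
                         "tries a local Unix socket")
    return {"host": host.strip(), "port": int(port), "dbname": dbname,
            "user": user, "password": password}
```

The result can be passed as `psycopg2.connect(**connection_kwargs(...))`, so an unset host fails with a clear message before any connection is attempted.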

  • @bhaskarsr3013
    @bhaskarsr3013 2 years ago +1

    I am getting the below error....
    {
    unable to import module 'lambda function'
    }
    START RequestId: 40274256-8381-4eaf-84b9-51d8c833df43 Version: $LATEST
    Unable to import module 'lambda_function': No module named 'psycopg2._psycopg'
    I created the layer as per the instructions: I uploaded python.zip and selected Python 3.6 as the runtime version.
    I am using this for PostgreSQL.

    • @cloudquicklabs
      @cloudquicklabs 2 years ago

      Thank you for watching my videos.
      Please try to use runtime version 3.7 or greater, as 3.6 has already been decommissioned by AWS.
      Do follow all the steps as shown in the video.