Deploy Llama 2 on AWS SageMaker using DLC (Deep Learning Containers)

  • Published on 27 Nov 2024
  • In this tutorial video, I'll show you how to deploy the Llama 2 large language model on AWS SageMaker using Deep Learning Containers (DLC). We'll walk through each step, from accessing pre-built DLC images to configuring SageMaker for the Llama 2 deployment. The walkthrough is designed to be smooth and understandable, whether you're new to generative AI or experienced in the field.
    AWS SageMaker DLC: github.com/aws...
    AI Anytime GitHub: github.com/AIA...
    #ai #llm #python
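
As a rough illustration of the two steps the description names (picking a pre-built DLC image, then configuring SageMaker for the deployment), here is a minimal Python sketch. It assumes the Hugging Face LLM inference container and the `sagemaker` Python SDK. The video's own code is not reproduced on this page, so the instance type, token limits, and the helper names `build_llm_env` and `deploy_llm` are illustrative assumptions, not the author's settings.

```python
import json


def build_llm_env(model_id, num_gpus=1, max_input_length=2048,
                  max_total_tokens=4096, hf_token=None):
    """Build the environment variables passed to the LLM container.

    All default values here are assumptions chosen for illustration.
    """
    if max_input_length >= max_total_tokens:
        raise ValueError("max_input_length must be smaller than max_total_tokens")
    env = {
        "HF_MODEL_ID": model_id,                      # model repo on the Hugging Face Hub
        "SM_NUM_GPUS": json.dumps(num_gpus),          # GPUs used per replica
        "MAX_INPUT_LENGTH": json.dumps(max_input_length),
        "MAX_TOTAL_TOKENS": json.dumps(max_total_tokens),
    }
    if hf_token:
        # Only needed for gated repositories.
        env["HUGGING_FACE_HUB_TOKEN"] = hf_token
    return env


def deploy_llm(model_id, instance_type="ml.g5.2xlarge", role=None):
    """Create a SageMaker endpoint for `model_id` (requires AWS credentials)."""
    # Imported lazily so build_llm_env can be used without the SDK installed.
    import sagemaker
    from sagemaker.huggingface import HuggingFaceModel, get_huggingface_llm_image_uri

    role = role or sagemaker.get_execution_role()
    image_uri = get_huggingface_llm_image_uri("huggingface")  # pre-built DLC image
    model = HuggingFaceModel(role=role, image_uri=image_uri,
                             env=build_llm_env(model_id))
    return model.deploy(
        initial_instance_count=1,
        instance_type=instance_type,
        container_startup_health_check_timeout=300,  # allow time to load weights
    )
```

Calling `deploy_llm("NousResearch/Llama-2-7b-chat-hf")` from a SageMaker notebook would start a billable endpoint, so the configuration helper is kept separate and can be checked without touching AWS.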

Comments • 41

  • @yashsrivastava4878 8 months ago +1

    Thank you.
    Can you please make a video on how to fine-tune Mistral 7B on AWS SageMaker with S3 and boto3 (in the form of async jobs)?

  • @shumon29 1 year ago +2

    I am not able to find the gists. The attached repository has only a LICENSE and a README file. Could you please share the repo or gist links?

  • @ashleymavericks 1 year ago +1

    Waiting for GGML quantised model deployments. Btw, thanks for your videos.

    • @AIAnytime 1 year ago +2

      Coming soon!

  • @dchuguashvili 1 year ago +2

    What is the advantage, if any, of using this approach instead of deploying the Llama 2 model directly from SageMaker JumpStart?

    • @Digitalsmb 1 year ago

      Would love to know the answer to this too.

  • @49_jaypandya40 7 months ago

    The content is amazing.

  • @danielmz99 1 year ago +3

    Hi, thanks for your videos. Would it be possible to get a video on GGML models being deployed on SageMaker? It is unclear what requirements they have. The fact that they are CPU-optimized will help adoption, as many small businesses can't really afford the $40/day hosting cost of an LLM on a g5.2xlarge, plus running costs, if all they need is a private LLM. Local deployment might not be an option either: if you need a 13B+ model to get a decent outcome, even a GGML model requires significant dedicated hardware. I see private cloud GGML deployments as the perfect compromise between cheap running costs and decent functionality for a very large number of use cases. I think it would be a great video. Thanks for your efforts.

    • @AIAnytime 1 year ago +3

      A video on GGML deployment is coming soon. Please stay tuned.

    • @ashleymavericks 1 year ago

      I can totally relate to your viewpoint. I'm exploring similar possibilities for a low-cost setup.

    • @ashleymavericks 1 year ago +1

      @AIAnytime It would be great if you could deploy a GGML model on AWS compute instances with a REST API that is compatible with the OpenAI specification (the LocalAI project could be leveraged for this).

  • @kaarthikandu 1 year ago

    Can we use spot instances when deploying the models? Have you tried?

    • @AIAnytime 1 year ago

      You can, but they can be interrupted.

  • @amangrover9343 1 year ago

    I am getting the error "RuntimeError: weight model.layers.0.self_attn.rotary_emb.inv_freq does not exist" while using the Phind/Phind-CodeLlama-34B-v2 model.

  • @rohitleo9712 7 months ago

    Hi, can we do this for summarization?

  • @sravantipris3544 6 months ago

    Is a GPU required, or can it run on CPU only?

  • @user4-j1w 1 year ago

    So simple.... Thank you

    • @AIAnytime 1 year ago

      You are welcome 😊

  • @sohailhosseini2266 1 year ago

    Thanks for sharing!

    • @AIAnytime 1 year ago

      Thanks for watching!

  • @avijit_barua 1 year ago

    very helpful video!

  • @mohammadkashif6072 1 year ago +1

    Which IAM roles should be assigned when using AWS SageMaker for the first time?

    • @AIAnytime 1 year ago

      SageMaker full access
      S3 full access
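
One way to apply the two permissions named in the reply above is to attach the corresponding AWS managed policies to the role SageMaker runs under. The sketch below uses boto3's `attach_role_policy` call. The role name and the helper `attach_policies` are hypothetical, and mapping "full access" to the managed policies `AmazonSageMakerFullAccess` and `AmazonS3FullAccess` is an assumption about what the reply means.

```python
# AWS managed policies assumed to correspond to the two permissions in the reply.
MANAGED_POLICY_ARNS = [
    "arn:aws:iam::aws:policy/AmazonSageMakerFullAccess",
    "arn:aws:iam::aws:policy/AmazonS3FullAccess",
]


def attach_policies(role_name, policy_arns=MANAGED_POLICY_ARNS, iam_client=None):
    """Attach each policy ARN to an existing IAM role and return the ARNs attached."""
    if iam_client is None:
        # Imported lazily; requires AWS credentials with IAM permissions.
        import boto3
        iam_client = boto3.client("iam")
    for arn in policy_arns:
        iam_client.attach_role_policy(RoleName=role_name, PolicyArn=arn)
    return list(policy_arns)
```

For example, `attach_policies("my-sagemaker-role")` would attach both policies to a role of that (hypothetical) name, which must already exist.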

  • @Ankur-be7dz 1 year ago

    When we use the Hugging Face tokens and secret key, does Hugging Face charge us money, or is it free?

    • @AIAnytime 1 year ago +1

      No, they don't charge. It's free. They do have an API rate limit, but it won't be a problem for you. Feel free to use it.

  • @efexzium 1 year ago

    How can we deactivate this endpoint?

  • @VenkatesanVenkat-fd4hg 1 year ago

    Highly appreciated, thanks for your videos. I got an error:
    AWS SageMaker endpoint failed. Reason: "The primary container for production variant AllTraffic did not pass the ping health check", with a pointer to the CloudWatch logs. This happens even though I ran the same code from the Hugging Face deploy option for Llama 2 7B, while Falcon 7B runs fine. Any help?

    • @AIAnytime 1 year ago +1

      Thank you! The issue is that the model is gated. Can you use this model instead: NousResearch/Llama-2-7b-chat-hf? It's the same model but not gated, so it should deploy fine.

    • @VenkatesanVenkat-fd4hg 1 year ago

      @AIAnytime Thanks for your kind response. I deployed the 7B model successfully today, but the 13B model needs an AWS quota increase (I found a related error). Can I try a quantized version of 13B without hitting the AWS quota problem? Kindly reply.

    • @VenkatesanVenkat-fd4hg 1 year ago

      @mydsworld3130 Check the CloudWatch logs.
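
The advice in this thread to check the CloudWatch logs after a failed ping health check can also be followed from code. The sketch below uses boto3's CloudWatch Logs client and assumes the endpoint writes to the log group `/aws/sagemaker/Endpoints/<endpoint-name>`. The helper names and the endpoint name in the usage note are hypothetical.

```python
def endpoint_log_group(endpoint_name):
    """Return the CloudWatch log group assumed to hold a SageMaker endpoint's logs."""
    return f"/aws/sagemaker/Endpoints/{endpoint_name}"


def fetch_log_messages(endpoint_name, limit=50, logs_client=None):
    """Return up to `limit` log messages from a single call (no pagination)."""
    if logs_client is None:
        # Imported lazily; requires AWS credentials with CloudWatch Logs read access.
        import boto3
        logs_client = boto3.client("logs")
    response = logs_client.filter_log_events(
        logGroupName=endpoint_log_group(endpoint_name),
        limit=limit,
    )
    return [event["message"] for event in response.get("events", [])]
```

Printing the result of `fetch_log_messages("llama-2-7b-endpoint")` line by line shows the container's startup output for that endpoint, which is where the reason for a failed health check would appear.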

  • @SUMANPAULCHOUDHURY 1 year ago

    Can you show how to do it on AWS EC2 instances?

  • @PrasadPrasad-hi7pl 1 year ago

    Could you please make a tutorial on deploying a chatbot for PDF files using SageMaker? Thank you in advance.

    • @AIAnytime 1 year ago +2

      Yes, I will use the same deployed model for this use case. It will be covered in my next two videos: first the Lambda function and API Gateway, then the chatbot for your knowledge base.

  • @karamjittech 1 year ago

    Awesome video. But how can we fine-tune the model and use a RAG approach?

    • @AIAnytime 1 year ago +4

      Coming soon. I will use the same deployed LLMs for a RAG-based application.

  • @jayasuriyap8748 10 months ago

    Kindly make a video on how to deploy on Azure.