Thanks so much, this is exactly what I needed 😄
Super clear and very easy to understand!
Thanks Shreyasi! appreciate the feedback!
Loved the video! Thank you for making it:)
Thank you for your video! I am trying to find a similar example for PyTorch. Do you know any?
thank you!
Thanks a lot. Can you compare AWS Lambda with SageMaker inference?
SageMaker serverless inference leverages AWS Lambda and offers the other SageMaker benefits, such as easy model hosting with the SageMaker SDKs and APIs and the ability to switch between serverless and real-time endpoints.
Hey, is there a way to schedule inference jobs using input images from an S3 bucket on a given interval? Let's say every night run inference on 100 images stored in S3.
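One hedged sketch of an answer to the scheduling question above: a SageMaker Batch Transform job kicked off on a schedule (for example, by an EventBridge cron rule invoking a Lambda). The model name, bucket paths, and instance type below are placeholders, not from the video:

```python
# Sketch: nightly batch inference over images in S3 via SageMaker Batch Transform.
# All names here (model, buckets) are hypothetical placeholders.

def build_transform_request(model_name, input_s3_uri, output_s3_uri):
    """Build the request dict for sagemaker.create_transform_job()."""
    return {
        "TransformJobName": f"nightly-inference-{model_name}",
        "ModelName": model_name,
        "TransformInput": {
            "DataSource": {
                "S3DataSource": {
                    "S3DataType": "S3Prefix",
                    "S3Uri": input_s3_uri,  # prefix holding the night's images
                }
            },
            "ContentType": "application/x-image",
        },
        "TransformOutput": {"S3OutputPath": output_s3_uri},
        "TransformResources": {
            "InstanceType": "ml.m5.large",
            "InstanceCount": 1,
        },
    }


def run_nightly_job():
    # Triggered on a schedule, e.g. an EventBridge rule "cron(0 2 * * ? *)"
    # calling this as a Lambda handler.
    import boto3

    sm = boto3.client("sagemaker")
    request = build_transform_request(
        "my-image-model",
        "s3://my-bucket/nightly-images/",
        "s3://my-bucket/nightly-results/",
    )
    sm.create_transform_job(**request)
```

Since Batch Transform spins instances up only for the duration of the job, this fits the "run once a night over 100 images" pattern better than keeping an endpoint warm.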
Should we configure endpoint input and output manually or would it be created automatically?
Hello sir, please make a video on how to insert CSV data into DynamoDB using Lambda in Node.js with the Serverless Framework.
Thanks, Shashank. Can you please share the notebook if possible?
Hi, you should find all the examples here: github.com/shashankprasanna/sagemaker-video-examples/tree/master/sagemaker-serverless-inference
Hi Shashank
Is there a GPU support for AWS serverless inference?
Hi, currently you can only increase memory using MemorySizeInMB in the endpoint config. Increasing that also increases CPU compute capability. If you need a GPU for high-performance needs, I recommend a real-time endpoint with a dedicated instance, especially if you can keep utilization high.
@@shashank.prasanna Thanks for the reply. But I think it would be even better if AWS also provided GPU ML inference instances for serverless needs, because in my case utilization will be very low but I need a GPU instance to run my model, so I have to keep my instance always available :(
Still, I'll check out your recommendation on increasing MemorySizeInMB for CPU-based ML models.
@@BalamurugaMuthumani I have the same question. Were you able to make serverless work with a GPU in the end?
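Shashank's suggestion above (raising MemorySizeInMB in the endpoint config to get more CPU) can be sketched like this with boto3; the model and endpoint names are placeholders, and MemorySizeInMB accepts values from 1024 to 6144 in 1 GB increments:

```python
# Sketch: a serverless endpoint config with a larger memory size.
# Model/endpoint names below are hypothetical.

def build_serverless_variant(model_name, memory_mb=4096, max_concurrency=10):
    """Production variant dict for sagemaker.create_endpoint_config()."""
    return {
        "VariantName": "AllTraffic",
        "ModelName": model_name,
        "ServerlessConfig": {
            "MemorySizeInMB": memory_mb,   # more memory also means more CPU
            "MaxConcurrency": max_concurrency,
        },
    }


def create_serverless_config(model_name):
    import boto3

    sm = boto3.client("sagemaker")
    sm.create_endpoint_config(
        EndpointConfigName=f"{model_name}-serverless",
        ProductionVariants=[build_serverless_variant(model_name)],
    )
```

At the time of the video there is no GPU option in ServerlessConfig, which is why the GPU use case above points back to a real-time endpoint on a dedicated instance.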
I have a question related to processing jobs. How can I parameterize the inputs?
How long do serverless SageMaker endpoints stay active once invoked?
I got an error while invoking the endpoint - ModelError: An error occurred (ModelError) when calling the InvokeEndpoint operation: Received client error (415) from model with message "{"error": "Unsupported Media Type: application/x-image"}"
Did you ever solve this? I have the same issue right now with the example notebook...
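A hedged note on the 415 error above: the quoted message suggests the endpoint runs a TensorFlow Serving container, which rejects the `application/x-image` ContentType header. One common workaround (an assumption, not confirmed in the video) is to serialize the image pixels into the TF Serving JSON format and invoke with `application/json`:

```python
# Sketch: send the image as a JSON "instances" payload instead of raw bytes,
# assuming a TF Serving-based container (suggested by the 415 error text).
import json


def image_to_tfserving_payload(pixel_array):
    """Wrap a nested list of pixel values in the TF Serving REST format."""
    return json.dumps({"instances": [pixel_array]})


def build_invoke_args(endpoint_name, pixel_array):
    """Arguments for sagemaker-runtime invoke_endpoint()."""
    return {
        "EndpointName": endpoint_name,
        "ContentType": "application/json",  # accepted by the TF Serving container
        "Body": image_to_tfserving_payload(pixel_array),
    }


def invoke(endpoint_name, pixel_array):
    import boto3

    runtime = boto3.client("sagemaker-runtime")
    response = runtime.invoke_endpoint(**build_invoke_args(endpoint_name, pixel_array))
    return response["Body"].read()
```

Alternatively, a custom inference script on the container side can register a handler for `application/x-image`; which fix applies depends on how the model was packaged.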