How do you scale your Kubernetes applications?
We introduced KEDA on one of the projects with a combination of Python, Celery, and Redis, and it works as advertised in production for scaling a Deployment.
We are also in the process of implementing it on another project where we have a dozen different Python containers (some requiring significant resources) that we will orchestrate with Kafka events, since they are sequential in execution by nature. We should save significantly on the resources needed for the cluster, since Python apps are resource hungry and the processing is not time sensitive.
@@nickolicbojan You can also scale based on consumer lag for that scenario, assuming you have enough partitions for the consumer group!
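For that scenario, a minimal ScaledObject sketch might look like this (the deployment name, broker address, topic, and consumer group are placeholders, not anything from the video):

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: python-worker          # hypothetical name
spec:
  scaleTargetRef:
    name: python-worker        # the Deployment to scale
  minReplicaCount: 0           # scale to zero when there is no lag
  maxReplicaCount: 12          # no more replicas than partitions
  triggers:
    - type: kafka
      metadata:
        bootstrapServers: kafka:9092       # placeholder broker address
        consumerGroup: python-workers      # placeholder consumer group
        topic: events                      # placeholder topic
        lagThreshold: "100"                # target lag per replica
```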
🦾
We used HPA and VPA + Karpenter but had trouble with them fighting each other in some edge cases; you must be very aware of the polling events and evaluations.
@juanitoMint HPA and VPA tend to be mutually exclusive. Most of the time you can use one of those, but not both.
Love the way you explain things, much appreciated. Keep the good stuff coming, please!
This is exactly what we use in our application. The video is really useful for understanding the individual entries. Thank you for posting!
amazing video, and such a great content creation technique! You rock, sir!
Thank you!
Thanks a ton.
Tech videos have never been so entertaining 😊
I've just started learning for the CKAD and came across your channel, this is awesome!, thank you!
Great to hear!
Been using KEDA for over a year and it's amazing at doing what it should do. However, connecting to the default HPA in any cloud environment is an issue: it has to comply with the HPA timeouts, and sometimes the messages are processed before the service is even able to scale, so you end up with 5 pods doing nothing for 30 seconds. I think they need a separate HPA implementation to be really, really great.
Did you try playing around with readinessProbe and startupProbe? Those are there for mitigating that specific situation.
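Something along these lines (a rough sketch; the endpoints and timings are made up, not recommendations):

```yaml
# Fragment of a container spec (endpoints and timings are illustrative)
startupProbe:
  httpGet:
    path: /healthz             # hypothetical health endpoint
    port: 8080
  periodSeconds: 2
  failureThreshold: 30         # allows up to ~60s for a slow start
readinessProbe:
  httpGet:
    path: /ready               # hypothetical readiness endpoint
    port: 8080
  periodSeconds: 5             # traffic is routed only once this passes
```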
Love this channel ❤ for real!!!
Woah, great video. Really appreciate your content! Awesome. 😎🙏
I am using Karpenter, which I learned about from your video, to scale K8s resources.
Excellent as always!
Great video, Viktor! Thank you :)
What would you suggest in a scenario where deployment names are not static, e.g., containing version numbers or other variable fields?
The ScaledObject "name" field doesn't seem to allow wildcards, and it's a mandatory field.
I'm guessing that the names are dynamic because resources like deployments are created dynamically through some templating tool. If that's the case, can't you use that same tool to generate the KEDA manifests as well? Alternatively, you can use admission controllers (e.g., Kyverno) to auto-generate KEDA resources whenever deployments that match some criteria are created.
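A rough, untested sketch of such a Kyverno policy (the label selector and the cpu trigger are placeholders you'd swap for your own criteria and scaler):

```yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: generate-scaledobject            # hypothetical policy name
spec:
  rules:
    - name: scaledobject-for-deployments
      match:
        any:
          - resources:
              kinds: ["Deployment"]
              selector:
                matchLabels:
                  autoscale: "keda"      # only deployments that opt in
      generate:
        apiVersion: keda.sh/v1alpha1
        kind: ScaledObject
        name: "{{request.object.metadata.name}}"        # reuse the dynamic name
        namespace: "{{request.object.metadata.namespace}}"
        synchronize: true                # keep the ScaledObject in sync
        data:
          spec:
            scaleTargetRef:
              name: "{{request.object.metadata.name}}"
            triggers:
              - type: cpu                # placeholder trigger
                metadata:
                  type: Utilization
                  value: "80"
```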
@@DevOpsToolkit The generator pattern would be the simplest solution: if you generate deployments, then generate the KEDA manifests as well!
Nice video. May I ask what the difference is between KEDA and Argo Workflows?
KEDA is used to scale Kubernetes resources, while Argo Workflows is a pipeline solution similar to Jenkins and GitHub Actions.
The key thing here is that KEDA is yet another layer of abstraction on top of HPA and ReplicaSets, which has its own drawbacks, e.g., while scaling down, a random pod will be killed instead of an idle one 🤷‍♂️
The same would happen without KEDA (with only HPA).
The problem in that situation is not KEDA but Deployment and you will be in the same situation when upgrading a release as well. If killing a random pod is not acceptable, you need to look for a different resource type, something other than Deployment.
It's good. But how do you handle a long-running transaction with lower resources and avoid the scale-down of that specific pod?
When Kubernetes starts the process of removing a pod (it does not matter what initiated the removal), it sends a SIGTERM signal to the processes inside the containers and waits for them to respond before it actually shuts them down. So it all depends on the processes you're running.
As a side note, SIGTERM is used in Linux in general and is not specific to containers or Kubernetes.
What kind of response does it expect? SIGTERM is not a REST or HTTP call.
SIGTERM is not HTTP. It's a signal Linux sends to processes before it shuts them down. It is not specific to Kubernetes. It's the signal that, for example, a process would receive if you executed `kill` to stop it. Your application should respond to it no matter whether it's running directly, in a container, or anywhere else.
You might want to check www.gnu.org/software/libc/manual/html_node/Termination-Signals.html.
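On the Kubernetes side, how long it waits is configurable per pod. A minimal sketch (the image and the preStop command are placeholders):

```yaml
# Fragment of a pod spec (values are illustrative)
spec:
  terminationGracePeriodSeconds: 60        # how long Kubernetes waits after
                                           # SIGTERM before SIGKILL (default 30)
  containers:
    - name: app
      image: my-app:1.0                    # placeholder image
      lifecycle:
        preStop:
          exec:
            command: ["sh", "-c", "sleep 5"]  # optional pause before SIGTERM
                                              # so load balancers can drain
```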
👋 Viktor, I really do appreciate all your content; it keeps me up to date.
There's only one thing you could enhance: get rid of that sound echo. I generally play your videos in the background, so the sound quality suddenly becomes very important.
Thanks again!
Working on it...
awesome video, thanks
Thank you! I'd had KEDA on my 'TOTEST' list for a while, but your video came at the right time: I was struggling with the prometheus-adapter configuration to scale from custom metrics with HPA v2 (Istio metrics from Prometheus) instead of using CPU... But prometheus-adapter is such a nightmare to configure that I had to reach out to a maintainer on Slack, who confessed that the project is going to be slowly sunsetted and advised me to go back to 'metrics-server' for the default metrics API + KEDA for custom metrics! 😅
KEDA result: installation + creation of a ScaledObject + test = successful in 10 minutes! So simple and efficient 👍
(bye bye prometheus-adapter 👋)
Hi, thanks for your video; it's nice and friendly. I have a question: in your video, the pod was managed by a Deployment (no HPA). After you set up the KEDA autoscaling, the HPA is managing the pods, so what happened to the Deployment? Is it removed, or is it no longer related to the pod? Does the replicas setting in the Deployment work anymore? Thanks!
HPA is managing the deployments, not the Pods.
Be careful with the AWS CloudWatch scaler. We enabled it experimentally on a very short polling interval and began generating hundreds of dollars in CloudWatch API costs due to the high cardinality of the metrics we were pulling.
How do you think KEDA compares with Argo Events? When should one be preferred over the other? Say I currently have different k8s Job/Deployment scaling needs based on an AWS SQS queue and the incoming traffic (number of requests) to my app. While KEDA sounds like the obvious choice here, is there any reason why Argo Events couldn't be used, especially since Argo Events can also be used with triggers other than HPA or k8s Jobs (like Slack)?
In theory, you can use Argo Events to trigger changes to HPA, but that is not its primary goal and would require a few workarounds. KEDA is specifically designed for scaling.
I should tell my k6 folks about the sponsoring xD
You should :)
You increase the load through k6, but how do you reduce the load with k6? Can you please elaborate?
I'm not sure I understood the question.
This looks really cool. Could you explain at a high level why scaling to 0 would need Knative to queue requests to the application in this case?
You do not need Knative, but you do need something to queue requests.
@@DevOpsToolkit What about the use case where we don't care about lost requests when the app is scaled to 0? Would it be a stupid idea to have such an autoscaler for ephemeral pull-request-type environments that we bring back to life only when teams want to resume their testing? (In some cases we can't afford to delete and recreate, due to using some hardcoded service NodePorts that we want to persist, or due to slow app startup, etc.) Or does that sound like a stupid idea?! I appreciate your input; I've been watching you on Udemy for Business and here for quite a while. Your explanations, and the fact that you provide gists for all the demos, are amazing 👏!
@@samavasi89 That's not a stupid idea at all. If you do not mind a lost request, and if you have some data you can use to decide when to scale, KEDA alone should work.
@@DevOpsToolkit In that case, the KEDA ScaledObject should have some kind of on/off switch. Right now I'm facing the issue that I'm not able to simply scale test environments down to zero using KEDA (based on CPU/memory metrics). The only way to achieve that, at the moment, is to move the ScaledObject definitions between directories in the GitOps repo.
You cannot use metrics like memory and cpu to scale to zero since, in that case, there would be nothing to indicate when to scale back to one. You should use something like requests to decide when to go to zero replicas.
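A sketch of what that could look like with a Prometheus-backed request metric (the server address, query, and names are assumptions for illustration):

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: test-env-app                       # hypothetical name
spec:
  scaleTargetRef:
    name: test-env-app                     # the Deployment to scale
  minReplicaCount: 0                       # allows scaling all the way to zero
  maxReplicaCount: 5
  triggers:
    - type: prometheus
      metadata:
        serverAddress: http://prometheus.monitoring:9090               # placeholder
        query: sum(rate(http_requests_total{app="test-env-app"}[2m]))  # placeholder query
        threshold: "10"                    # target requests/second per replica
```

Bear in mind that, while at zero, incoming requests are lost unless something in front of the app queues them, as discussed above.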
How can you tell (and/or control) how often the query is executed?
I remember there is a param to set the frequency, but I never had the need to use it, so I don't remember the exact param. The important thing is that it's possible to set up; you'll need to search for it in the docs.
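For what it's worth, the field you're after is most likely `pollingInterval` on the ScaledObject spec. A sketch (the queue URL and region are placeholders, and authentication is omitted):

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: my-app                 # placeholder
spec:
  scaleTargetRef:
    name: my-app               # placeholder
  pollingInterval: 10          # seconds between scaler checks (default is 30)
  triggers:
    - type: aws-sqs-queue      # placeholder trigger; auth config omitted
      metadata:
        queueURL: https://sqs.eu-west-1.amazonaws.com/123456789/my-queue  # placeholder
        queueLength: "5"       # target messages per replica
        awsRegion: eu-west-1
```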
I've been eyeballing KEDA to scale to 0 when an environment is not in use: not for production, but for dev, test, and staging environments. I already have a service that does this for non-k8s resources, and I think it would be perfect to add support for k8s resources to my service.
It depends on at least two factors.
1. Do you have a metric that can be used to scale it up?
2. Do you need to queue requests while it is scaled to zero?
So KEDA is mutating the deployments and the HPA, right?
Only HPA which, in turn, is mutating deployments.
I am using a somewhat different stack for my applications, including Liferay and MariaDB. Can I use KEDA, or, since those are not in the scalers list, is it not possible?
As long as what you're using can be scaled horizontally and it runs in or is managed by kubernetes, the answer is yes.
Awesome video
Hello Sir,
Is it possible to use HPA with event-driven autoscaling, for example, to trigger on queues?
I'm not sure I understood the question. Would you like to scale pods based on queues or to put something into a queue based on some event? If it's the former, the answer is yes. If it's the latter, I suggest trying out Argo Events.
Can KEDA work alongside Argo Events? Would Argo Events handle the CI/CD pipeline and KEDA handle autoscaling? Or can KEDA also handle the CI/CD pipeline... or vice versa?
KEDA handles only autoscaling, and Argo Events is only in charge of receiving/watching and processing events. Neither does CI (pipelines) nor CD (deployments).
@@DevOpsToolkit Thanks for clarifying.
Can KEDA use two different events to scale? Like, could it use both RPS and some service bus event to trigger autoscaling?
You can have multiple triggers, so yes, that is a doable scenario. However, bear in mind that it might produce unexpected results, with the two triggers "fighting" over who will scale up and down.
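For illustration, two triggers on one ScaledObject might look like this (all names and metadata values are placeholders). KEDA exposes each trigger to the HPA as a separate metric, and the HPA scales to whichever demands the most replicas:

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: api                    # placeholder
spec:
  scaleTargetRef:
    name: api                  # placeholder
  triggers:
    - type: prometheus         # requests-per-second trigger
      metadata:
        serverAddress: http://prometheus.monitoring:9090   # placeholder
        query: sum(rate(http_requests_total{app="api"}[1m]))
        threshold: "50"        # target RPS per replica
    - type: azure-servicebus   # service bus trigger
      metadata:
        queueName: orders      # placeholder queue
        messageCount: "100"    # target messages per replica
      authenticationRef:
        name: servicebus-auth  # TriggerAuthentication holding the connection string
```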
Now that you've covered KEDA, check out its sibling DAPR.
Something like th-cam.com/video/-4sHUvfk2Eg/w-d-xo.html :)
Can you combine this with ArgoCD? Would you use the ignoreDifferences field in the spec of the ArgoCD application?
There's no problem combining it with ArgoCD, similar to how HPA and ArgoCD do not conflict. The dynamically determined desired number of replicas is not managed by ArgoCD.
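That said, if your manifests do pin `spec.replicas`, an `ignoreDifferences` entry along these lines keeps ArgoCD from fighting the autoscaler (dropping `replicas` from the manifest is the cleaner fix):

```yaml
# Fragment of an ArgoCD Application spec (illustrative)
spec:
  ignoreDifferences:
    - group: apps
      kind: Deployment
      jsonPointers:
        - /spec/replicas       # let the autoscaler own the replica count
```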
Can you do the same for the KEDA Selenium Grid scaler, please?
Adding it to my TODO list... :)
Were you able to deploy it?
I haven't had time to test it, but just something off the top of my head: can KEDA target my own HPA which is already present?
I haven't tried that combination (never had the need for it) so I'm not sure whether that's an option.
To be clear, you only want one of KEDA or HPA scaling each resource. See the KEDA FAQ.
We are heavily into Pulumi, so we have HPA as part of our deployments; that's why I asked. Thanks for the great content, Viktor!
@@LazarTas You will probably want to replace your HPA with KEDA unless your current HPAs are fit for purpose.
Can I scale node resources?
If they're virtualized.
Yes you can, horizontally. That's what cluster scalers do.
How does this compare to the stock HPA? We are planning to remove KEDA, as the stock HPA seems to have matured.
KEDA gives you more options than HPA alone.
❤
Can you showcase something on Brainboard, please?
Adding it to my TODO list...