Monitor EKS & EC2 instances with MANAGED Prometheus & Grafana (Terraform & Prometheus Agent & AWS)

  • Published on Jan 2, 2025

Comments • 68

  • @AntonPutra
    @AntonPutra  1 year ago +4

    🔴 - To support my channel, I’d like to offer Mentorship/On-the-Job Support/Consulting - me@antonputra.com
    👉 [UPDATED] AWS EKS Kubernetes Tutorial [NEW]: th-cam.com/play/PLiMWaCMwGJXnKY6XmeifEpjIfkWRo9v2l.html&si=wc6LIC5V2tD-Tzwl

  • @AntonPutra
    @AntonPutra  1 year ago +1

    Get Full-Length High-Quality DevOps Tutorials for Free - Subscribe Now! - th-cam.com/users/AntonPutra

  • @AntonPutra
    @AntonPutra  1 year ago +1

    🟢 [New] Terragrunt Tutorial: Create VPC, EKS from Scratch! (Step-by-Step) - th-cam.com/video/yduHaOj3XMg/w-d-xo.html

  • @AntonPutra
    @AntonPutra  2 years ago +1

    🔴Part 2 - Send Alerts to Slack, Email, PagerDuty - AWS Managed Prometheus (AMP) - th-cam.com/video/SvDpuVlJTDg/w-d-xo.html

  • @Tszyu01
    @Tszyu01 1 year ago +5

    This is actually great, helpful real-world content. I’d expand on creating alerts using Grafana as well as distributed tracing using Tempo.

  • @ОлександрНіколайчук-ы5с
    @ОлександрНіколайчук-ы5с 1 year ago +1

    Hi Anton!
    Thanks for the video and your work for us.
    Do you have an example where we build a separate host to use as a monitor?
    That is, install Prometheus and Grafana on the instance and pull metrics from our nodes or instances into it.
    The goal is to have one host that the user accesses to monitor all our AWS resources, both nodes and instances.

    • @AntonPutra
      @AntonPutra  1 year ago

      Thank you! Well, I have a video on how to build centralized monitoring based on Thanos. However, the use case you described is not easy to implement, since any external Prometheus would not get Kubernetes service discovery. You can, however, set up a remote write and push metrics to a single instance.
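
A minimal sketch of the remote-write approach mentioned above. It assumes a central Prometheus at the hypothetical address central-prometheus.example.com that was started with --web.enable-remote-write-receiver so it accepts pushed samples; the cluster label value is a placeholder as well.

```yaml
# prometheus.yml on each source Prometheus (or agent).
# The URL and the label value are placeholders, not taken from the video.
global:
  external_labels:
    cluster: eks-demo   # lets the central instance tell the sources apart
remote_write:
  - url: https://central-prometheus.example.com/api/v1/write
```

Each source keeps its own Kubernetes service discovery and only pushes the resulting samples, so the central host never needs access to the cluster API.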

  • @thiagoscodeler5152
    @thiagoscodeler5152 1 year ago +1

    Great content as always Anton. Thanks for sharing! Do you already have the content for setting up alerts?

    • @AntonPutra
      @AntonPutra  1 year ago +1

      Thanks, for EKS not yet

    • @thiagoscodeler5152
      @thiagoscodeler5152 1 year ago

      @@AntonPutra Looking forward to that. Thank you.

  • @hollywoodmoviesexplainedhi4519
    @hollywoodmoviesexplainedhi4519 1 year ago +1

    Hello Anton, thanks for the demo video. I would like to know if I can use a single managed Grafana to monitor multiple EKS clusters in different regions.

    • @AntonPutra
      @AntonPutra  1 year ago

      Sure, you just need to add a separate datasource for each env/region.
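
As a rough illustration of one datasource per region: in Amazon Managed Grafana the datasources are usually added through the Grafana UI or API, so the sketch below instead assumes a self-managed Grafana with SigV4 authentication enabled and uses Grafana's datasource provisioning file format. The workspace IDs and regions are placeholders.

```yaml
# Grafana datasource provisioning file (self-managed Grafana assumed).
# Workspace IDs (ws-EXAMPLE-*) and regions are placeholders.
apiVersion: 1
datasources:
  - name: AMP us-east-1
    type: prometheus
    access: proxy
    url: https://aps-workspaces.us-east-1.amazonaws.com/workspaces/ws-EXAMPLE-1
    jsonData:
      httpMethod: POST
      sigV4Auth: true
      sigV4AuthType: default
      sigV4Region: us-east-1
  - name: AMP eu-west-1
    type: prometheus
    access: proxy
    url: https://aps-workspaces.eu-west-1.amazonaws.com/workspaces/ws-EXAMPLE-2
    jsonData:
      httpMethod: POST
      sigV4Auth: true
      sigV4AuthType: default
      sigV4Region: eu-west-1
```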

  • @gkalangara
    @gkalangara 1 year ago +2

    Really great content. One question I have, though: what would be an easy way to access Grafana, other than port-forwarding svc/grafana 3000 or using a Service of type LoadBalancer?

    • @AntonPutra
      @AntonPutra  1 year ago

      Thanks! Typically ingress would be the best choice.
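
A minimal Ingress sketch for the suggestion above. It assumes an ingress controller is already installed and registered under the class name nginx, that the Service is called grafana in the monitoring namespace and listens on port 3000 (as in the question), and that grafana.example.com is a placeholder hostname.

```yaml
# Exposes the Grafana Service through an existing ingress controller.
# ingressClassName, namespace and host are assumptions for illustration.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: grafana
  namespace: monitoring
spec:
  ingressClassName: nginx
  rules:
    - host: grafana.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: grafana
                port:
                  number: 3000
```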

  • @nforlife
    @nforlife 4 months ago

    Wow, this is awesome, you saved my life, man!
    Could you please make another video using Grafana Cloud or OpenTelemetry, Prometheus, Tempo and Loki?

    • @AntonPutra
      @AntonPutra  4 months ago +1

      Thanks, I have one covering Tempo with OpenTelemetry if you are interested - th-cam.com/video/ZIN7H00ulQw/w-d-xo.html

    • @nforlife
      @nforlife 4 months ago

      @@AntonPutra thanks!

  • @azharsayyed1308
    @azharsayyed1308 23 days ago

    Where do you apply the Kubernetes files? In the myapp instance? It's asking me to install kubectl there.

  • @kayoutube690
    @kayoutube690 1 year ago +1

    Do you have a video on installing Grafana on EKS?

    • @AntonPutra
      @AntonPutra  1 year ago

      There is nothing special about EKS and Grafana, and yes, I have a bunch of examples using Terraform with Helm and plain YAML. This is the latest one - github.com/antonputra/tutorials/blob/main/lessons/173/terraform/6-monitoring.tf#L27-L43
      You can search for grafana in that repo.

  • @JackReacher1
    @JackReacher1 1 year ago +2

    Wait, ain't we just installing the whole Prometheus on the cluster and just using the Prometheus agent?
    So what is the advantage of using the managed Prometheus service by AWS? It isn't even cloud agnostic.

    • @AntonPutra
      @AntonPutra  1 year ago +1

      Managed Prometheus allows you to collect metrics in a centralized place. For example, if you have lots of environments, you don't have to switch VPNs and Grafana dashboards. Second, it has long-term storage of 180 days; typical retention for a standalone Prometheus is 7-14 days. You can also use it from other clouds and on-premises. BUT it can be very pricey, so you have to filter out metrics before you ship them to managed Prometheus.

    • @JackReacher1
      @JackReacher1 1 year ago

      @@AntonPutra Wow, those are some solid points.

    • @hasanbingolbali9423
      @hasanbingolbali9423 1 year ago

      @AntonPutra Do we filter out the metrics at the AWS level or at the level of the services that emit the metrics?
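
One common place to do this filtering is in the Prometheus (agent) configuration itself, using write_relabel_configs on the remote_write entry, so that dropped series never leave the cluster. The sketch below is an assumption about how that could look, not the configuration from the video; the workspace ID, region and the regex of metric names to drop are placeholders.

```yaml
# remote_write to Amazon Managed Service for Prometheus with filtering.
# ws-EXAMPLE, the region and the drop regex are placeholders.
remote_write:
  - url: https://aps-workspaces.us-east-1.amazonaws.com/workspaces/ws-EXAMPLE/api/v1/remote_write
    sigv4:
      region: us-east-1
    write_relabel_configs:
      # Drop series whose metric name matches the regex before sending.
      - source_labels: [__name__]
        regex: "go_.*|process_.*"
        action: drop
```

Filtering can also be done earlier, with metric_relabel_configs on individual scrape jobs, if the series should not be kept locally either.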

  • @MrNewAmerican
    @MrNewAmerican 10 months ago +1

    Such great content, Thank you!

    • @AntonPutra
      @AntonPutra  10 months ago

      thanks!

  • @arjunkumarbetageri9791
    @arjunkumarbetageri9791 1 year ago

    I have created the resources as you described. All pods and services are running fine. I'm using a Linux machine (an EC2 instance) to deploy the Terraform templates. How can I see Prometheus and Grafana in the web UI? I tried every option, but Terraform creates the EKS nodes in private subnets and we have not provisioned a load balancer either.

    • @AntonPutra
      @AntonPutra  1 year ago

      Well, the easiest way is to port forward, for example "kubectl port-forward svc/prometheus-operated 9090 -n monitoring"

  • @George-mk7lp
    @George-mk7lp 2 years ago +1

    Always the greatest content, best channel ever.

  • @shulyakav
    @shulyakav 2 years ago +1

    Excellent! As always. )

    • @AntonPutra
      @AntonPutra  2 years ago

      Thanks again Artem :)

  • @RachidMoysePolania
    @RachidMoysePolania 1 year ago +1

    Hi, I'm having issues getting the ec2-node-exporter job. I've done everything as in the tutorial, but it doesn't work. Could you give me a hand?

    • @AntonPutra
      @AntonPutra  1 year ago

      What's the issue? Do you have targets in Prometheus but they time out, or are they simply not visible?

    • @RachidMoysePolania
      @RachidMoysePolania 1 year ago

      @@AntonPutra I don't know whether the EC2 discovery works. I create my instance with all the user data and that part works, but when I go to Prometheus and look for the ec2-node-exporter job, it never shows up, and neither does it in the latest dashboard imported in the video. So I can't get metrics from my EC2 instances (not the Kubernetes cluster nodes) into Prometheus.

    • @sydefcon
      @sydefcon 7 months ago

      @@RachidMoysePolania Hey, did you solve the issue? I'm facing the same issue myself.

    • @RachidMoysePolania
      @RachidMoysePolania 6 months ago

      @@sydefcon Sure, let me know if you still need help; I will be glad to help you.

  • @sushmithashetty5324
    @sushmithashetty5324 1 year ago

    How can we check whether the Fargate containers are down or up using Prometheus and Grafana?

    • @AntonPutra
      @AntonPutra  1 year ago

      Fargate node == pod; use the up metric of the targets.
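
A small sketch of how the up metric could be turned into an alert. Prometheus sets up to 0 for a target whose scrape fails, so the rule below fires when that persists for five minutes. The group name, alert name, duration and severity label are arbitrary choices for illustration. Since Prometheus in agent mode does not evaluate rules itself, a rule file like this would have to be loaded wherever rules are evaluated, for example in the managed Prometheus workspace.

```yaml
# Prometheus rule file: alert on targets that fail to scrape.
# Names, duration and severity are illustrative assumptions.
groups:
  - name: target-availability
    rules:
      - alert: TargetDown
        expr: up == 0
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Target {{ $labels.instance }} of job {{ $labels.job }} is down"
```

Note that a pod which disappears from service discovery entirely stops producing an up series instead of reporting 0, so this rule only catches targets that are still discovered but not responding.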

  • @m18unet
    @m18unet 1 year ago

    Thanks for the great tutorial. I have two questions:
    1. Do we need a PV/PVC (any permanent storage) with Prometheus agent mode? Are the pods stateless?
    2. Can I set the StatefulSet's replica count to 2 for HA in Prometheus agent mode? I want to connect these two agent replicas to Thanos Receiver. Is it a bad idea?

    • @AntonPutra
      @AntonPutra  1 year ago +1

      Thanks.
      1. Even in "stateless" mode it uses some local storage for caching. I think you can try without it, but I still use a PVC. (Also, in stateless mode you need to use Thanos Ruler instead of a local Alertmanager.)
      2. Yes, you can; that's the only way to run Prometheus in HA mode. Just don't forget to include external labels so that Thanos can deduplicate metrics. Also, HA mode will double network traffic and storage, which is something to keep in mind in terms of cost.
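
To make the external-labels point concrete, here is a hedged sketch of the configuration for one of two agent replicas. The label names, the cluster value and the Thanos Receive address are assumptions; the second replica would use the same file with a different replica value.

```yaml
# prometheus.yml for replica 0 of an HA pair (replica 1 differs only
# in the "replica" label). Label names and the URL are placeholders.
global:
  external_labels:
    cluster: eks-demo             # identical on both replicas
    replica: prometheus-agent-0   # unique per replica
remote_write:
  - url: http://thanos-receive.monitoring.svc:19291/api/v1/receive
```

Thanos Query can then be started with --query.replica-label=replica so that the two copies of each series are merged at query time. When the replicas are managed by the Prometheus Operator, it adds a per-replica prometheus_replica label by default, which can serve the same purpose.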

    • @m18unet
      @m18unet 1 year ago

      @@AntonPutra I understand very well. Thanks for your explanation 😊

  • @christianibiri
    @christianibiri 2 years ago +1

    Love your channel :)

  • @thiagoscodeler5152
    @thiagoscodeler5152 1 year ago

    Anton, I'm able to visualize metrics only for the monitoring namespace (not for my other namespaces). Do you have any clue? I'm using both Prometheus and Grafana Managed Services

    • @AntonPutra
      @AntonPutra  1 year ago +1

      First check the targets in Prometheus; you may need to update the Prometheus selectors.
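
For context, a sketch of the selectors in question, assuming Prometheus is deployed through the Prometheus Operator as a Prometheus custom resource. The resource name and namespace are placeholders. An empty selector ({}) matches everything, whereas an omitted namespace selector restricts discovery to the resource's own namespace, which would match the symptom of seeing only the monitoring namespace.

```yaml
# Prometheus custom resource fragment (Prometheus Operator assumed).
# name and namespace are placeholders.
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: agent
  namespace: monitoring
spec:
  serviceMonitorSelector: {}            # pick up all ServiceMonitors...
  serviceMonitorNamespaceSelector: {}   # ...from all namespaces
  podMonitorSelector: {}
  podMonitorNamespaceSelector: {}
```

The Prometheus service account also needs RBAC permissions to list services, endpoints and pods in those namespaces.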

    • @thiagoscodeler5152
      @thiagoscodeler5152 1 year ago

      @@AntonPutra I can now visualize other namespaces metrics. Only not seeing node-exporter service monitor on Prometheus UI. When I go to Service Discovery I can see some undefined targets.

    • @thiagoscodeler5152
      @thiagoscodeler5152 1 year ago

      I managed to fix that.

    • @AntonPutra
      @AntonPutra  1 year ago

      @@thiagoscodeler5152 cool

    • @thiagoscodeler5152
      @thiagoscodeler5152 1 year ago

      @@AntonPutra any idea why I can only visualize Kube State Metrics related to the "monitoring" namespace?

  • @gpj-qo9cb
    @gpj-qo9cb 2 years ago

    Can this configuration be used with your AWS EKS Fargate setup? Would additional setup be needed for Fargate?

    • @AntonPutra
      @AntonPutra  2 years ago +1

      Yes, you can use it to deploy the Prometheus agent and managed Prometheus (including IAM roles for service accounts). But DaemonSets are not supported on Fargate yet, so you won't be able to deploy cAdvisor or node exporter: github.com/aws/containers-roadmap/issues/971

  • @ziaurrehman4738
    @ziaurrehman4738 2 years ago

    Hey, do you plan to make a video on GCP managed Prometheus with GKE and instance groups?

    • @AntonPutra
      @AntonPutra  2 years ago +1

      Hi Zia, yes one more for sending alerts and then GCP stuff

    • @ziaurrehman4738
      @ziaurrehman4738 2 years ago

      @@AntonPutra perfect, thanks

  • @aleksanderfidelus
    @aleksanderfidelus 2 years ago

    Could all of this be done with terraform to not include manual steps with kubectl?

    • @AntonPutra
      @AntonPutra  2 years ago

      Of course, Aleksander, just use the kubectl Terraform provider (or Helm):
      registry.terraform.io/providers/gavinbunney/kubectl/latest/docs

  • @shehzadmohammed6269
    @shehzadmohammed6269 2 years ago

    I love you! Thank you for these videos~

    • @AntonPutra
      @AntonPutra  2 years ago

      Thanks Shehzad :)

  • @aybukecabuk06
    @aybukecabuk06 1 year ago

    Hi, how can I solve this error? The pod status is:
    prometheus-agent-0 1/2 CrashLoopBackOff 3 (4s ago) 63s
    and the log shows:
    ts=2023-02-10T07:48:26.216Z caller=main.go:1119 level=error err="error loading config from \"/etc/prometheus/config_out/prometheus.env.yaml\": one or more errors occurred while applying the new configuration (--config.file=\"/etc/prometheus/config_out/prometheus.env.yaml\")"

  • @pikachu3686
    @pikachu3686 10 months ago +1

    best video

    • @AntonPutra
      @AntonPutra  10 months ago

      thanks :)

  • @ladfloss
    @ladfloss 1 year ago

    Do you know how to do it using both AWS managed Grafana and managed Prometheus?

  • @SarithaKollipaka
    @SarithaKollipaka 1 year ago

    Hi, I have followed and executed every step, but I am unable to port-forward Prometheus.
    ts=2023-07-13T04:24:16.794Z caller=refresh.go:99 level=error component="discovery manager scrape" discovery=ec2 msg="Unable to refresh target groups" err="could not describe instances: UnauthorizedOperation: You are not authorized to perform this operation.
    \tstatus code: 403, request id: 741f3c18-1c8a-4616-aed3-ac11f80d2ffc"
    I got this error and I am unable to connect to localhost for Prometheus. Can you please help with that?

    • @AntonPutra
      @AntonPutra  1 year ago

      This looks like the error that AWS returns when you don't have permissions. It's most likely that you have misconfigured Prometheus and it does not have permissions to access AWS.
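
For reference, a hedged sketch of an EC2 service-discovery scrape job like the one that produces the error above. Prometheus calls the EC2 DescribeInstances API for this, so the IAM identity it runs under needs the ec2:DescribeInstances permission. The region, the tag used for filtering and the relabeling are assumptions; the job name follows the ec2-node-exporter job mentioned earlier in the comments.

```yaml
# Scrape job using EC2 service discovery.
# Requires ec2:DescribeInstances on the IAM role used by Prometheus.
# Region and the tag filter are placeholders.
scrape_configs:
  - job_name: ec2-node-exporter
    ec2_sd_configs:
      - region: us-east-1
        port: 9100                      # default node exporter port
        filters:
          - name: tag:node-exporter     # hypothetical instance tag
            values: ["true"]
    relabel_configs:
      # Use the instance's Name tag as the instance label.
      - source_labels: [__meta_ec2_tag_Name]
        target_label: instance
```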