Understanding Prometheus Histograms | Motivation and Concepts, Instrumentation, Querying in PromQL
- Published on Aug 5, 2024
- In this video, I explain Prometheus histograms (for now only the "classic" ones that have been in Prometheus for around a decade - I will make a separate video about the new "native" histograms once they are stable): What are histograms, why are they useful, how can you instrument your service code with histograms, how are histograms exposed as metrics to Prometheus, and how can we query them in PromQL to get quantiles/percentiles, heatmaps, request rates, or average request durations?
Link to the Prometheus histograms best practices:
prometheus.io/docs/practices/...
Also check out my other Prometheus training courses if you want to learn Prometheus in a structured way from the ground up:
training.promlabs.com/
Chapters:
00:00 Introduction
00:56 Motivation and histogram basics
01:22 Need to measure request durations / latency
01:37 Downsides of using event logging
01:56 Why a single gauge doesn't help us
02:28 Downsides of using Prometheus summary metrics
03:09 Prometheus histogram example for tracking request durations
04:32 How can we expose histograms as time series to Prometheus?
05:11 Cumulative histogram representation
05:40 The special "le" (less-than-or-equal) bucket upper bound label
06:12 Time series exposed from a histogram metric
07:30 Instrumentation - adding histograms to your code
07:44 Adding histograms without additional labels
09:07 Adding histograms with additional labels
10:06 Querying histograms with PromQL
10:44 Querying all bucket series of a histogram
11:23 Querying percentiles / quantiles using histogram_quantile()
13:32 Using rate() or increase() to limit a histogram to recent increases
14:42 Controlling the smoothing time window
15:05 Aggregating histograms and percentiles over label dimensions
17:58 Errors of quantile calculation and bucketing schemas
19:27 Showing histograms as a heatmap
20:36 Querying request rates using _count
20:57 Querying average request durations using _sum and _count
21:28 Outro & PromLabs Trainings
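To make the cumulative "le" representation and the quantile estimation from the chapters above a bit more concrete, here is a rough Python sketch. It is not Prometheus's actual implementation, just an approximation of the linear-interpolation idea behind histogram_quantile() for classic histograms. The bucket bounds and counts are made up, and the function names are hypothetical.

```python
import math
from itertools import accumulate


def to_cumulative(per_bucket_counts):
    """Turn per-bucket counts into cumulative counts, as exposed via "le"."""
    return list(accumulate(per_bucket_counts))


def estimate_quantile(q, buckets):
    """Estimate a quantile from (upper_bound, cumulative_count) pairs.

    Assumes 0 <= q <= 1, buckets sorted by upper bound, a final +Inf bucket,
    and non-negative observations (so the first bucket starts at 0).
    """
    if not buckets or not math.isinf(buckets[-1][0]):
        raise ValueError("the last bucket must have an upper bound of +Inf")
    total = buckets[-1][1]
    if total == 0:
        return math.nan

    rank = q * total
    # Find the first bucket whose cumulative count reaches the rank.
    index = next(i for i, (_, count) in enumerate(buckets) if count >= rank)
    upper, count = buckets[index]

    if math.isinf(upper):
        # The quantile falls into the +Inf bucket: the best we can do is
        # return the highest finite bucket bound.
        return buckets[-2][0] if len(buckets) > 1 else math.nan

    lower = buckets[index - 1][0] if index > 0 else 0.0
    prev_count = buckets[index - 1][1] if index > 0 else 0
    if count == prev_count:
        return upper
    # Assume observations are spread evenly within the bucket.
    return lower + (upper - lower) * (rank - prev_count) / (count - prev_count)


# Made-up example: 20 requests <= 0.1s, 50 more <= 0.5s, 20 more <= 1s, 10 slower.
bounds = [0.1, 0.5, 1.0, math.inf]
cumulative = to_cumulative([20, 50, 20, 10])
example_buckets = list(zip(bounds, cumulative))

median = estimate_quantile(0.5, example_buckets)
p99 = estimate_quantile(0.99, example_buckets)
print(cumulative, median, p99)
```

Because the estimate assumes an even spread within each bucket, its error depends on how wide the bucket containing the quantile is, which is why the choice of bucket boundaries matters.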
---------------------------------------------------------------------------
CREDITS: "Subscribe Button" by MrNumber112 • Free Download: Subscri... - Science & Technology
Hi, just want to say that I chanced upon Prometheus while trying to conduct a load test. This is a really interesting tool!
I deployed a postgres exporter to expose certain metrics from my Postgres instance. I ran into some issues, as there weren't many tutorials on how to do this in Azure managed Prometheus, but I managed to figure it out nonetheless.
I watched most of your videos in one go just a few hours ago, and they contain really useful knowledge, including the Grafana tips.
Thank you!
Amazing
Great video! How do I display histograms in Grafana?
Good question, I'll make a quick follow-up video about using heatmaps in Grafana soon!
Histograms from the horse's mouth ❤, understood and loved it! Thank you! Any code examples of enabling this for Python and Java?
Thanks! You can find Python examples at prometheus.github.io/client_python/instrumenting/histogram/ and Java at prometheus.github.io/client_java/getting-started/metric-types/#histogram
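For a quick impression of the Python side, a minimal sketch using the prometheus_client library could look roughly like this. The metric names, the "path" label, and the bucket bounds are made up for illustration, and a separate registry is used only so that the example is self-contained.

```python
from prometheus_client import CollectorRegistry, Histogram, generate_latest

# A private registry keeps this sketch separate from the global default one.
registry = CollectorRegistry()

# Histogram without additional labels; the bucket bounds are illustrative only.
REQUEST_DURATION = Histogram(
    "demo_request_duration_seconds",
    "Duration of demo requests in seconds.",
    buckets=(0.1, 0.5, 1.0),
    registry=registry,
)

# Histogram with an additional (hypothetical) "path" label.
REQUEST_DURATION_BY_PATH = Histogram(
    "demo_request_duration_by_path_seconds",
    "Duration of demo requests in seconds, by path.",
    labelnames=["path"],
    buckets=(0.1, 0.5, 1.0),
    registry=registry,
)

# Record three made-up request durations directly.
for seconds in (0.05, 0.3, 0.5):
    REQUEST_DURATION.observe(seconds)

# Or let the client time a block of code and observe the elapsed seconds.
with REQUEST_DURATION_BY_PATH.labels(path="/api").time():
    pass  # the code being timed would go here

# Render the text exposition format that Prometheus would scrape.
exposition = generate_latest(registry).decode("utf-8")
print(exposition)
```

The printed output contains the cumulative _bucket series with their "le" labels, plus the _sum and _count series for each histogram.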
Great playlist! However, it would be helpful if you could fix the order of the playlist so that it goes from getting started to more advanced topics like histograms, counter rates, etc.
Thanks for the suggestion, I reordered the videos a bit. They're not really meant to be building on each other in any particular order or curriculum, so not sure if the new order is that much better :)
Hi bro, how do I back up and restore Prometheus data?
You can use the snapshot API (prometheus.io/docs/prometheus/latest/querying/api/#snapshot) to create a snapshot of the entire TSDB that can be copied or backed up elsewhere. Just drop it back in place as Prometheus' data directory when you want to restore it. Triggering a snapshot via the API requires the admin API to be enabled via the "--web.enable-admin-api" command line flag.
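As a rough sketch, triggering a snapshot from Python with only the standard library might look like this. The localhost:9090 address is an assumption about where your Prometheus server listens, and the helper function names are made up for this example. The endpoint path and the response shape follow the snapshot API documentation linked above.

```python
import json
import urllib.request


def build_snapshot_request(base_url):
    """Build the POST request for the TSDB snapshot admin endpoint."""
    url = base_url.rstrip("/") + "/api/v1/admin/tsdb/snapshot"
    return urllib.request.Request(url, data=b"", method="POST")


def parse_snapshot_name(response_body):
    """Extract the snapshot directory name from the JSON response."""
    payload = json.loads(response_body)
    if payload.get("status") != "success":
        raise RuntimeError(f"snapshot failed: {payload}")
    return payload["data"]["name"]


def trigger_snapshot(base_url="http://localhost:9090"):
    """Ask Prometheus to create a snapshot and return its name.

    Requires Prometheus to run with --web.enable-admin-api.
    """
    with urllib.request.urlopen(build_snapshot_request(base_url)) as response:
        return parse_snapshot_name(response.read())


# Usage (needs a running Prometheus server, so it is not executed here):
#   name = trigger_snapshot()
#   print("snapshot created under <data-dir>/snapshots/" + name)
```

The returned name is the directory created under snapshots/ inside the Prometheus data directory, which is what you would copy to your backup location.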