Example of multi-dimensional alerts on time series data
This example shows how a single alert rule can generate multiple alert instances — one for each label set (or time series). This is called multi-dimensional alerting: one alert rule, many alert instances.
In Prometheus, each unique combination of labels defines a distinct time series. Grafana Alerting uses the same model: each label set is evaluated independently, and a separate alert instance is created for each series.
This pattern is common in dynamic environments when monitoring a group of components like multiple CPUs, containers, or per-host availability. Instead of defining individual alert rules or aggregated alerts, you alert on each dimension — so you can detect particular issues and include that level of detail in notifications.
For example, a query returns one series per CPU:
cpu label value | CPU percent usage |
---|---|
cpu-0 | 95 |
cpu-1 | 30 |
cpu-2 | 85 |
With a threshold of > 80, this would trigger two alert instances: one for cpu-0 and one for cpu-2.
Examples overview
Imagine you want to trigger alerts when CPU usage goes above 80%, and you want to track each CPU core independently.
You can use a Prometheus query like this:
sum by(cpu) (
rate(node_cpu_seconds_total{mode!="idle"}[1m])
)
This query returns the active CPU usage rate per CPU core, averaged over the past minute.
CPU core | Active usage rate |
---|---|
cpu-0 | 95 |
cpu-1 | 30 |
cpu-2 | 85 |
This produces one series for each existing CPU.
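To make the aggregation concrete, here is a minimal Python sketch of what the query computes, assuming made-up counter samples taken at the start and end of a 60-second window. The sample values and the helper name active_rate_by_cpu are hypothetical, and the arithmetic is a simplification: the real Prometheus rate() function also handles counter resets and extrapolates to the window boundaries. The result is a fraction of one CPU second per second, so the tables in this example correspond to that fraction expressed as a percentage.

```python
from collections import defaultdict

WINDOW_SECONDS = 60  # corresponds to the [1m] range selector

# Hypothetical counter samples:
# (cpu, mode) -> (value at window start, value at window end)
samples = {
    ("cpu-0", "user"): (1000.0, 1045.0),
    ("cpu-0", "system"): (500.0, 512.0),
    ("cpu-0", "idle"): (9000.0, 9003.0),
    ("cpu-1", "user"): (800.0, 812.0),
    ("cpu-1", "system"): (300.0, 306.0),
    ("cpu-1", "idle"): (9500.0, 9542.0),
}


def active_rate_by_cpu(samples, window):
    """Sum the per-second increase of every non-idle mode, grouped by cpu."""
    totals = defaultdict(float)
    for (cpu, mode), (start, end) in samples.items():
        if mode == "idle":  # mirrors the mode!="idle" matcher
            continue
        totals[cpu] += (end - start) / window
    return dict(totals)


rates = active_rate_by_cpu(samples, WINDOW_SECONDS)
for cpu, rate in rates.items():
    print(f"{cpu}: {rate * 100:.0f}% active")
```

Each key in rates plays the role of one returned series: the grouping by cpu is what keeps the cores separate instead of collapsing them into a single total.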
When Grafana Alerting evaluates the query, it creates an individual alert instance for each returned series.
Alert instance | Value |
---|---|
{cpu="cpu-0"} | 95 |
{cpu="cpu-1"} | 30 |
{cpu="cpu-2"} | 85 |
With a threshold condition like $A > 80, Grafana evaluates each instance separately and fires alerts only where the condition is met:
Alert instance | Value | State |
---|---|---|
{cpu="cpu-0"} | 95 | Firing |
{cpu="cpu-1"} | 30 | Normal |
{cpu="cpu-2"} | 85 | Firing |
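The per-instance evaluation in the table can be sketched in a few lines of Python. This is an illustration of the model only, not Grafana's implementation: the data structure and the evaluate function are hypothetical, and the sketch ignores details such as pending periods and other alert states.

```python
THRESHOLD = 80  # mirrors the condition $A > 80

# Illustrative query result: one (label set, value) pair per series.
query_result = [
    ({"cpu": "cpu-0"}, 95),
    ({"cpu": "cpu-1"}, 30),
    ({"cpu": "cpu-2"}, 85),
]


def evaluate(result, threshold):
    """Return one (labels, value, state) tuple per series."""
    return [
        (labels, value, "Firing" if value > threshold else "Normal")
        for labels, value in result
    ]


instances = evaluate(query_result, THRESHOLD)
for labels, value, state in instances:
    print(labels, value, state)
```

The point of the sketch is that the condition is applied once per label set, so the state of one instance never depends on the value of another.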
Multi-dimensional alerts help you surface issues on individual components—problems that might be missed when alerting on aggregated data (like total CPU usage).
Each alert instance targets a specific component, identified by its unique label set. This makes alerts more specific and actionable. For example, you can set a summary annotation in your alert rule that identifies the affected CPU:
High CPU usage on {{$labels.cpu}}
In the previous example, the two firing alert instances would display summaries indicating the affected CPUs:
- High CPU usage on cpu-0
- High CPU usage on cpu-2
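As a rough illustration of how one annotation template yields a different summary per instance, the following Python sketch substitutes label values into the template text. Grafana renders annotations with its own template engine; the render function and its regular expression are hypothetical stand-ins that only handle the {{$labels.name}} form used in this example.

```python
import re

# Matches {{$labels.<name>}}, allowing optional spaces inside the braces.
LABEL_PATTERN = re.compile(r"\{\{\s*\$labels\.(\w+)\s*\}\}")


def render(template, labels):
    """Replace each {{$labels.<name>}} placeholder with that label's value."""
    return LABEL_PATTERN.sub(lambda m: labels.get(m.group(1), ""), template)


template = "High CPU usage on {{$labels.cpu}}"

# Label sets of the two firing instances from the example.
firing = [{"cpu": "cpu-0"}, {"cpu": "cpu-2"}]

summaries = [render(template, labels) for labels in firing]
for summary in summaries:
    print(summary)
```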
Try it with TestData
You can quickly experiment with multi-dimensional alerts using the TestData data source, which can generate multiple random time series.
Add the TestData data source through the Connections menu.
Go to Alerting and create an alert rule.
Select TestData as the data source.
Configure the TestData scenario
- Scenario: Random Walk
- Series count: 3
- Start value: 70, Max: 100
- Labels: cpu=cpu-$seriesIndex
Reduce time series data for comparison
The example returns three time series, like those shown above, with values across the selected time range.
To alert on each series, you need to reduce the time series to a single value that the alert condition can evaluate and determine the alert instance state.
Grafana Alerting provides several ways to reduce time series data:
- Data source query functions. The earlier example used the Prometheus sum function to sum the rate results by cpu, producing a single value per CPU core.
- Reduce expression. In the query and condition section, Grafana provides the Reduce expression to aggregate time series data.
  - In Default mode, the When input selects a reducer (like last, mean, or min), and the threshold compares that reduced value.
  - In Advanced mode, you can add the Reduce expression (e.g., last(), mean()) before defining the threshold (alert condition).
For demo purposes, this example uses the Advanced mode with a Reduce expression:
Toggle Advanced mode in the top right section of the query panel to enable adding additional expressions.
Add the Reduce expression using a function like mean() to reduce each time series to a single value.
Define the alert condition using a Threshold like $reducer > 80
Click Preview to evaluate the alert rule.
The alert condition evaluates the reduced value for each alert instance and shows whether each instance is Firing or Normal.
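The reduce-then-threshold flow above can be sketched in Python as follows. The data points are invented for illustration (the Random Walk scenario produces different values on every run), and the dictionary layout is an assumption made for the sketch, not how Grafana stores series.

```python
from statistics import mean

THRESHOLD = 80  # mirrors the condition $reducer > 80

# Made-up data points for three series over the selected time range.
series = {
    "cpu-0": [88, 92, 97],
    "cpu-1": [72, 75, 70],
    "cpu-2": [79, 84, 86],
}

# Reduce step: collapse each time series to a single value, like mean().
reduced = {cpu: mean(values) for cpu, values in series.items()}

# Threshold step: compare each reduced value with the threshold.
states = {
    cpu: "Firing" if value > THRESHOLD else "Normal"
    for cpu, value in reduced.items()
}

for cpu in series:
    print(f"{cpu}: mean={reduced[cpu]:.1f} state={states[cpu]}")
```

Swapping mean for another reducer, such as taking the last element of each list, changes which value is compared but not the per-series structure of the evaluation.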
Learn more
This example shows how Grafana Alerting implements a multi-dimensional alerting model (one rule, many alert instances) and why reducing time series data to a single value is required for evaluation.
For additional learning resources, check out: