# [[Delta temporality]]

![[Delta temporality.svg]]

**Delta temporality** is a way of reporting metrics in which each data point represents the *change* (the delta) in a value since the last reported measurement, rather than the running total. It is one of two temporalities defined by [[OpenTelemetry]] for additive metrics such as counters and histograms; the other is [[Cumulative temporality]]. [^hartmann]

In practice:

- Under delta temporality, a counter might report `+12`, then `+7`, then `+3` on successive pushes: each value covers only what has happened since the previous report.
- Under cumulative temporality, the same counter, having already reported `100`, then reports `112`, then `119`, then `122`: each value is the absolute running total.

## Alternatives

The main alternative is [[Cumulative temporality]], which is what [[Prometheus]] uses natively. In a cumulative model, counters only ever go up (until a reset), and the delta between two points is computed by the query engine, not by the producer. [[PromQL]] functions like `rate()` and `increase()` exist precisely to derive deltas from cumulative values at query time.

Because temporality is chosen on the producer side, a pipeline can also *convert* between the two. For example, a collector or backend can translate delta OTLP metrics into cumulative form before storing them in a Prometheus-compatible store.

## Why it's a big deal

Delta temporality matters because it is a first-class temporality in [[OpenTelemetry]], one that OTel SDKs and exporters can be configured to emit, whereas [[Prometheus]], along with Prometheus-compatible backends like [[Grafana Mimir|Mimir]], is built around cumulative temporality. The mismatch is one of the biggest remaining friction points between the two ecosystems.
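To make the mismatch concrete, the two reporting styles and the conversion between them can be sketched in a few lines of Python. This is an illustrative sketch only: the function names `deltas_to_cumulative` and `cumulative_to_deltas` are hypothetical, and the reset handling is a simplifying assumption (a decrease is treated as a counter reset, loosely in the spirit of `increase()`), not a description of how any particular collector or backend performs the translation.

```python
from typing import List, Sequence


def deltas_to_cumulative(deltas: Sequence[float], start: float = 0) -> List[float]:
    """Fold a stream of delta points into running totals.

    `start` is the total already accumulated before the first delta
    (an assumption of this sketch; a real pipeline has to track it as state).
    """
    total = start
    totals = []
    for delta in deltas:
        total += delta
        totals.append(total)
    return totals


def cumulative_to_deltas(points: Sequence[float]) -> List[float]:
    """Derive per-interval increases from cumulative points.

    Simplifying assumption: if a value is lower than its predecessor,
    the counter is assumed to have reset to zero, so the new value itself
    is taken as the increase for that interval.
    """
    increases = []
    for previous, current in zip(points, points[1:]):
        if current >= previous:
            increases.append(current - previous)
        else:
            increases.append(current)  # assumed counter reset
    return increases


# The example from the top of this note: deltas +12, +7, +3 on top of 100.
print(deltas_to_cumulative([12, 7, 3], start=100))
print(cumulative_to_deltas([100, 112, 119, 122]))
```

Note the asymmetry: going from delta to cumulative needs state (the running total) held somewhere in the pipeline, while going from cumulative to delta only needs two neighbouring points.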
### Resilience and data loss

Temporality choice has real consequences for how resilient your metrics are to failure: [^hartmann]

- **Delta temporality is push-first by design.** If a push containing a delta is lost to a network hiccup or dropped by the receiver, or if the producer crashes before flushing, that increment is gone forever. You can't reconstruct it from later data points because later points only describe *their* window.
- **Cumulative temporality degrades gracefully.** Because every point is an absolute value, losing some scrapes just means lower resolution: the next successful scrape still carries the full running total, and `rate()` / `increase()` can be computed across the gap. This is one of the core reasons [[Push-based monitoring vs. pull-based monitoring|pull-based monitoring]] is argued to be the more robust model.

### The "lost spikes" counter-argument

A common defence of delta (and of push-based systems generally) is that pull-based, cumulative systems can miss spikes between scrapes. [[RichiH Hartmann|Richi Hartmann]]'s response: Prometheus deals in absolutes, so while a brief spike may not be visible as an instantaneous peak, the work it caused is still reflected in the counter's running total at the next scrape. [^hartmann] Delta systems, on the other hand, can lose that work entirely if the push carrying it is dropped.

### Ecosystem implications

- [[OpenTelemetry]] treats delta as a first-class temporality, and some SDK and exporter configurations emit it, so a team standardising on OTel may well be producing delta-temporality metrics.
- [[Prometheus]] and Prometheus Remote Write are built around cumulative semantics. Sending delta metrics into them requires conversion.
- As of [[GrafanaCON 2026 - Mimir Community Call|Mimir 3.0]], Mimir's OTLP ingestion still treats **delta temporality as experimental / not fully supported**, while cumulative OTLP (resource attributes, exponential histograms, explicit bucket histograms, start timestamps) is supported.
  [[GLDBO 2025 - State of the Databases|At the 2025 Databases Offsite]], the upstream team reported "delta temporality is on its way": still in progress, not done.
- In the [[2025-05-21 OpenTelemetry Guild]] meeting, [[Arthur Silva Sens|Arthur]] asked whether delta support was more important than UTF-8 support, and [[Cyrille Le Clerc|Cyrille]] said yes, a useful signal of how high it sits on the priority list.

So: delta temporality is a small-sounding semantics choice that turns out to be a load-bearing piece of the OTel ↔ Prometheus interop story, and a big part of why "just use OTel for metrics" is not yet a drop-in experience on Prometheus-shaped backends.

## See also

- [[Cumulative temporality]]
- [[Push-based monitoring vs. pull-based monitoring]]
- [[OpenTelemetry]]
- [[Prometheus]]
- [[Prometheus Remote Write]]

[^hartmann]: Hartmann, R. (2024). *Prometheus background and basics*. [[2024-07-10 Developer Advocacy Weekly|Internal meeting]].