Pillar 1 · ESG

Measurable AI: kWh, kg CO₂e, and £ per workflow.

Most AI sustainability claims live or die on the same question: where did the number come from? This page is the practical answer: the methods, the factors, and the tooling that produces evidence an auditor can trace.

The claim

For AI workloads running on infrastructure you control, every link in the carbon chain (tokens, compute, kilowatt-hours, grid factor, CO₂e, cost) can be measured per workflow. Not modelled. Not estimated from a vendor's median prompt. Measured, with the meter readings, runtime logs, and emission factors all sitting in one place.

For AI workloads running on someone else's infrastructure, four of those six steps are private vendor metrics. You can produce a plausible-looking row in a sustainability report by multiplying the bookends (tokens and cost) by a published average. You can't trace it.

Vendor averages are not primary data. The GHG Protocol is explicit: primary data from the reporting entity's own operations takes precedence over supplier averages. The full version of this argument is in "The ESG story cloud AI can't tell". This page is the toolkit underneath it.

The six-step chain

The reporting entity's job is to log every link, with a timestamp and a method.

  1. Tokens. Runtime log (vLLM, Ollama, llama.cpp, managed platform).
  2. Compute. GPU seconds, also in the runtime log.
  3. Kilowatt-hours. IPMI, RAPL, GPU sensors, smart PDU, or rack-level meter. Sometimes a calibrated estimate; always labelled as such.
  4. Grid-emission factor. Your contractual figure if you have a green tariff or PPA, otherwise the location-based figure for the jurisdiction and period.
  5. CO₂e. Derived from the two above: kWh × grid-emission factor. Provenance logged with the figure.
  6. Cost. Your electricity rate × kWh, plus depreciation if you're being thorough.

Every step has a method. Every method has an accuracy band. Both get reported.
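
As a minimal sketch of what "log every link" can look like, the record below keeps the measured inputs and their provenance together and derives CO₂e and cost from them. The class name, field names, and example values are assumptions for illustration; this is not the Portal's schema. Depreciation is left out.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class ChainRecord:
    """One workflow run: every link of the chain plus how it was obtained."""
    timestamp: datetime
    tokens: int               # step 1, from the runtime log
    gpu_seconds: float        # step 2, from the runtime log
    kwh: float                # step 3, metered or estimated
    kwh_method: str           # e.g. "IPMI", "RAPL", "estimation"
    grid_factor: float        # step 4, kg CO2e per kWh
    grid_factor_source: str   # provenance of the factor
    rate_gbp_per_kwh: float   # electricity rate used for step 6

    @property
    def kg_co2e(self) -> float:
        # Step 5: derived from steps 3 and 4.
        return self.kwh * self.grid_factor

    @property
    def cost_gbp(self) -> float:
        # Step 6: electricity only; depreciation is not modelled here.
        return self.kwh * self.rate_gbp_per_kwh


# Illustrative values only; the factor and rate are the ones disclosed on this page.
record = ChainRecord(
    timestamp=datetime(2024, 1, 1, tzinfo=timezone.utc),
    tokens=12_000,
    gpu_seconds=90.0,
    kwh=0.5,
    kwh_method="IPMI",
    grid_factor=0.233,
    grid_factor_source="DEFRA UK grid 2023 average",
    rate_gbp_per_kwh=0.30,
)
print(record.kg_co2e, record.cost_gbp)
```

Keeping the method and factor source on the same record as the number is the point: the derived figures can always be recomputed from what was logged.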

Measurement methods we use

Eight ways to get a watt reading off a server. They are not equivalent. The accuracy label travels with the number.

Power measurement methods, what each measures, and accuracy band.
Method | Measures | Accuracy
IPMI | Whole-system draw, baseboard sensor | Primary, vendor-calibrated
Intel RAPL | CPU package + DRAM | Primary, silicon counter
Turbostat | Per-core CPU package power | Primary, derived from RAPL
HWMON / lm-sensors | Board sensors, fan, voltage rails | Secondary; coverage varies by board
LibreHardwareMonitor | Windows-side cross-vendor sensors | Secondary; same caveat
Powermetrics (macOS) | SoC + GPU on Apple Silicon | Primary, Apple-published counter
GPU sensor queries | NVIDIA / AMD board power, util %, VRAM | Primary, vendor counter
Estimation | Calibrated formula (e.g. 40 W base + 10 W/core × CPU% + 3 W per 8 GB RAM) | Tertiary; flagged in every report
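
The example estimation formula in the last row can be sketched as follows. The coefficients come from the table; reading "10 W/core × CPU%" as cores × utilisation fraction is an assumption, and the function name and return shape are hypothetical.

```python
def estimate_watts(cores: int, cpu_utilisation: float, ram_gb: float) -> dict:
    """Tertiary estimate: 40 W base + 10 W/core x CPU% + 3 W per 8 GB RAM.

    cpu_utilisation is a fraction between 0.0 and 1.0 (assumed interpretation).
    """
    if not 0.0 <= cpu_utilisation <= 1.0:
        raise ValueError("cpu_utilisation must be between 0.0 and 1.0")
    base_w = 40.0
    cpu_w = 10.0 * cores * cpu_utilisation
    ram_w = 3.0 * (ram_gb / 8.0)
    # The method and accuracy label travel with the number.
    return {
        "watts": base_w + cpu_w + ram_w,
        "method": "estimation",
        "accuracy": "tertiary",
    }


# 8 cores at 50% with 32 GB RAM: 40 + 40 + 12 = 92 W
print(estimate_watts(cores=8, cpu_utilisation=0.5, ram_gb=32))
```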

VM power is allocated from the host proportionally to CPU usage. The allocation method is logged alongside the figure.
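
A minimal sketch of proportional allocation, assuming the whole host reading is split across the VMs by their share of CPU usage and that hypervisor overhead is not separated out. The function name and input shape are hypothetical.

```python
def allocate_vm_power(host_watts: float, vm_cpu_usage: dict[str, float]) -> dict[str, float]:
    """Split a host power reading across VMs in proportion to CPU usage.

    vm_cpu_usage maps a VM name to its CPU usage in any consistent unit
    (percent, CPU-seconds); only the ratios matter.
    """
    total = sum(vm_cpu_usage.values())
    if total == 0:
        # No CPU activity recorded: nothing to apportion.
        return {vm: 0.0 for vm in vm_cpu_usage}
    return {vm: host_watts * usage / total for vm, usage in vm_cpu_usage.items()}


# A 300 W host with two VMs at 30% and 10% CPU
print(allocate_vm_power(300.0, {"vm-a": 30.0, "vm-b": 10.0}))
# {'vm-a': 225.0, 'vm-b': 75.0}
```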

Disclosed factors

Every CO₂e and £ figure on this site uses one of these. Source, version, and date all sit on the page next to the number.

  • UK grid carbon intensity: 0.233 kg CO₂e/kWh, DEFRA UK grid 2023 average. (For long-form Scope 2 reporting we also surface the latest DESNZ factor alongside it.)
  • UK electricity rate: £0.30/kWh, Ofgem UK 2024 average.
  • Annual projection: 24-hour rolling average power (kW) × 8,760 hours. Labelled as a projection, not an actual.

If you operate on a green tariff or a PPA, your contractual factor replaces ours. The methodology is the same; only the number changes.
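
The arithmetic behind these factors, as a sketch. The default values are the ones disclosed above; the function name, parameter names, and return shape are assumptions. A contractual factor or rate is passed in place of the defaults.

```python
HOURS_PER_YEAR = 8_760
UK_GRID_KG_CO2E_PER_KWH = 0.233   # disclosed UK grid factor
UK_RATE_GBP_PER_KWH = 0.30        # disclosed UK electricity rate


def annual_projection(
    avg_kw_24h: float,
    grid_factor: float = UK_GRID_KG_CO2E_PER_KWH,
    rate_gbp_per_kwh: float = UK_RATE_GBP_PER_KWH,
) -> dict:
    """Project a 24-hour rolling average draw (kW) over a full year."""
    kwh = avg_kw_24h * HOURS_PER_YEAR
    return {
        "kwh": kwh,
        "kg_co2e": kwh * grid_factor,
        "cost_gbp": kwh * rate_gbp_per_kwh,
        "label": "projection",  # never reported as an actual
    }


default = annual_projection(0.5)
# Same method, different number: a hypothetical contractual factor of 0.0
green = annual_projection(0.5, grid_factor=0.0)
print(default, green)
```

A node averaging 0.5 kW projects to 4,380 kWh a year, which is about 1,020.5 kg CO₂e and £1,314 at the default factors.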

The Horizon Portal: proof point, not a slide deck

The Portal is the production system that ships every claim above. It runs across our customers' infrastructure and reports the same numbers we'd put in front of an auditor.

Per-agent power telemetry

Live wattage per node, with a CPU / RAM / GPU breakdown. Method-accuracy badges so the auditor (or you) can see at a glance where the number came from: green for IPMI/RAPL, amber for mixed methods, red for estimation.
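
The badge rule can be sketched as a small function. Green for IPMI/RAPL, amber for mixed, and red for estimation come from the description above. How the Portal classifies other single methods is not stated here, so falling back to amber is an assumption, as are the function name and method labels.

```python
PRIMARY_METHODS = {"IPMI", "RAPL"}


def accuracy_badge(methods: set[str]) -> str:
    """Map the set of measurement methods behind a reading to a badge colour."""
    if not methods:
        raise ValueError("a reading must record at least one method")
    if methods == {"estimation"}:
        return "red"
    if methods <= PRIMARY_METHODS:
        return "green"
    # Mixed methods, or a method this sketch does not classify (assumption).
    return "amber"


print(accuracy_badge({"IPMI"}))                # green
print(accuracy_badge({"IPMI", "estimation"}))  # amber
print(accuracy_badge({"estimation"}))          # red
```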

24-hour and annual rollups

Fleet kWh, kg CO₂e, and GBP cost on a 24-hour rolling window, with annual projections and per-location breakdowns. Same numbers across the dashboard, the API, and the PDF report.

Branded ESG & Power report

Per-endpoint detail, GPU detail, VM allocation, methodology footer. The kind of artefact a sustainability team can hand to assurance without an apology.

Period comparison & cost allocation

Week-over-week and month-over-month deltas for kWh, kg CO₂e, and £. Cost allocation across companies based on actual consumption, useful for MSPs reporting per-customer impact.
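
A minimal sketch of both calculations, assuming per-period totals and per-company kWh are already available. The function names and inputs are hypothetical; the default rate is the disclosed £0.30/kWh.

```python
from typing import Optional


def period_delta(current: float, previous: float) -> dict:
    """Absolute and percentage change between two periods (kWh, kg CO2e, or GBP)."""
    absolute = current - previous
    # Percentage change is undefined when the previous period was zero.
    percent: Optional[float] = None if previous == 0 else absolute / previous * 100.0
    return {"absolute": absolute, "percent": percent}


def allocate_cost(kwh_by_company: dict[str, float], rate_gbp_per_kwh: float = 0.30) -> dict[str, float]:
    """Charge each company for the kWh it actually consumed."""
    return {company: kwh * rate_gbp_per_kwh for company, kwh in kwh_by_company.items()}


print(period_delta(current=110.0, previous=100.0))
print(allocate_cost({"customer-a": 100.0, "customer-b": 50.0}))
```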

Honest about the limits

  • Measurable is not low-carbon. Putting a meter on a workflow doesn't reduce its kWh. The grid mix and the runtime are what move the number. We tell you what the number is; you decide whether it's acceptable.
  • Estimation still has a place. Not every server has IPMI; not every laptop has powermetrics. When a calibrated estimate is the best available signal, we use it, and we flag it. A clearly-labelled estimate beats an unlabelled measurement.
  • Hybrid is usually right. Self-hosted closes the chain on routine, sensitive, predictable workloads. Frontier capabilities still come from cloud providers. Both can sit in the same architecture, with separate data-quality disclosures for each.
  • Audit-grade ≠ assurance-grade. CSRD requires external assurance of sustainability disclosures (limited assurance at the outset), and SECR figures sit in the annual report; neither is satisfied by a stamp. Primary data is the substrate that assurance can rest on. The judgement still belongs to the auditor.
