WORKING PAPER · VERSION 1.0 · JULY 2026 · A live inquiry, released in versions as evidence accrues. Comments and case evidence welcome: info@impactthinking.co.uk
Modern institutions run on a quiet epistemology: what is real is what can be measured, and what matters is what moves a metric. The doctrine has genuine achievements and one structural blind spot — the variables that most determine institutional performance are precisely the ones that resist measurement. Trust, judgment, coordination quality, the safety to speak: each is load-bearing, each is largely illegible to dashboards, and each is therefore systematically discounted in decisions that are, by policy, evidence-based. This working paper examines the mechanism by which metric-governed institutions optimise away what they cannot see — Goodhart’s law and its relatives are only the beginning — and the bill that arrives later as unexplained variance: the transformation that had every milestone green and did not transform; the team that hit every target and lost every person worth keeping. We propose managing the unmeasurable as a named class, and set out what we are tracking.
“What gets measured gets managed” is quoted as wisdom. Read carefully, it is a confession, because in practice its inverse governs equally: what cannot be measured does not get managed, and in institutions where resources follow metrics, what does not get managed gets starved. The doctrine does not merely neglect the unmeasurable; it actively reallocates away from it, continuously, as a matter of good governance.
The pathologies of measurement itself are well documented. Goodhart’s law (in Strathern’s formulation, when a measure becomes a target it ceases to be a good measure [1]) and Campbell’s parallel law on the corruption of social indicators [2] describe how targets deform the behaviour they monitor. Kerr’s classic on rewarding A while hoping for B describes how incentive systems reliably purchase the proxy rather than the purpose [3]. Muller’s survey of metric fixation catalogues the institutional cost across medicine, education, policing and business [4]. Scott supplied the deep account: states and large organisations must render the world legible to govern it, and legibility is achieved by simplification, that is, by discarding exactly the local, tacit, relational information that makes systems actually work [5].
This paper’s concern is one step further on. The literature above mostly treats measurement’s distortion of the measurable. Our concern is its displacement of the unmeasurable — the fate of the variables that never make it onto the dashboard at all.
Consider the variables this desk’s other papers treat in depth. Trust, which sets the transaction cost of all coordination (White Paper 04), has no line on any management account; its costs surface as “slow decision-making” and “silo behaviour,” are attributed to structure, and are answered with reorganisation, expensively and repeatedly. Judgment, the capacity that determines performance precisely when the metrics’ historical basis fails (Working Paper 02), appears in no capability inventory. Voice, the early-warning system whose absence writes inquiry reports (White Paper 06), is proxied by an engagement score that measures its reputation rather than its price. Coordination quality, the difference between a team that is a system and a team that is a list of individuals, is visible only in its failures, which are attributed to individuals.
Each of these shares three properties: it is causally upstream (it sets the productivity of everything measured downstream); it is slow (it degrades over quarters and rebuilds over years, outside every reporting cycle); and it is illegible (its state must be judged, not read off). The combination is fatal under metric governance: an upstream, slow, illegible variable is exactly what a dashboard-run institution will liquidate first — not by decision, but by a thousand locally rational reallocations toward things that move numbers this quarter. The liquidation is invisible while it happens and expensive when it completes: the bill arrives as unexplained variance — the initiative that had every indicator green and failed anyway, the unit that hit every target while its capability hollowed — and, being unexplained, is attributed to execution, leadership change, or luck, and the cycle continues.
The answer is not less measurement — the doctrine’s achievements are real — and it is emphatically not the fake answer of measuring the unmeasurable with pseudo-metrics, which merely feeds the proxies to Goodhart. The answer, we propose, is institutional: recognise the load-bearing invisibles as a named class, governed differently. Concretely: an explicit register of the institution’s critical unmeasurables, owned at executive level, so their invisibility is at least no longer unofficial; assessment by structured judgment where measurement fails — dimensional scans, calibrated review, the surprise audits of White Paper 06 — treated as first-class evidence rather than anecdote; decision rules that require the invisibles to be spoken to in any major reallocation (“what does this do to trust, to judgment formation, to voice?”) precisely because no number will volunteer the answer; and post-hoc attribution discipline — every significant unexplained failure examined for invisible-variable causation before the execution story is accepted.
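The register and the decision rule described above can be sketched as a small data structure and a completeness check. This is a minimal illustration under stated assumptions, not a prescribed implementation: the class names, field names, owner titles and the example paper are all hypothetical, and the check only confirms that each registered invisible has been spoken to in words. It deliberately produces no score, since scoring would be the pseudo-metric the paper warns against.

```python
from dataclasses import dataclass, field


@dataclass
class Invisible:
    """One register entry: a critical variable that is judged, not measured."""
    name: str             # e.g. "trust", "judgment formation", "voice"
    executive_owner: str  # hypothetical role title for the named owner
    assessment: str       # structured-judgment method used in place of a metric


@dataclass
class ReallocationPaper:
    """A major reallocation proposal with its judgment-based impact notes."""
    title: str
    # Maps an invisible's name to the written note addressing it.
    impact_notes: dict[str, str] = field(default_factory=dict)


def unaddressed(register: list[Invisible], paper: ReallocationPaper) -> list[str]:
    """Return the registered invisibles the paper has not spoken to.

    A missing, blank or whitespace-only note counts as not addressed.
    """
    return [
        inv.name
        for inv in register
        if not paper.impact_notes.get(inv.name, "").strip()
    ]


# Illustrative register; owners and methods are placeholders.
register = [
    Invisible("trust", "COO", "dimensional scan"),
    Invisible("judgment formation", "CPO", "calibrated review"),
    Invisible("voice", "CEO", "surprise audit"),
]

# Illustrative paper that addresses trust but leaves two invisibles unspoken.
paper = ReallocationPaper(
    title="Consolidate regional support teams",
    impact_notes={
        "trust": "Expected to fall in the two merged teams; owner to reassess.",
        "voice": "   ",
    },
)

missing = unaddressed(register, paper)
print(missing)  # ['judgment formation', 'voice']
```

The design choice worth noting is that the gate is procedural rather than quantitative: it can tell a decision forum that a question went unasked, but the quality of the answer remains a matter for the calibrated review.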
Two documented cases show the displacement cycle completing at scale — one commercial, one public, both with the defining signature: excellent numbers, right up until the failure they were concealing.
Wells Fargo’s cross-selling regime made a single metric, products per customer, the organising fact of retail-banking life, backed by targets and league tables. The measurable flourished; the invisibles (voice, integrity of the sales conversation, trust between frontline and management) were starved and gamed in exactly Kerr’s pattern. The eventual accounting included millions of unauthorised accounts, roughly $3 billion in penalties to resolve the federal investigations, unprecedented regulatory constraint, and a decade of franchise damage [8]: costs that had never appeared on any dashboard, because the variables carrying them had no dashboard to appear on.
The English public-sector targets regime of the 2000s produced the same anatomy without the fraud statute: Bevan and Hood’s studies documented how headline targets (ambulance response times among them) were met partly through reclassification and effort-shifting away from everything untargeted. The measured improved, the unmeasured paid, and the system’s formal indicators were structurally incapable of reporting the transfer [7].
The point of the pairing is its generality: different sectors, incentives and eras, one mechanism — and in both, the post-hoc inquiry did what the running measurement could not: reconstructed the invisible variables and found them causal.
For executive teams and CFOs, the near-term agenda: stand up the register of invisibles (Fig. 2) with named executive owners; require an invisibles impact statement — one paragraph, judgment-based, signed — on every major reallocation, target change, and restructuring paper; and institute the surprise audit as standing practice, so every green-dashboard failure is examined for invisible-variable causation before the execution story is allowed to close the file.
For boards, the model reframes a familiar discomfort: the sense that the pack’s numbers and the organisation’s reality are diverging is often the invisibles moving. The practical instrument is the second question — for every metric presented, “what is this a proxy for, and what would tell us the proxy has detached?” — asked routinely enough that management builds the answer in advance.
For public-sector leaders, who operate under statutory measurement regimes they cannot simply amend, the lever is compensating instrumentation: judgment-based assessment of the untargeted (peer review, structured inspection, the dimensional scans of White Papers 04 and 06) given formal standing alongside the targets, so that the effort-shift the targets induce is at least visible while it happens rather than only in the inquiry afterwards.
We are assembling, across participating organisations, a casebook of green-dashboard failures — significant initiative failures in which contemporaneous indicators were healthy — and coding them for invisible-variable causation; and, prospectively, tracking whether institutions that adopt a named-class regime show different failure-attribution patterns over time. The open questions are real: whether structured judgment can be protected from the same gaming that corrupts metrics; whether executive registers of unmeasurables survive leadership transitions or decay into ritual; where the boundary genuinely lies between the hard-to-measure (instrument better) and the constitutively unmeasurable (govern differently); and whether the argument itself can escape the trap of being heard as a plea for softness rather than what it is — a claim about where institutional performance is actually produced. Evidence and counter-cases are invited; the paper will be revised against them.
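The casebook coding described above could be organised along the following lines. This is a sketch under assumptions: the record fields, the function names and the four sample cases are hypothetical placeholders invented for illustration, not entries from the actual casebook and not findings. The cause labels stand for a coder's structured judgment, not a measurement.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Case:
    """One coded initiative failure (all fields are illustrative)."""
    case_id: str
    indicators_green: bool             # were contemporaneous indicators healthy?
    accepted_story: str                # attribution recorded at the time
    invisible_causes: tuple[str, ...]  # coder's judgment; empty if none found


def tally(cases: list[Case]) -> tuple[int, int, Counter]:
    """Summarise green-dashboard failures and their coded invisible causes.

    Returns (number of green-dashboard failures,
             number of those with at least one invisible cause coded,
             count of each invisible cause across those failures).
    """
    green = [c for c in cases if c.indicators_green]
    with_invisible = sum(1 for c in green if c.invisible_causes)
    causes = Counter(cause for c in green for cause in c.invisible_causes)
    return len(green), with_invisible, causes


# Placeholder cases, invented purely to exercise the code.
cases = [
    Case("A", True, "execution", ("trust", "voice")),
    Case("B", True, "leadership change", ()),
    Case("C", False, "execution", ("trust",)),  # indicators were red: excluded
    Case("D", True, "luck", ("voice",)),
]

n_green, n_invisible, causes = tally(cases)
print(n_green, n_invisible)  # 3 2
print(causes["voice"], causes["trust"])  # 2 1
```

Keeping the accepted story alongside the coded causes is what would allow the prospective question to be asked later: whether attribution patterns differ in institutions that adopt a named-class regime.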
IMPACT THINKING RESEARCH · BY BEN BOTES · WORKING PAPER 03 · v1.0 · JULY 2026