Trust as a Performance Variable

Abstract

Organisations treat trust as an atmosphere: desirable, vague, and beyond deliberate management. This paper argues the opposite. Trust is a performance variable — a structural property of how work is coordinated — and it behaves like one: it can be defined, decomposed, assessed, built, and repaired. Drawing on the philosophy of language and on two decades of practice with senior teams, we set out a four-dimension model of trust — involvement, integrity, reliability, and competence — and show why the aggregate question “do we trust each other?” conceals more than it reveals. The evidence for treating trust as an economic quantity is now substantial: employees in high-trust organisations report dramatically lower stress, higher energy, and higher productivity than those in low-trust organisations. But averages mislead. Two teams with identical headline trust scores can have entirely different problems, and therefore need entirely different repairs. The practical contribution of this paper is a way of locating which dimension of trust is thin, what that specific thinness costs, and how each dimension is deliberately rebuilt — because trust is not restored in general. It is restored one dimension, and one conversation, at a time.

The most expensive thing organisations refuse to manage

Ask a leadership team whether trust matters and every hand goes up. Ask the same team where trust is thin, what that thinness is costing per quarter, and what the repair plan is, and the room goes quiet. This is a strange asymmetry. No organisation would treat working capital, cycle time, or defect rates the way it treats trust — as a weather system to be endured rather than a variable to be managed. Yet trust determines the operating cost of everything else.

The reason for the asymmetry is not negligence. It is a category error. We inherit a picture of trust as a feeling — a warm, binary, somewhat mysterious state that either exists between people or doesn’t. Feelings resist management. You cannot set a quarterly target for a feeling, so trust gets filed under culture, culture gets filed under HR, and the single largest hidden line item in the organisation goes unowned.

The premise of this paper is that the inherited picture is wrong, and consequentially so. Trust is better understood as an assessment — a judgment we make, in language, about whether we can safely coordinate our actions with another person. Assessments are not mysterious. They are grounded or ungrounded, specific or vague, current or stale; they can be examined, tested, revised, and — critically — they can be earned in specific, nameable ways. The moment trust is relocated from the domain of feeling to the domain of assessment, it stops being weather and becomes infrastructure: something with load-bearing parts that can be inspected, maintained, and repaired.

This reframing is not ours alone. Solomon and Flores argued two decades ago that trust is not a substance but a practice — something built in language and commitments rather than found.¹ Mayer, Davis and Schoorman’s widely used academic model decomposed trustworthiness into ability, benevolence, and integrity, establishing that trust has assessable components.² Feltman’s practitioner work made the assessment structure of trust usable inside organisations.³ What this paper adds is a four-dimension operating model refined in practice with senior teams, an account of why aggregate trust measures systematically misdirect the repair effort, and a discipline for rebuilding each dimension specifically.

What trust actually does: the economics of coordination

Before decomposing trust, it is worth being precise about what it does, because its function explains its cost structure.

An organisation is not, at bottom, a structure of roles. It is a network of commitments: requests made, promises given, deadlines declared, work handed over and accepted.⁴ Every one of those transactions runs on an implicit assessment: can I rely on this? Where the assessment is positive, the transaction is nearly free — a request is made once, a promise is accepted at face value, work is handed over whole. Where the assessment is negative, every transaction acquires a tax: the follow-up email, the shadow copy of the work, the buffer built into the deadline, the meeting held to check on the meeting, the decision escalated because it cannot be left where it belongs.

Covey called this the speed of trust: when trust falls, speed falls and cost rises, with the reliability of a physical law.⁵ The mechanism is exactly the transaction tax described above. Low trust does not usually announce itself as conflict. It announces itself as friction — a thousand small verifications, hedges, and workarounds, each individually rational, collectively ruinous.

The empirical picture supports treating this as an economic quantity rather than a soft one. In survey research reported by Zak, employees at high-trust companies, compared with those at low-trust companies, reported 74% less stress, 106% more energy at work, 50% higher productivity, 76% more engagement, 40% less burnout, and 13% fewer sick days.⁶ These are self-report comparisons rather than controlled experiments, and should be read as such; but the size of the differences, and their consistency across all six measures, make the direction hard to dismiss. Google’s Project Aristotle, studying what distinguished its highest-performing teams, arrived at the adjacent finding from the opposite direction: the single most important factor was not talent, tenure, or seniority mix, but psychological safety, the shared confidence that interpersonal risk is survivable, which is trust seen from below.⁷

So the function of trust is coordination, and its absence prices itself into every transaction. This much is increasingly accepted. The management failure happens at the next step: knowing trust matters, organisations attempt to act on it in aggregate — a trust survey, an offsite, a values refresh — and are then surprised when nothing durable changes. The reason nothing changes is that “trust” in the aggregate is not an actionable object. Four different things are being averaged together, and they break differently, cost differently, and repair differently.

The model: four dimensions of trust

In our practice, the assessment “I trust you” reliably decomposes into four distinct sub-assessments. A team can score high on three and be quietly bleeding out on the fourth — and the composite score will look fine.

Involvement

The assessment: do you have my interests in view, and not only your own? Involvement is the dimension people usually mean when they talk about trust as warmth, but its content is precise: it is the judgment that the other party will take your situation into account when they act — that they care what happens to you, and that your input is genuinely in the decision, not merely collected. Mayer and colleagues call the academic cousin of this dimension benevolence.² Where involvement is high, people bring problems early, while they are small and cheap. Where it is thin, they do not stop having problems; they stop reporting them. The team optimises for looking fine over being honest, and leadership loses its early-warning system precisely where it is most needed.

Integrity

The assessment: are your word and your actions one thing? Integrity here is not a moral score. It is an operational property: whether what this person says can be built on. Every gap between word and action — the commitment that quietly lapses, the standard let slide just this once, the “I’ll look into it” that evaporates — teaches others to discount the word. Once the word is discounted, everyone must hedge; and an organisation of hedgers is slow in a way no process redesign can fix. Notably, integrity is rarely destroyed by dramatic betrayals, which get noticed and repaired. It erodes through small, unremarked breaks that each felt reasonable from the inside.

Reliability

The assessment: when you commit, does it get done — without chasing? Reliability is the dimension most often mistaken for a personality trait, and the mistake matters, because it locates the failure in people rather than in the way commitments are made. In our observation, most reliability failures trace back to the moment of the promise, not the deadline: something was “committed to” that was never a real commitment — an “I’ll try,” a nod, an assumed yes, a request with no condition of satisfaction and no date. None of those bind, and the resulting slippage is then read, unfairly and expensively, as character. Where reliability is genuinely thin, work routes itself around the unreliable and onto the dependable, who burn out; everyone builds private buffers; and the system runs slower and hotter than its talent should allow.

Competence

The assessment: can this actually be done well in your hands? Competence trust is the judgment that lets work be handed over and stay handed over. Where it is high, delegation is real: outcomes transfer, judgment calls included. Where it is thin, a distinctive self-sealing pathology appears: leaders withhold the real work until people are “ready,” while the people can only become ready by doing the real work. The distrust manufactures the incompetence it fears, and from the inside it feels exactly like prudence. Competence distrust is also the most frequently misdiagnosed dimension: what presents as “I can’t trust anyone to do it right” is very often a broken handover practice — no definition of done, no context, no room to fail safely — wearing the costume of a talent problem.

Why the aggregate score misleads

Decomposition would be merely tidy if the dimensions moved together. They do not. They are produced by different behaviours, they break under different conditions, and — the practical crux — they require different repairs. This is why the aggregate question “how much do we trust each other?” is not just imprecise but actively misdirecting.

Consider two teams with an identical composite trust score. Team A is warm and close: involvement and integrity are high, people genuinely care for one another — and commitments quietly die, because reliability has collapsed and the warmth makes it undiscussable. Team B is crisp and dependable: promises are kept, competence is respected — and nobody brings bad news, because involvement is thin and every conversation is transactional. Same number. Opposite diseases. Opposite treatments.
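The arithmetic behind the two-team comparison can be made concrete with a minimal sketch. The scores below are invented for illustration on an assumed 1-5 scale; they are not measurements from any team, and the function names are hypothetical.

```python
DIMENSIONS = ("involvement", "integrity", "reliability", "competence")

# Hypothetical scores on an assumed 1-5 scale, chosen only to illustrate the point.
team_a = {"involvement": 4.5, "integrity": 4.5, "reliability": 2.0, "competence": 4.0}
team_b = {"involvement": 2.0, "integrity": 4.0, "reliability": 4.5, "competence": 4.5}


def composite(profile):
    """Plain average of the four dimension scores (the 'headline' number)."""
    return sum(profile[d] for d in DIMENSIONS) / len(DIMENSIONS)


def thinnest(profile):
    """Name of the lowest-scoring dimension, which is what points to the repair."""
    return min(DIMENSIONS, key=lambda d: profile[d])


for name, profile in (("Team A", team_a), ("Team B", team_b)):
    print(name, "composite:", composite(profile), "thinnest:", thinnest(profile))
```

Both teams average 3.75, yet the thinnest dimension is reliability for one and involvement for the other, so the composite alone cannot tell them apart.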

The profile view has a second, less obvious payoff: it changes what a team can talk about. “We have a trust problem” is an accusation with no handle; it invites defensiveness and produces offsites. “Our reliability dimension is thin — commitments are being made in ways that don’t bind” is a finding with a mechanism; it invites diagnosis and produces a repair. Naming the dimension moves the conversation from character to structure, which is precisely where a team can act.

Trust lives in language

Why is trust repairable at all? Because of what kind of thing it is. Following Flores, trust is an assessment made in language, and assessments are grounded in the observable performance of specific speech acts: requests, promises, declarations, assertions.¹,⁴ When we say someone is reliable, we are summarising a history of promises made and kept. When we say their integrity is intact, we are summarising the alignment between their declarations and their subsequent actions. The assessment feels like a fact about the person; it is actually a reading of their record in the network of commitments.

Three consequences follow, each of practical weight.

First, trust is built at the level of the commitment, not the relationship. Trust-building exercises fail because they operate on affect (shared meals, disclosed vulnerabilities, ropes courses), while the assessments people actually run are refreshed by transactions: was the request clear, was the promise real, was it kept, was the miss owned? In our experience, a team that changes how it makes and manages commitments can begin to shift its trust profile within a quarter, offsite or no offsite.

Second, distrust is often the artefact of a broken practice rather than a broken person. If a team’s requests routinely lack conditions of satisfaction and dates, its members will reliably assess one another as unreliable — and they will all be wrong about one another in the same way. The repair is not interpersonal. It is the installation of a commitment discipline.

Third, and most usefully: because trust is an assessment, it can be repaired deliberately after a breach. A feeling, once poisoned, is beyond procedure. An assessment can be revised in the face of new evidence — which means the party who broke trust can generate that evidence on purpose. This is not spin. It is the honest sequence: acknowledge the specific break, in its specific dimension; recommit credibly; then produce a visible run of kept commitments. What cannot repair trust is the thing organisations most often attempt — a general apology followed by general goodwill. Generality is the enemy throughout.

Building each dimension on purpose

Because the dimensions are produced by different behaviours, each has its own construction discipline. We summarise the four repair patterns we have seen hold in practice.

Involvement is built by making listening consequential and visible. Not by listening more — by showing whose input changed what. The single highest-leverage move for a leader whose involvement dimension is thin: in one real decision, say out loud whose contribution altered the outcome and how. People assess involvement by looking for their own fingerprints on decisions; give the evidence deliberately. The counterfeit version — consultation rituals whose outcome was pre-decided — is worse than nothing, because people detect it and file the leader under “already decided,” after which even genuine consultation is discounted.

Integrity is rebuilt small and honestly, not large and heroically. The exchange rate on a leader’s word is set by the unremarked commitments, so the repair starts there: find one small thing you said and did not do, name it unprompted, close it this week. One repaired small break moves the assessment more than a kept grand promise, because it demonstrates that the person is tracking their own word — which is the actual thing being assessed.

Reliability is built at the moment of the promise. The discipline is unglamorous: real requests (what, by when, to what standard), the genuine freedom to decline or renegotiate (a promise no one could refuse is not a promise), and commitments said back explicitly. Teams that install this find something counterintuitive: the ability to say no is what makes their yes worth anything, and the total volume of kept commitments rises even as the volume of accepted requests falls.
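For teams that log commitments, the difference between a floated request and a binding one can be sketched as a small record. This is an illustrative assumption about how such a log might look, not a description of any particular tool; the class, field, and function names are hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum
from typing import Optional


class Response(Enum):
    ACCEPT = "accept"
    DECLINE = "decline"   # declining must be a genuinely available answer
    COUNTER = "counter"   # renegotiate the terms before committing


@dataclass
class Request:
    what: str                            # the outcome being asked for
    by_when: Optional[date]              # None models a request with no date
    standard: str                        # condition of satisfaction
    response: Optional[Response] = None  # None models a nod or an assumed yes


def is_binding(request: Request) -> bool:
    """A commitment binds only if the request was complete and explicitly accepted."""
    complete = (
        bool(request.what.strip())
        and request.by_when is not None
        and bool(request.standard.strip())
    )
    return complete and request.response is Response.ACCEPT


# Hypothetical examples: a floated request and an explicit one.
floated = Request("it would be great if we could tidy the board pack", None, "")
explicit = Request("Board pack draft", date(2025, 3, 14), "Reviewed by finance", Response.ACCEPT)
declined = Request("Board pack draft", date(2025, 3, 14), "Reviewed by finance", Response.DECLINE)
```

Here `is_binding(floated)` is false because there is no date, no standard, and no explicit answer, while `is_binding(explicit)` is true. A declined request is also not a commitment, which is the point: a clear no keeps a non-promise out of the system.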

Competence trust is built by giving room under a safety net. Because the withholding of real work is what prevents the competence being waited for, the repair must run through exposure: hand over one whole outcome — judgment calls included — with a clear definition of done and a check-in date, and then leave it alone until the date. The safety net is the definition and the date, not surveillance in between. Repeated a few cycles, this either builds the competence, or surfaces a genuine capability gap that can now be addressed as what it is, rather than fermenting as generalised distrust.

Measuring it: from climate survey to instrument

If trust is a performance variable, it should be instrumented like one. The requirements are modest but specific. The instrument must measure the four dimensions separately — a composite defeats the purpose, as Figure 3 shows. It must be short enough to run quarterly, because trust profiles move on the timescale of commitment cycles, not annual engagement surveys. It must report the shape (strongest dimension, thinnest dimension) rather than a grade, because the shape is what dictates the repair. And its output must feed a structured team conversation — the point of measurement is not the number but the discussability it creates: the thin dimension, named neutrally by an instrument, becomes something a team can examine without anyone standing accused.

In our own practice this takes the form of an eight-item dimensional scan followed by a facilitated read: where is the profile strongest, where is it thinnest, what is the thinness costing in concrete transactions, and which single repair discipline — from the four above — does the profile point to. The scan is deliberately not an evaluation of individuals; it is a reading of the network. Teams accept, and act on, a diagnosis of their commitment practices far more readily than a verdict on their characters — and the practices are where the leverage is anyway.
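A minimal sketch of how a scan of this kind could be scored follows. The paper specifies only that the scan has eight items and reports the shape of the profile; the two-items-per-dimension mapping, the item identifiers, the 1-5 rating scale, and the function name are all assumptions made for illustration.

```python
from collections import defaultdict

# Assumed mapping: two items per dimension. The source states only "eight items".
ITEM_DIMENSION = {
    "q1": "involvement", "q2": "involvement",
    "q3": "integrity",   "q4": "integrity",
    "q5": "reliability", "q6": "reliability",
    "q7": "competence",  "q8": "competence",
}


def read_profile(responses):
    """Aggregate anonymous responses into a team-level shape.

    responses: list of dicts mapping an item id to a rating (assumed 1-5).
    Returns the per-dimension means plus the strongest and thinnest dimension.
    No composite grade is returned, and no individual is scored.
    An unknown item id raises KeyError.
    """
    ratings = defaultdict(list)
    for response in responses:
        for item, rating in response.items():
            ratings[ITEM_DIMENSION[item]].append(rating)
    profile = {dim: sum(vals) / len(vals) for dim, vals in ratings.items()}
    return {
        "profile": profile,
        "strongest": max(profile, key=profile.get),
        "thinnest": min(profile, key=profile.get),
    }


# Hypothetical responses from two team members.
responses = [
    {"q1": 4, "q2": 5, "q3": 4, "q4": 4, "q5": 2, "q6": 1, "q7": 3, "q8": 4},
    {"q1": 5, "q2": 4, "q3": 4, "q4": 5, "q5": 2, "q6": 2, "q7": 4, "q8": 3},
]
result = read_profile(responses)
```

With these invented responses, the read names involvement as the strongest dimension and reliability as the thinnest. The output is deliberately a shape rather than a grade, which is what the facilitated conversation then works from.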

The repair in practice: two cases

The model earns its keep in application, so it is worth walking through two composite cases from practice (details altered, patterns intact) that show how the dimensional read changes what gets done.

Case one: the warm team that couldn’t deliver

A senior leadership team in a global enterprise presented with the classic Team A profile from Figure 3. By every conventional measure the team was healthy: long tenure together, genuine mutual regard, high engagement scores, an explicit pride in being “a family.” Involvement and integrity read high on the scan. Reliability read 1.8 out of 5 — the lowest we had measured at that level — and quarter after quarter, the team’s own commitments to the board slipped.

The previous diagnosis had been capacity: too much on the plate, a case for more heads. The dimensional read suggested something else. Observing the team’s operating rhythm, the mechanism surfaced quickly: precisely because the team was warm, nobody made real requests of anyone. Asking a peer for a firm commitment — what, by when, to what standard — felt bureaucratic, faintly insulting, a violation of the family register. So requests were floated rather than made (“it would be great if we could…”), agreement was signalled rather than given, and every commitment entered the system pre-broken. The warmth then completed the trap: holding a colleague to account for missing a commitment that was never quite made felt like an act of aggression, so slippage was absorbed in silence, and the silence was read — wrongly — as further evidence of goodwill.

The repair was almost embarrassingly unglamorous. No trust-building event, no restructure. The team installed a commitment discipline: requests made explicitly, with conditions of satisfaction and dates; the freedom to decline made not just permissible but expected; commitments logged and reviewed weekly, misses owned within the cycle. The register shift was awkward for roughly a month — several members reported that explicit requests initially felt “cold.” Within two quarters, board-commitment slippage had fallen to near zero, and — the finding worth underlining — the team’s involvement scores rose as well. Members reported feeling more cared for under the explicit regime, not less. Vague requests, it turned out, had been experienced as vague regard.

Case two: the transformation that couldn’t delegate

A public-sector transformation programme presented the opposite profile: crisp, formally reliable, competence-distrust everywhere. The senior responsible owner — capable, conscientious, personally trusted by ministers — had become the bottleneck through which every material decision passed. The stated reason was assurance: the programme was politically exposed, and errors were career-limiting for everyone involved. The observable result was a leadership layer beneath the SRO that had stopped exercising judgment entirely, because judgment was not, in practice, theirs to exercise.

The dimensional read named the trap precisely: the withholding of real ownership was preventing the very demonstration of competence the SRO was waiting to see before granting real ownership. Two directors, privately assessed as “not ready,” had never once been given a whole outcome with the judgment calls attached; their “unreadiness” was, at the point of measurement, unfalsifiable.

The repair followed the competence discipline: whole outcomes handed over — scoped precisely, with a written definition of done and a fixed check-in date — and, crucially, a formal commitment from the SRO not to intervene between dates. The first cycle produced one success and one genuine failure. Both were more valuable than the prior equilibrium: the success transferred a workstream permanently; the failure surfaced a specific capability gap that could be addressed as a development need rather than fermenting as generalised distrust. By the third cycle the SRO’s calendar had visibly changed shape — and the programme had, for the first time, a leadership bench that ministers could name.

Two cases, two opposite profiles, two opposite repairs — and neither would have been reached from the composite question “how do we build more trust here?” That is the practical case for the dimensional read in one sentence.

Trust at a distance

A word on conditions, because the conditions have changed. The assessments that constitute trust run on evidence, and in co-located organisations much of that evidence was ambient: you saw who stayed late on the broken release, overheard who owned a mistake, watched who was consulted when things got hard. Nobody managed this evidence stream; the building supplied it.

Distributed and hybrid work removed the building. The assessments did not stop running — people cannot help but assess — but they now run on radically thinner data: message latency, meeting behaviour, deliverable arrival. The predictable result, visible across our client base since 2020, is that trust profiles have become both more volatile and more skewed toward the one dimension that survives digitisation well — reliability, which leaves a timestamped trail — while involvement and integrity, historically fed by ambient evidence, thin quietly.

The implication is not a return-to-office argument; it is a management argument. Under distance, the evidence that feeds each dimension must be produced deliberately, because it is no longer produced incidentally. Involvement evidence must be manufactured on purpose: input visibly changing decisions, in writing, where the team can see it. Integrity evidence likewise: commitments tracked in shared view, small breaks owned in public channels rather than absorbed in private ones. Teams that treat the ambient-evidence loss as a design problem hold their profiles; teams that don’t discover, usually eighteen months in, that their trust has quietly repriced — and attribute it, wrongly, to “remote culture.” The mechanism is simpler: the assessments starved.

Boundaries of the model

Three limitations are worth stating plainly, because a model trusted beyond its range does damage.

First, the model is calibrated at the level of teams and working relationships — the network of direct commitments. Institutional trust (a public’s trust in a government, a workforce’s trust in an executive it never meets) shares the assessment structure but runs on different evidence: symbols, consistency at scale, and mediated narrative. The Edelman Trust Barometer’s long-run findings on institutional trust are adjacent to, not derivable from, this model, and the repair disciplines above do not transfer wholesale to that level.¹⁰

Second, the behavioural evidence that grounds each dimension is culturally calibrated. The explicit-request discipline that repaired Case One reads as professionalism in some business cultures and as distrust in others, where obligation is carried relationally rather than contractually. The four assessments appear robust across cultures in our practice; the evidence each culture accepts for them is not, and instruments and repairs must be tuned accordingly.

Third, measurement is reactive. A team that knows its reliability is being scanned will, for a period, perform reliability — which is why the scan matters less than the cadence, and the cadence matters less than the conversation it feeds. The instrument’s purpose is discussability, not surveillance; used as surveillance, it will corrupt the very assessments it measures, in the way Goodhart’s law predicts for any metric made a target.

Implications

For leaders, the model relocates trust from mystery to maintenance. The question stops being the unanswerable “how do I get them to trust me?” and becomes four answerable ones: Do people see their input change outcomes? Is my word tracked and kept, including the small instances? Are commitments here made in ways that bind? Does work, once handed over, stay handed over? Each question has observable evidence and a known discipline behind it.

For organisations, the implication is that trust belongs on the operating dashboard, not the culture deck — measured dimensionally, reviewed quarterly, repaired specifically. The alternative is the status quo: paying the transaction tax on every piece of coordination, in perpetuity, while describing the payment as “just how things are here.”

And for the field of leadership development, the model is a small argument for a larger claim that runs through all of our research: the levers that matter most in organisations — trust among them — live in language and commitments, which is exactly why they can be worked on deliberately. Trust is not the weather. It is the roads. And roads can be built.