The Behavioral Audit

A management consultant at a mid-tier firm submits a client deliverable that earns the best reception of her career. The strategy memo is sharp, well-structured, and — her client's word — "unusually insightful." Her managing director flags her for a high-potential list. A senior partner mentions her name in a context she has been waiting three years to be mentioned in.

She used AI to generate the first full draft. She edited it. She made judgment calls about what to keep and what to cut. She added one section from her own thinking. She is genuinely uncertain how to apportion credit between herself and the tool — but she has not raised the question with anyone, and does not intend to.

She does not feel like a fraud exactly. She feels like someone who found an efficiency that others have not yet found, or have not yet admitted to finding.

Two months later, she is asked to lead a client workshop on the strategic theme her memo introduced. She prepares extensively. The workshop goes poorly.

Her managing director is kind about it. He calls it a developmental moment.

She knows what happened. The memo contained thinking she could fluently represent on a page but could not fluently inhabit in a room. The AI had done more of the original cognitive work than she had acknowledged, to herself or to anyone.

The gap between her attributed capability and her actual capability had finally been tested in a context where the AI could not come with her.

The Psychological Lens

What this consultant is experiencing is a collision between two distinct psychological phenomena that AI has brought into unusual proximity.

The first is self-serving attribution bias — the well-documented tendency for people to attribute successes to internal factors and failures to external ones. When the memo succeeded, she experienced it as confirmation of her ability. The AI's contribution was reframed, internally, as a tool she had skillfully deployed — an extension of her judgment rather than a partial replacement of it. This is not cynical self-deception. It is the ordinary operation of a cognitive system built to protect self-concept.

The second phenomenon is competence validation drift — a less formally named but behaviorally well-evidenced process in which a person's internal model of their own competence gradually aligns with the external signals they receive, regardless of whether those signals accurately reflect underlying capability. When her environment consistently responded to the AI-assisted work as evidence of her ability, her own self-assessment shifted accordingly. She began to feel more capable, not merely appear more capable.

This is where the ethics and the psychology diverge in an important way.

The conversation about AI attribution tends to be framed as a moral question: is it honest to submit AI-assisted work without disclosure? That is a real question. But it sits on top of a more structurally consequential psychological one: what happens to a person's internal model of their own capability when external feedback is systematically calibrated to work they did not fully produce?

The answer, the research suggests, is that the internal model adjusts. People are not good at holding two simultaneous self-assessments — the one their environment reflects and the one their private experience warrants. Over time, the environmental signal tends to win.

This produces a specific and underappreciated risk: not fraudulence, but genuine miscalibration. The consultant is not pretending to know things she does not know. She has begun, partially, to believe she knows them.

This connects to a third mechanism: the Dunning-Kruger gradient as applied to borrowed competence. The classic formulation of Dunning-Kruger describes how limited knowledge in a domain produces overconfidence, because the person lacks the framework to recognize what they do not know. AI-assisted work creates a structurally similar condition — the person receives outputs that reflect a level of domain synthesis they could not independently produce, and therefore cannot fully evaluate. They do not know what the AI got right by depth of reasoning versus by pattern approximation. They cannot distinguish the genuinely insightful from the fluently plausible. They adopt both, present both, and are rewarded for both.

Until the workshop. Until the room. Until the moment when the thinking has to be theirs alone.

The Behavioral Patch

This is one of the more difficult issues in the human-AI literature to address through intervention design, because the core mechanism is not a failure of honesty. It is a predictable consequence of how human self-assessment works in environments shaped by external feedback.

Several things are worth naming clearly.

For individuals, the most protective practice is deliberate capability auditing — regularly and privately testing what you can produce and defend without AI assistance, not as a moral exercise but as an epistemic one. The question is not whether you used AI. It is whether you know what you actually know.

For organizations, the more important intervention is assessment environment design. Organizations that evaluate people primarily on deliverable quality — documents, decks, memos — are now measuring a combined human-AI output without knowing the ratio. Performance management systems built on these outputs are generating feedback loops that may be systematically miscalibrating the self-assessments of their entire knowledge workforce. The workshop, the client call, the unscripted moment in a room: these are now doing epistemically different work than they did before AI, because they are among the few remaining environments where attributed and actual capability must converge.

The most important practical implication is this: organizations should treat live, unassisted performance contexts not as high-stakes tests to be managed around, but as calibration mechanisms to be preserved and protected. The instinct to over-prepare and over-resource high-visibility moments — to let people bring their tools — is understandable. It is also, in this context, counterproductive.

Some friction needs to be kept in the system.

Not to punish people who use AI well.

But to maintain any collective ability to know who actually knows what.

The Metric That Matters

Most organizations currently have no systematic mechanism for measuring the gap between evaluated deliverable quality and live unassisted performance quality for the same individual over time.

A meaningful signal would track both, not to catch anyone but to identify where the two are diverging significantly. Call it the attribution-performance correlation: how closely what a person is credited with tracks what they can do unassisted. A wide and widening gap is not evidence of dishonesty. It is evidence of a self-model that has drifted from underlying capability, and of a feedback system that has stopped providing an accurate signal.

Uncorrected, that gap does not stay stable.

It tends to grow in the direction the environment keeps pointing.
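For readers who want to see what tracking that signal might look like, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the `Review` record, the shared 0-10 rating scale, the function names, and the two thresholds are hypothetical, and the sketch measures the gap and its trend over time rather than a statistical correlation. It is one possible shape for the idea, not an established instrument.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Review:
    """One review period for one person (hypothetical record)."""
    period: int         # e.g. quarter index, oldest first
    deliverable: float  # rated quality of written deliverables, 0-10 (assumed scale)
    live: float         # rated quality of live, unassisted performance, same scale


def gaps(reviews):
    """Deliverable rating minus live rating, ordered by period."""
    ordered = sorted(reviews, key=lambda r: r.period)
    return [r.deliverable - r.live for r in ordered]


def gap_trend(reviews):
    """Least-squares slope of the gap against period; positive means widening."""
    ordered = sorted(reviews, key=lambda r: r.period)
    if len(ordered) < 2:
        return 0.0
    xs = [r.period for r in ordered]
    ys = [r.deliverable - r.live for r in ordered]
    x_bar, y_bar = mean(xs), mean(ys)
    denom = sum((x - x_bar) ** 2 for x in xs)
    if denom == 0:
        return 0.0
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / denom


def is_drifting(reviews, min_gap=1.5, min_slope=0.25):
    """Flag a gap that is both wide now and widening (thresholds are arbitrary)."""
    g = gaps(reviews)
    return bool(g) and g[-1] >= min_gap and gap_trend(reviews) >= min_slope


# Illustrative, made-up ratings: deliverables improve while live performance slips.
history = [
    Review(1, 7.0, 6.5),
    Review(2, 8.0, 6.5),
    Review(3, 8.5, 6.0),
    Review(4, 9.0, 6.0),
]
print(gaps(history), gap_trend(history), is_drifting(history))
```

The flag requires both a wide gap and a positive trend because a stable difference between written and live performance may simply reflect a person's strengths. It is the widening that suggests the feedback loop has stopped correcting itself.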

Further Reading

The foundational paper on the Dunning-Kruger effect — the mechanism by which limited competence impairs accurate self-assessment. Directly applicable to conditions where AI output exceeds the user's independent capability.

A comprehensive review of the self-serving attribution literature, establishing how reliably people assign successful outcomes to internal causes regardless of actual causal structure.

Thinking, Fast and Slow (Kahneman, 2011)

The accessible synthesis of dual-process theory, most relevant here for its treatment of confidence, fluency, and the conditions under which the feeling of knowing diverges from actual knowing.

Peak (Ericsson & Pool, 2016)

The deliberate practice framework — directly relevant for understanding why AI assistance that bypasses the effortful struggle of skill acquisition cannot substitute for it, even when it produces equivalent outputs.
