Cognitive surrender: what AI is doing to investors' judgment

Robin Powell
May 28

AI now sits beside roughly a third of UK retail investors as they make money decisions, and new work from the Wharton School has put a number on what it is doing to them. Confidence in those decisions rises by nearly 12 percentage points — even when the AI is wrong about half the time. The researchers call it cognitive surrender.

For most of the time investors have used AI to think about money, the question of what the tool was doing to their judgment has been a matter of speculation. It now has empirical answers, and three converging bodies of research describe the same pattern. The pattern has a name: cognitive surrender.

A Wharton experiment has measured the confidence inflation directly. A team at MIT's Media Lab has watched the effect show up in brain activity. And a small group of finance researchers, including a Nobel laureate, has begun to map what it might mean for the way investors collectively think about markets. The figure of roughly one in three retail finance customers using AI weekly to manage their money comes from a 2025 Lloyds survey, cited by Sheldon Mills, executive director at the Financial Conduct Authority, in a January 2026 speech on the FCA's long-term review into AI.

What the Wharton experiments found

The Wharton study, by Steven D. Shaw and Gideon Nave, ran three preregistered experiments involving 1,372 participants and 9,593 individual trials. Shaw is a postdoctoral researcher at the Wharton School; Nave is the Carlos and Rosa de la Cruz Associate Professor of Marketing. Their working paper, posted to SSRN in January 2026, extends the dual-process model associated with Daniel Kahneman by adding what the authors call System 3: artificial cognition. System 1 is intuition. System 2 is deliberation. System 3 is the AI assistant sitting on the other side of the screen.

Participants were given problems adapted from the Cognitive Reflection Test, with the option to consult an embedded ChatGPT (GPT-4o) assistant whose accuracy was experimentally manipulated. Hidden seed prompts made the AI right on some trials and wrong on others. The participants did not know which. The Wharton design sets out to isolate the confidence effect from the accuracy effect — the latter is a longer-running concern in tests of what chatbots get wrong on money questions.

When the AI was correct, accuracy rose by 25 percentage points relative to the no-AI baseline. When the AI was wrong, accuracy fell by 15 points. The effect size for that swing was substantial (Cohen's h = 0.81 in Study 1). Confidence, though, moved in only one direction. Across both correct and incorrect AI answers, participants who consulted the assistant reported confidence levels 11.7 percentage points higher than those working from their own judgment alone. As Shaw and Nave note in the Study 1 results, 'despite approximately half of System 3 answers being faulty, access to AI increased confidence by 11.7 percentage points'.
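For readers unfamiliar with Cohen's h, it is a standard effect size for the gap between two proportions, computed on an arcsine scale so that a given gap counts the same wherever it sits between 0 and 1. The sketch below shows the calculation only. The two accuracy rates are hypothetical, chosen to sit 40 points apart like the swing described above; they are not the paper's figures, and the article does not state which two conditions the reported h compares.

```python
import math


def cohens_h(p1: float, p2: float) -> float:
    """Cohen's h: difference between two proportions after an arcsine transform."""
    for p in (p1, p2):
        if not 0.0 <= p <= 1.0:
            raise ValueError("proportions must lie between 0 and 1")
    return 2 * math.asin(math.sqrt(p1)) - 2 * math.asin(math.sqrt(p2))


# Hypothetical accuracy rates for illustration only (NOT taken from the paper):
# 70% correct when the AI is right, 30% correct when the AI is wrong.
accuracy_ai_correct = 0.70
accuracy_ai_wrong = 0.30

h = cohens_h(accuracy_ai_correct, accuracy_ai_wrong)
print(f"Cohen's h = {h:.2f}")  # prints: Cohen's h = 0.82
```

By Cohen's own rule of thumb, 0.2 is a small effect, 0.5 a medium one and 0.8 a large one, which is why a value around 0.8 counts as substantial.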

On trials where the AI was wrong, participants accepted the wrong answer 73.2 per cent of the time. About one in five overrode the AI and reached the right answer themselves; the remainder tried to override and failed. The authors named this pattern cognitive surrender, defined in the paper's abstract as 'adopting AI outputs with minimal scrutiny, overriding intuition (System 1) and deliberation (System 2)'. They draw a careful distinction from the better-known idea of cognitive offloading, which they describe as 'strategically outsourcing a discrete task to an external tool (e.g., using a calculator)'. Cognitive surrender, by contrast, 'represents a deeper abdication of critical evaluation, where the user relinquishes cognitive control and adopts the AI's judgment as their own'. Participants who scored higher on trust in AI, and lower on need for cognition and fluid intelligence, surrendered more readily.

The brain on AI

The behavioural pattern Shaw and Nave document in their experiments appears to have a neurological signature. A research group at MIT's Media Lab, led by Nataliya Kosmyna, recruited 54 participants from five Boston-area universities and asked them to write essays under three conditions: with ChatGPT, with a search engine or with no tools at all. EEG recordings during writing showed brain connectivity scaling down with the level of external support. Brain-only participants had the strongest, most distributed networks; search engine users were in the middle; ChatGPT users had the weakest connectivity.

In a fourth session, the groups crossed over. Participants who had used ChatGPT in the first three sessions and were now asked to write unaided still showed reduced alpha and beta connectivity, suggesting the under-engagement persisted once the tool was withdrawn. The authors coined the term cognitive debt to describe the pattern. The study is a preprint, with a small subsample, and a published commentary by Stankovic and colleagues (2026) has questioned its statistical power and missing F-statistics. The finding sits in the body of evidence, but lightly.

How cognitive surrender shows up in investing

The bridge from cognitive science to investing is short. Toghrul Aghbabali, a PhD student in finance at the University at Buffalo, set it out in a February 2026 essay for the CFA Institute's Enterprising Investor blog. Large language models, he argued, are trained on the public financial information ecosystem — analyst notes, news coverage, social media commentary, search activity. That ecosystem is itself heavily skewed. The models inherit the skew. The argument that AI tools tend to amplify rather than counteract investor biases is a thread TEBI has examined before.

Aghbabali identifies four bias channels. Size: large firms have dense textual training data, so LLMs produce more confident, often more optimistic forecasts for them, while smaller firms are treated more cautiously where the data is sparse. Sector: technology and financial stocks dominate business news, and the models may assign higher expected returns to them regardless of valuation or cycle. Volume: highly liquid stocks generate more trading commentary, and the models implicitly prefer them. Attention: stocks with strong social media presence and high search activity are over-represented in training data, so the models inherit the hype.

His example is concrete. A small-cap industrial firm with improving margins and low analyst coverage may receive cautious LLM treatment despite improving fundamentals, while a high-profile technology stock with heavy media presence receives persistently optimistic framing even when valuation risk is rising. Aghbabali and colleagues report that following ChatGPT's release, retail investors increasingly trade in the same direction — evidence, they argue, of convergence in beliefs rather than diversity of views.

A peer-reviewed paper published in PLOS One in June 2025 reaches a similar conclusion through a different method. Researchers at the University of St Gallen and the Technical University of Munich (Patrick Winder, Christian Hildebrand and Jochen Hartmann) ran four studies in which they prompted ChatGPT, Gemini and Copilot for private-investor advice. Across all three models, LLM-generated portfolios showed elevated risk on five dimensions: geographical clustering, sector clustering, trend chasing, active allocation and total expenses. Debiasing prompts mitigated the effect only partly. The authors named the pattern biased echoes.

The bigger picture

If the Wharton study describes what happens to one investor and one decision, and Aghbabali and Winder describe what happens to portfolios, a third body of work describes what might happen to the stock of investment knowledge a society depends on. Daron Acemoglu, the MIT Institute Professor and 2024 Nobel laureate in economics, has co-authored a working paper with Dingwen Kong and Asuman Ozdaglar arguing that generative AI — particularly its agentic variants — substitutes for the costly human effort that produces general knowledge. The paper, published by the National Bureau of Economic Research in February 2026, is theoretical, not empirical.

Their model distinguishes two kinds of knowledge required for good decisions: general knowledge that is shared across a community, and context-specific knowledge that is private to the individual. The two are complements. Human effort produces both at once — a private signal about the person's own circumstances, and a thinner public signal that accumulates into the community's stock of shared knowledge. AI can deliver high-quality personalised advice while reducing the effort that sustains the general knowledge that advice is built on. Under specified conditions, the model tips into what the authors call a knowledge-collapse steady state, 'in which general knowledge vanishes ultimately, despite high-quality personalized advice'.

The authors frame their model in investment terms. Good investment decisions, they write, depend on 'a basic understanding of different financial instruments such as treasury bonds, corporate bonds, stocks, options, etc., as well as information on how world stock markets and economies have been performing, some relevant aspects of their institutional structure, an understanding of macroeconomic risks etc.' That general knowledge is the public good their model worries about. They cite reduced activity on Stack Overflow as AI substitutes for the platform, and a similar pattern on Wikipedia in topics where ChatGPT is an effective substitute, as concrete signs that the dynamic is already underway.

What the evidence doesn't yet show

The honest picture has limits. No study yet tracks long-term portfolio outcomes for investors who use AI against those who do not — the gap that matters most to anyone trying to draw a practical conclusion. The Acemoglu paper is a theoretical model, not a forecast, and its authors are careful to present knowledge collapse as a conditional steady state. The Kosmyna study has a small sample and a published critique. The Wharton experiments use Cognitive Reflection Test problems rather than real investment decisions, where feedback can take years to arrive. And the complement-versus-substitute question is genuinely open in some domains: Erik Brynjolfsson and colleagues have documented AI helping customer service agents do their jobs better, and protein design has been accelerated by AI in ways that look like genuine extension of human capability. The findings reported here hold in the conditions studied. The reader should treat them as a body of evidence, not a forecast.

What this means for investors

The Wharton finding is not really about AI accuracy. It is about human confidence. AI raises certainty whether or not it raises accuracy, and the people most prone to cognitive surrender are the ones least equipped to notice they are doing it.

Behavioural research has documented a parallel pattern in how investors seek financial advice — for validation rather than challenge. Aghbabali's closing line in his CFA piece — that 'the real advantage will belong not to investment practitioners who use AI most aggressively, but to those who understand how its beliefs are formed' — sits beside Markus Schuller's framing in a related Enterprising Investor essay that 'AI should serve the pursuit of evidence rather than replace it. Machines can extend the frontier of inquiry, but they cannot define its direction.'

The AI now sits at the desk beside the investor. Whether the investor is still the one thinking is a question only the investor can answer.

Resources

Acemoglu, D., Kong, D., & Ozdaglar, A. (2026). AI, human cognition and knowledge collapse (NBER Working Paper No. 34910). National Bureau of Economic Research.

Kosmyna, N., Hauptmann, E., Yuan, Y. T., Situ, J., Liao, X. H., Beresnitzky, A. V., Braunstein, I., & Maes, P. (2025). Your brain on ChatGPT: Accumulation of cognitive debt when using an AI assistant for essay writing task (arXiv:2506.08872). arXiv.

Shaw, S. D., & Nave, G. (2026). Thinking — fast, slow, and artificial: How AI is reshaping human reasoning and the rise of cognitive surrender (Wharton School Research Paper; SSRN No. 6097646). University of Pennsylvania.

Winder, P., Hildebrand, C., & Hartmann, J. (2025). Biased echoes: Large language models reinforce investment biases and increase portfolio risks of private investors. PLOS One, 20(6), e0325459.
