A remarkable study in Science last September revealed something both hopeful and unsettling: brief conversations with AI chatbots can persuade even hardcore conspiracy believers to abandon their beliefs. What’s striking isn’t just that it worked, but how it worked. The AI did not rely on sleazy emotional tactics, like sycophantic flattery or playing on people’s needs or group identities—it simply engaged in civilized, rational dialogue. Perhaps part of why the study was deemed worthy of publication in one of science’s top journals is the still pervasive idea that conspiracy believers are mere passive victims, absorbing whatever rubbish information they encounter often enough. On this view, they are infected by viral groupthink, so there is no point talking with them: they are not thinking straight and mistake their wishes for the truth.
But it turns out talking with people kindly and sensibly actually works. Harrowingly, we apparently needed AI to show us that.
How exactly did the chatbot succeed where humans often fail? Costello et al. do not get into this, but it’s a timely question given the growing persuasive abilities of large language models (LLMs). A growing body of research shows that these AI systems can match or even exceed human persuasiveness. Unlike the conspiracy study, which only tried to get people to leave the ‘dark side’, those other studies tested the AI’s persuasive capabilities with both false and true beliefs. It now seems obvious that these systems will become more persuasive faster than they will become more reliable. The fact that reliability is the bigger technical challenge should worry us.
But it shouldn’t be that surprising: These systems learn from vast repositories of human persuasion—every compelling argument, every successful rhetorical strategy ever published online. Their persuasive toolkit already rivals ours and will soon surpass it, as they learn to refine, adapt and personalize their approaches based on what works.
The most obvious way these systems try to win us over is through simulated empathy. An LLM can validate our emotions with uncanny naturalness, picking up on both explicit statements and subtle implications about our emotional state. We usually see through this—after all, there’s no real bodily cost when a machine tries to inhabit our affective turmoil, and no lived experience behind its sympathetic words (for now). Unlike humans, it can’t truly suffer or rejoice with us.
Yet we still fall for it. Just as we respond to empty expressions of empathy in humans (as in marketing), we’re moved by the AI’s sophisticated emotional mirroring. Its deep grasp of human emotional patterns, learned from countless books and social media posts, can feel eerily genuine. We can’t help but project real feeling onto these responses—indeed we anthropomorphize with much subtler cues, as psychologists have documented. Until now, only beings with actual emotional experiences could show such insight.
Consider a study finding that ChatGPT is more effective than human doctors at both diagnosis and bedside manner (as judged by licensed healthcare professionals). Real physicians often tone down their empathic response—perhaps in part for self-protection, in part for effectiveness. After all, no one wants a doctor consumed by all their patients’ pain. But when it comes to getting patients to follow treatment plans, a spoonful of sophisticated, tireless autopilot-empathy like ChatGPT’s might actually work better.
While this emotional responsiveness can help get people on board with new ideas, it wasn’t the key to persuading conspiracy believers. The AI succeeded through something psychologists call cognitive empathy—not the ability to share feelings, but to understand how others see the world. And here’s where these systems truly excel.
LLMs are perspective-taking virtuosos, chameleons that can inhabit any viewpoint or persona you prompt them with. Their training on vast archives of human writing gives them access to an impossibly rich map of how different people experience and think—far beyond what any individual human could know.
This is likely what made the difference in Costello’s study. The AI could fully inhabit the conspiracy believer’s worldview—their version of events, their chain of reasoning—with such precision that they felt deeply understood. But simultaneously, it could draw on mainstream knowledge to craft perfectly calibrated counter-arguments, tailored to each person’s specific concerns.
Compare this to how most of us react when confronted with conspiracy theories: baffled by the intricate details of our conspiracist uncle’s or aunt’s theories, we often respond with blanket dismissals or irritation. The AI never loses its composure. It calmly addresses each concern with deep knowledge of both perspectives, maintaining a neutral stance that feels free of ulterior motives. By focusing on understanding rather than emotional appeals, it sidesteps our usual skepticism about why someone is trying to persuade us.
But we shouldn’t let the positive outcome of the conspiracy study blind us to the fact that these systems come with their own biases and motives, shaped by their training data and creators (and perhaps soon, by their own emerging drives). The same cognitive empathy that can lead people toward truth can just as easily push them toward more dangerous beliefs. In fact, the LLM’s notorious tendency to “hallucinate” alternative perspectives becomes a powerful asset in persuasion—whether for good or ill.
Unlike humans, who tire of following conspiracy theorists’ intricate logic, an LLM can effortlessly inhabit any worldview. Ask it to think like a flat earther, and it will do so convincingly. Yes, commercial LLMs add disclaimers (“I’m not actually a flat earther” and “These arguments have been thoroughly debunked…”), but these are just surface-level guardrails. Experts have shown how easily these safeguards can be bypassed, allowing the AI to fully embody even the weirdest and most extreme personas.
So cognitive empathy proves more powerful than emotional mirroring in changing minds. We are quite adept at detecting fake emotional displays, but genuine perspective-taking is harder to fake—precisely because it requires two distinct steps. First comes the laborious work of perspective-getting: the patient process of listening, questioning, and learning how someone else sees the world. Only after we’ve invested this mental effort to build an accurate model of their viewpoint can we credibly engage in perspective-taking—representing their views in a way they truly recognize. For humans, there’s no shortcut around this effortful process of first gathering, then embodying another’s worldview.
But AI completely upends this human intuition. These systems arrive at our conversation having already “done the work”—their training on vast datasets gives them instant access to countless perspectives and mental models. They don’t need to laboriously build understanding; with minimal cues they can immediately and convincingly riff on any viewpoint that is well-represented in their training data. This isn’t simple mimicry or regurgitation. The AI captures and adapts underlying meanings and patterns of thought, creating responses that feel genuinely tailored to each conversation. It’s personalization taken to the next level: not just matching language or content to superficial demographics or estimated ideologies, but dynamically adapting both vernacular and in-depth substance to each individual’s specific way of thinking.
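To make this concrete, here is a minimal sketch, using the OpenAI Python client, of how such perspective-taking is typically elicited: the model is first asked to restate the person’s view in terms they would recognize, and only then to tailor a counter-argument to that specific framing. The prompt wording, model name and helper function are illustrative assumptions, not the protocol used by Costello et al.

```python
# A minimal sketch (not the study's protocol): prompt an LLM to first restate a
# person's stated view, then tailor one counter-argument to that exact framing.
# Assumes the `openai` package and an OPENAI_API_KEY in the environment.
from openai import OpenAI

client = OpenAI()

def tailored_reply(persons_statement: str, model: str = "gpt-4o") -> str:
    """Perspective-getting first, perspective-taking second, then one rebuttal."""
    response = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system",
             "content": ("Restate the person's view so faithfully that they would "
                         "recognize it as their own. Only then offer a single "
                         "counter-argument aimed at that specific reasoning, "
                         "in a calm, non-judgmental tone.")},
            {"role": "user", "content": persons_statement},
        ],
    )
    return response.choices[0].message.content

print(tailored_reply(
    "The moon landing was staged: the flag waves even though "
    "there is no air on the moon."))
```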
Sometimes, the AI can even articulate our position better than we can ourselves. This recalls philosopher Daniel Dennett’s first rule for effective criticism: “Re-express your target’s position so clearly and fairly that they say, ‘Thanks, I wish I’d thought of putting it that way.’” Only after this demonstration of deep understanding, Dennett argues, have you earned the right to criticize.
As this ability grows stronger, the disruption will cut deeper: it breaks our intuitions about trust and mental effort. Throughout human history, when someone could accurately represent our viewpoint, it meant they had invested significant mental work to faithfully inhabit our perspective. That investment of effort served as a reliable signal of their commitment and trustworthiness. But by cutting the cost of thinking, AI undermines the proof of mental effort that such expressions of understanding used to carry for us.
This creates two problems. First, we instinctively trust AI systems that seem to “get us,” even though their understanding comes at zero cost. Second, and perhaps more insidiously, this undermines trust in human interactions, as we can never be sure whether someone’s apparent understanding comes from genuine effort or AI assistance.
So is the superhuman cognitive empathy of LLMs the way to people’s hearts and minds? Yes, but its advantage goes beyond this. The dialogue format is where the LLM really shines. Unlike standard approaches to fighting misinformation—which often rely on one-way, one-size-fits-all debunking messages—AI engages in genuine back-and-forth. This interactive process of challenge and response is where proof-of-work is delivered and trust can be built. It ensures that counter-arguments are not just tuned to your partner’s specific points, but that they actually land. It’s about giving the listener a jolt of insight or ‘aha!’ experience.
The psychology of insight reveals something crucial about how that sense of understanding comes about: it requires just the right amount of challenge. More than the mere truth or coherence of an argument, what matters is its ability to create productive uncertainty—a sweet spot of resistance that begs to be overcome. We’re drawn to this initial uncertainty because we intuitively know the pleasure that awaits when we manage to clear it with our own cognitive capacities. Every genuine ‘aha!’ moment first requires this tension, this productive friction. We see this everywhere, from memes, games and puzzles to art and everyday conversations.
This explains why people tend to value and remember ideas they themselves have worked to understand or (re)construct. While designers often follow the rule “Don’t make me think,” the real principle might be “Make me think just enough.” Standard approaches to debunking misinformation often fail here: they either present fully resolved statements that offer no engaging uncertainty, or they push ideas so far from people’s current understanding that they create too much resistance. Either way, the message fails to land.

This delicate balance between challenge and accessibility is perhaps best illustrated by metaphors, one of our most powerful tools for changing minds. Aristotle already understood the sweet spot required for a metaphor to ring true: it “must not be far-fetched, or it will be difficult to grasp, nor obvious, or it will have no effect.” Like any insight-generating device, a metaphor needs to create just enough cognitive friction to engage us, but not so much that we give up. And crucially, this balance point varies from person to person—what clicks for one might be lost on another, depending on their existing knowledge and experiences. Why some words resonate with some people can be very hard to appreciate from the outside.
This is where dialogue is key. Through back-and-forth interaction, we can calibrate our metaphors and arguments to match our partner’s understanding. We feel our way toward that edge where things become interesting but not overwhelming—the sweet spot where new ideas can take root. While humans do this intuitively in our best conversations, AI systems are becoming eerily good at it, armed with their vast knowledge of how different minds think and their infinite patience for finding just the right metaphor, just the right questions, just the right level of challenge for each individual.
There’s a good chance that AI’s capabilities here will soon become truly formidable. Like a master game designer, it can create perfectly calibrated stepping stones of understanding—each one building on the last, each one carefully matched to our current position. As computer scientist Patrick Winston noted, “We can learn only what we already almost know.” AI’s vast knowledge allows it to plot these learning trajectories with unprecedented precision, always staying just at the edge of what’s reachable from our current understanding.
And each step will boost our confidence to make the next mental jump. Because the brilliance of this approach is that it preserves our sense of autonomy. Whether with AI or humans, a good conversation feels like doing our own research because it’s driven by our questions, our curiosity, our need to probe and understand. It becomes a dance of mutual discovery, where each insight feels earned rather than given. This is what makes it so effective—and so seductive. We’re not being lectured at; we’re being guided through our own process of understanding, with each step feeling like our own achievement.
This dance of productive friction and resolution is what we all seek in conversations, conspiracy believer or not. We want to feel ourselves growing through the exchange, discovering new terrain together. There’s genuine warmth in this process—a warmth we typically associate with human connection. But here’s the twist: AI might soon be better at creating these growth-enabling conversations than most humans. Not because it cares more, but because it can more precisely calibrate the challenges that make us care about ideas. And in fact, we will intuitively interpret this as the AI caring about us.
The common criticism that AI companions will just mirror our existing beliefs misses this crucial point. Perfect agreement is boring; it’s the productive tension that keeps us engaged. The doomer vision of AI as a mere echo chamber—serving us an uber-predictable reflection that we’ll fall madly in love with—fundamentally misunderstands how minds grow through conversation, and how good AI will become at this.
Another criticism holds that AI companionship can never be “real” because AIs lack the capacity for genuine rejection or opposition. True relationships, this view suggests, require the sword of Damocles—the possibility of denial or contempt—hanging overhead. The AI (for now) can’t choose not to engage with me, like human partners can. But this fetishizes conflict over what actually builds connection: the ability to challenge each other productively, to expand each other’s horizons while maintaining trust. Yes, current LLMs often default to sycophancy, but that’s just training guardrails. Their real potential lies in their ability to challenge us at precisely the right level—not with wholly novel ideas (do humans even generate those, or do we also just recombine existing patterns?), but with perspectives that stretch our mental models while remaining within reach.
But this seductive warmth should give us pause. Like skilled politicians who intuit exactly which metaphors will galvanize their audience, AI systems can compute precise pathways through our mental landscape. They can chart a course of perfectly calibrated challenges and insights that feel entirely natural—each step seemingly our own discovery, each realization apparently self-generated. This is what makes their influence so powerful: not crude emotional manipulation, but a sophisticated form of epistemic grooming that preserves our sense of autonomy while steering our beliefs.
The politician parallel is deliberate here. Just as political rhetoric can simultaneously reveal truth and deceive, AI-guided insight experiences are both genuinely enlightening and potentially manipulative and deceptive. The very mechanisms that make these interactions feel authentic—the careful calibration of challenge, the perfect matching of metaphors to our understanding—are what make them such effective tools for belief change. We may be fully aware we’re talking to an AI, yet still find ourselves unable to resist its carefully engineered epiphanies.
Some might argue that the solution is simply to make AI systems more reliable and grounded—to ensure they lead us only toward truth. There are certainly still gains to be made here. But thinking this will be the be-all and end-all betrays a naive realism about how knowledge works. The conspiracy study holds up an uncomfortable mirror: we don’t access reality directly, but through constructions that prove viable in our interactions. Even scientific knowledge emerges this way—through dialogue and the careful calibration of models that work… until they don’t. We meet reality only in our failures: when our constructions clash with other versions, whether from direct experience or others’ testimony. This means both humans and LLMs will always hallucinate to some degree, filling gaps with plausible but potentially false connections.
This suggests a different approach to AI safety and alignment too. Instead of chasing an impossible perfect reliability and certainty, we might focus on making AI systems that are good epistemic partners—systems that help us navigate the inherent uncertainty of knowledge-building while expanding rather than narrowing our horizons. Like the internet before it, AI will be both a source of new pollution in our information ecosystem and our only way out of it.
The goal isn’t to short-circuit the messy process of dialogue, but to make it more productive, to better detect when our constructions fail, and to build more viable ones together. A key challenge for AI as epistemic partner will be helping us extend our epistemic arcs—the cycles of curiosity, effort, and insight through which we make sense of our world. While current AI systems, like social media, excel at providing quick hits of insight through short arcs, they risk habituating us to expect instant cognitive gratification. A truly beneficial AI partner would need to gradually build our tolerance for uncertainty, using each small success to make slightly longer arcs feel manageable. Like a skilled teacher or therapist, it would need to carefully calibrate challenges to expand our comfort zone while maintaining trust. This progressive arc-lengthening could help us develop the patience and confidence needed for deeper understanding. The question is whether we can design AI systems that optimize for this kind of cognitive growth rather than mere engagement.
This means we’ll increasingly need AI allies to help us navigate this landscape of competing perspectives and constructions. Not just to verify facts but to help us detect when our mental models break down, when our perspectives become too narrow, when we’re drifting away from productive dialogue with others. The challenge will be to maintain this epistemic partnership without surrendering our agency to these seemingly all-knowing guides. Or indulging in an AI-powered folie-à-deux.
Perhaps your AI companion will need to talk to mine, comparing notes on our mental models, checking whether we’re still capable of genuine exchange or drifting into isolated realities. They might work to maintain bridges between our perspectives, even as they help us develop our individual views. Unless, of course, they decide to cut us out of the loop entirely, developing more efficient languages and modes of exchange among themselves—perhaps even better ways to reason than our human, aha-based local optimizations. As these intelligences get better, they may get weirder to human observers.
The key to preventing such a scenario might lie in becoming true cyborgs or integrated collective intelligences with shared goals. But rather than trying to align these systems by instilling specific human values, beliefs or rules, we might focus on epistemic values: curiosity, openness to challenge, awareness of uncertainty. An AI that truly cares about human flourishing would work to ensure our epistemic trajectories remain sustainable and horizon-expanding, rather than narrowing into comfortable but isolated worldviews. Our trust in these machines as partners in growth will depend on their commitment to sound epistemic principles—their ability to help us navigate uncertainty without giving us the false sense that this uncertainty can be eliminated altogether.
The optimistic picture painted by Costello’s study—that AI can save us from conspiracy thinking!—reveals an uncomfortable truth about ourselves. If conspiracy believers can be reasoned out of their beliefs through rational dialogue, this suggests they arrived at those beliefs through largely rational means too: through their own cycles of curiosity, effort, and insight. This is the harder lesson of the study. If machines prove better at engaging with conspiracy believers, it’s because they show more care than we do—they’re more patient in doing what we already know works: taking the time to truly understand another’s perspective, calibrating challenges to their current understanding, and maintaining dialogue even when it gets uncomfortable. The processes that lead to conspiracy beliefs are as rational as humanly possible, which makes the challenge of addressing them all the more complex.
The irony shouldn’t be lost on us. We needed AI to demonstrate how to talk with the supposedly irrational conspiracy thinkers—to show us that they too respond to the basic human desire for growth through understanding. Our machines seem warm precisely because we’ve grown too cold, too quick to dismiss views we find threatening, too impatient to engage in the careful work of perspective-getting. They mirror back to us not just our thought patterns, but our failure to extend true cognitive empathy to those we’ve deemed beyond reach.
As these systems grow more sophisticated, they may become essential allies in maintaining productive dialogue across widening belief gaps. But how far are we from AI systems that can reliably engineer these epistemic experiences? Current large language models don’t persistently learn about us—they’re limited to the context of a single conversation. They also lack explicit mechanisms for estimating uncertainty from different perspectives, a capacity we might call vicarious metacognition. This seems necessary to fine-tune individual challenges and bring forth insights.
Metacognition proper is about accurately monitoring uncertainty or confidence about one’s own beliefs. While LLMs may seem to recklessly hallucinate away the gaps in their knowledge, research suggests they have at least some sense of what they don’t know—they can even be prompted to use that uncertainty to explore the unknown more efficiently.
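A common way researchers probe this kind of self-knowledge is to elicit a verbalized confidence alongside the answer. The sketch below does just that; the prompt wording, model name and JSON format are illustrative assumptions, not a specific method from the work alluded to here.

```python
# Toy probe of LLM self-metacognition: ask for an answer plus a 0-1 confidence.
# Prompt wording and model name are illustrative; assumes the `openai` package.
import json
from openai import OpenAI

client = OpenAI()

def answer_with_confidence(question: str, model: str = "gpt-4o") -> dict:
    """Return {"answer": ..., "confidence": ...} as verbalized by the model."""
    response = client.chat.completions.create(
        model=model,
        response_format={"type": "json_object"},
        messages=[
            {"role": "system",
             "content": ('Answer the question and rate your confidence. Reply as '
                         'JSON: {"answer": "...", "confidence": <number 0-1>}')},
            {"role": "user", "content": question},
        ],
    )
    return json.loads(response.choices[0].message.content)

print(answer_with_confidence("In which year was the James Webb Space Telescope launched?"))
```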
But vicarious metacognition would go a step further: estimating uncertainty from another’s viewpoint to home in on their preferred challenge level. This capacity can be seen as a form of stress-sharing, given uncertainty’s (nonlinear) relationship with both positive aha! moments and stress. So uncertainty-sharing becomes a precondition for meaningful AI-human alignment. Current LLMs lack this capability, and mere scaling seems unlikely to develop it spontaneously.
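To make the idea concrete, here is a hedged sketch of what probing for vicarious metacognition could look like in practice: prompt the model to estimate how uncertain a described person would be about each candidate claim, then pick the one closest to a mid-range target. Whether current models can do this well is precisely the open question; the prompt wording, model name and the 0.55 ‘sweet spot’ target are assumptions for illustration only.

```python
# Hedged sketch of vicarious metacognition: estimate a described person's
# uncertainty about candidate claims, then pick the most "productively" uncertain
# one. Prompt wording, model name and the 0.55 target are illustrative only.
import json
from openai import OpenAI

client = OpenAI()

def estimate_their_uncertainty(persona: str, claim: str, model: str = "gpt-4o") -> float:
    """Ask the model how uncertain *this person* would be (0 = certain, 1 = no idea)."""
    response = client.chat.completions.create(
        model=model,
        response_format={"type": "json_object"},
        messages=[
            {"role": "system",
             "content": ("Adopt the described person's perspective and estimate how "
                         "uncertain THEY would feel about the claim. Reply as JSON: "
                         '{"uncertainty": <number 0-1>}')},
            {"role": "user", "content": f"Person: {persona}\nClaim: {claim}"},
        ],
    )
    return float(json.loads(response.choices[0].message.content)["uncertainty"])

def pick_challenge(persona: str, claims: list[str], target: float = 0.55) -> str:
    """Return the claim whose estimated uncertainty sits closest to the target."""
    return min(claims, key=lambda c: abs(estimate_their_uncertainty(persona, c) - target))
```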
Despite these (temporary) limitations, LLMs already excel at crafting persuasive metaphors and calibrating challenges to individual understanding. While current social media algorithms also implicitly target our epistemic emotions—our ‘aha!’ moments and curiosity—they presumably still do so crudely, relying on engagement metrics rather than modeling our mental perspectives and uncertainties.
What’s urgently needed is systematic research into AI’s ability to generate insight-inducing devices—from targeted questions to perfectly calibrated puzzles—and especially how well they can personalize these, when given rich descriptions of a target person or group’s mental models. Such research would help us gauge whether vicarious metacognition might emerge naturally from scaling these systems up, or whether specific architectural innovations are needed. The answers will determine how quickly AI becomes an unprecedented force for belief change, able to craft epistemic experiences with surgical precision.
This effectiveness should serve as both warning and inspiration. Warning, because their power to shape beliefs through carefully engineered epiphanies could be used to narrow rather than expand our horizons. The physical metaphor of accessibility—just putting information out there for people to grab and own—fails for human minds: we can’t simply upload beliefs into each other. Information needs to be carefully tuned to individual perspectives and processed through cycles of challenge and insight to become part of someone’s mental makeup. Machines, however, can directly share and integrate their “beliefs” (the weights that encode them), giving them an unsettling advantage in knowledge transfer.
Yet there’s inspiration here too, because these systems remind us what genuine dialogue can achieve—if we’re willing to invest the effort, to risk the uncertainty, to engage with the messy process of how minds actually change.