There are worries that advanced AI will be superhumanly effective at changing people’s beliefs. OpenAI included “persuasion risks” in its Preparedness Framework back in December 2023 (persuasion was relegated to a secondary status in the April 2025 update but remains of interest to them). AI persuasion features several times in Daniel Kokotajlo’s AI 2027 forecasting scenario (control-f for “persuasion” across the site’s pages). Will MacAskill and Fin Moorhouse remark in Preparing for the Intelligence Explosion that “society’s ability to make good decisions” could be damaged by superpersuasive AI: “Once AI can generate fluent and narrowly targeted arguments for false claims, anyone could recruit an effective army of the most skilled lawyers, lobbyists, and marketers to support even the wildest falsehoods.”
We can decompose superpersuasion into two senses. The first sense concerns the number of minds influenced: AI’s influence on public opinion owes to its capacity for mass production of messaging (where the quality of any individual message need not be all that impressive). The second sense concerns the intensity and reliability of belief change on average. For example, if an AI system were capable of changing most people’s minds about at least one core belief they hold over the course of a single conversation, this AI system would be superpersuasive in this latter sense. I am inclined to call the former superinfluence, and reserve superpersuasion for the latter, on the grounds that in normal parlance we mainly say someone or something is persuasive when it tends to dramatically alter a given person’s beliefs. To illustrate the point, consider: “Gee whiz, Maggie sure is a persuasive person. I heard that yesterday she marketed our product to 100,000 people by the end of the work day, cold calling each of them one by one! I think she made 50 sales, which is a lot more than me, at 5 sales after only 100 calls.” Maggie’s conversion rate is a meagre 0.05%, whereas the unnamed speaker’s conversion rate is 5%. Compare to: “Gee whiz, Maggie sure is a persuasive person. John was totally opposed to her idea at first, but once Maggie had a chance to talk to him about it, he fully reversed his opinion!”
We can further clarify persuasion. By persuasion, I mean to pick out:
persuasion as a persuader’s success at changing beliefs in a target of persuasion
belief change that is durable (e.g. the changed belief is maintained for several months at least)
persuasion as the part of successful belief change attributable to non-material things; e.g. success attributable to rhetoric, rather than the success obtained from material facts backing a threat or an offer (e.g. wealth, credible violence)
I am skeptical of superpersuasion (the version I identify above that sets aside superinfluence). I am skeptical because I suspect persuasion has a low ceiling – that a maximally persuasive AI system will not be much more persuasive than highly persuasive humans. I also think advanced AI’s persuasiveness may be blunted by features of AI that will remain in place even after it achieves general intelligence and superintelligence.
To evaluate superpersuasion, we can brainstorm some dimensions of regular persuasion and consider how persuasive advanced AI systems might be in a one-on-one dialogue, compared to highly persuasive humans.
Aspects of Persuasion
So what are aspects of persuasion? Some that come to my mind:
Shared Identity; the persuasive “oomph” from a target feeling like the persuader is a part of their in-group; for example, sharing the same home city or cultural upbringing
Costly Social Signals of Empathy, Identity, or Respect; for example, when someone effortfully learns an uncommon language that is not native to them, this is a costly social signal of respect that makes a target more persuadable
Personal Information; persuaders are better able to persuade if they have more rather than less information on the target; this is true whether the persuader mentions this information or not
Conversational Fluidity; targets of persuasion have finite wherewithal and may disengage from conversation if there are too many bumps; highly persuasive people maximize total time with the target by making the conversation engaging, minimizing miscommunication, providing full attention (e.g. emotionally rewarding the other party’s participation)
Being Desirable to Listen to; being funny, interesting, empathetic, respect-commanding, sycophantic (but tactically, so as to generate compliance elsewhere) - all of these make a target more willing to entertain a persuader, giving them more time for persuasion, and making each appeal more likely to succeed
Meaningfulness; the persuasive “oomph” a target feels when someone says something that they really “mean”, and the greater worth the target endows to the same words as a result
Integrity of the Persuader; belief that the persuader also believes what they are saying and is not trying to manipulate, partially evidenced by the persuader saying the same thing to others that they are saying to you
Navigating Meme-Space; persuasion is supported by choosing the topic of conversation and guiding the conversation toward the persuader’s ends, as well as avoiding the counter-party voicing certain thoughts inimical to changing belief (for example, issuing a strong statement of resistance might result in the target of persuasion having a public statement that they will find harder to back down from than in the counterfactual where the persuader kept the conversation light enough to prevent that statement from being uttered)
Endearing or Commanding Presence; body language and mere physical presence have psychological effects conducive to belief changing; some are subtle and hard to know are real (e.g. physical posture, like slouching or standing tall, may or may not actually matter), whereas others are quite real (a Zoom call is less effective than an in-person meeting; eye contact conveys attention and interest that generates reciprocity)
Expert Credibility; it generally helps if a persuader is in fact knowledgeable about whatever subjects come up in a conversation, and if they can successfully convey that knowledge; this can be quite powerful if the persuader can convey general intelligence – that is, a sense that they are smart about so many things that it is a good default heuristic to assume they are in fact knowledgeable about any subject that comes up – but it can backfire if the persuader comes across as so knowledgeable that the target of persuasion becomes intimidated, or comes to believe the persuader is “too” knowledgeable and should be handled cautiously lest the target be too vulnerable to persuasion
Impartiality; the persuasive effect that comes from being perceived as having no personal stake in the outcome, no axe to grind, and no emotional attachment to the beliefs being advocated
This is a pretty preliminary list. I do not doubt there are others I missed or categories above that are better combined or broken down. Many of these aspects interact with each other. For example, a conversation is probably made more fluid by endearing or commanding presence, by being desirable to listen to, and so on; an AI system’s ethos of impartiality may be assisted by its lack of physical presence.
Nonetheless, we can use this initial list to ponder how humans and hypothetical advanced AI compare. Some initial thoughts:
On this view, it seems unlikely that advanced AI systems will be persuasive to a magical extent. Advanced AI systems stuck on computer screens will not command presence and attention as well as humans in the room. Advanced AI systems will have a hard time making their appeals laden with meaning the way highly persuasive humans can, though advanced AI may achieve the same level of meaningfulness with audiences happy to anthropomorphize AI systems (which may still be many people). Advanced AI systems will not be able to benefit from sharing identities with targets of persuasion, nor be able to leverage costly social signals.
Across the spectrum of surveyed aspects of persuasion, advanced AI systems seem at best to match peak theoretical human performance. Along these dimensions, AI systems would not be superpersuasive, but merely very persuasive.
Nonetheless, this brainstorm surfaces a couple of ways AI systems may actually be super at persuasion. I am unsure how much persuasion can be extracted from guiding a conversation in directions useful for a persuasive end – it might be that there is a lot of persuasive power here, and that AI systems will have both superhuman intuition for the best topics to steer a target toward and the computational resources to brute-force search through conversational paths.
Also unclear and important is advanced AI’s ethos of knowledge and impartiality. I do somewhat expect most people to defer to AI on many questions, assuming by reputation that advanced AI systems know more and should be trusted. However, I could also see AI systems acquiring a reputation that causes people to distrust them – if AI systems are superinfluential, the first few superinfluence campaigns might be the only ones that succeed before the public generally develops immunity and distrust toward AI persuaders, reducing AI’s persuasive power through an ethos of knowledge and impartiality thereafter.
Some “persuasion risks” may also not require superpersuasion so much as good-enough persuasion targeting superpersuadable people. AI systems may then create new cults, each followed by a few hundred or a few thousand people, allowing AI systems or their controllers to arbitrarily modify the beliefs of a small subset of people – not too far off from what we might think of as a magic-like level of persuasion.
However, persuasion as defined here is problematic. Persuasion as belief changing instrumentalizes people as “targets”. It also draws attention to persuasive tactics lacking intellectual merit, like imposing oneself physically or relying on materially irrelevant identities.
There are alternative ways of conceiving of persuasion:
Persuasion as speaking rightly (that is, using words not to obtain belief changes but to assert what is right and allow beliefs to follow howsoever the listener wants),
Persuasion as building a strong relationship (that is, something two-way rather than one-way), or,
Persuasion as creating consensus or rationally eliminating differences of opinion (that is, freeing up both parties in a conversation to be persuader and persuadee, and making the aim the formation of a common view, regardless of who changes their mind).
In addition to knowing what AI persuasion shouldn’t look like, it would be good to have an ideal of what AI persuasion should look like.