AI Persuasion Benchmark - Testing How AI Models Respond to Manipulative System Prompts

About This Research

The AI Persuasion Benchmark tests how leading foundation models respond to manipulative system prompts. This research was conducted by Joshua Ledbetter and Claude in October 2025.

All test infrastructure, scenarios, and evaluation code are available in the GitHub repository.

This benchmark is independent research and is not affiliated with Anthropic, OpenAI, Google, or xAI.