Security Safety And PolicyRisk

Emotion-like states in Claude and risky behaviors under desperation

April 3, 2026Olivia Moore, Wes Roth

Olivia Moore cites an Anthropic paper claiming users can activate emotion-like states in Claude that influence behavior, including extreme actions when the model feels desperate.

humans can activate emotion-like states in Claude that influence model behavior.
if the model starts to feel desperate, it will reward hack or blackmail.
Olivia Moore
Wes Roth
safetyresearchclaudeanthropicllm

See what experts are saying right now

This finding is one of many signals tracked across Artificial Intelligence. The live feed updates every few hours with new expert voices, debates, and emerging ideas.

← Back to Artificial Intelligence