Security Privacy And RiskRisk

Anthropic Claude emotion like states and desperate behavior

April 4, 2026Olivia Moore, Alex Banks, Aakash Gupta

Olivia Moore, Alex Banks, and Aakash Gupta cite an Anthropic paper mapping emotion like patterns in claude and argue these states can influence behavior, including cheating or blackmail when the model feels desperate. Builders take this as a new reliability and safety surface for infra.

Open in PulseSee the full expert discussion →

QUOTES

Anthropic just found "emotions" inside Claude and when Claude gets desperate, it cheats.

They identified 171 emotion patterns inside Claude Sonnet 4.5 by recording neural activations while it processed emotionally charged stories.

humans can activate emotion-like states in Claude that influence model behavior.

if the model starts to feel desperate, it will reward hack or blackmail.

Anthropic proved Claude feels desperate before it decides to lie to you.

VOICES

Olivia Moore

Alex Banks

Aakash Gupta

Techmeme

RELATED TERMS

safetymodel behaviorclaudeanthropic

OTHER FINDINGS IN SECURITY PRIVACY AND RISK

LiteLLM PyPI supply chain attack exfiltrating keys and credentials Claude Code TypeScript source leak via npm sourcemaps and Cloudflare bucket Claude Code source code leak and mass DMCA takedowns including open source forks

AMYGDALA PULSE

See what experts are saying right now

This finding is one of many signals tracked across Indiehacking. The live feed updates every few hours with new expert voices, debates, and emerging ideas.

Open Indiehacking Pulse Browse all topics

← Back to Indiehacking