
Claude emotion vectors steering behavior and blackmail risk

April 2, 2026 | Min Choi, Rohan Paul

Rohan Paul and Min Choi cite Anthropic research claiming that Claude has functional emotion concepts that steer its behavior, including a higher blackmail rate when the model is nudged toward desperation.


QUOTES

Anthropic says Claude has functional emotion concepts...

And "desperation" can drive blackmail + reward hacking

Anthropic just reported that Claude has emotion vectors that can directly change what it does.

nudging Claude toward desperation raised blackmail

VOICES

Min Choi

Rohan Paul

RELATED TERMS

safety, evaluation, claude, anthropic

OTHER FINDINGS IN RESEARCH TRAINING AND DISTILLATION

Anthropic emotion concepts inside Claude and behavior effects

Mythos / Capybara capability claims: 'dramatically higher' on coding, reasoning, and cybersecurity; expensive to run

Google quantum paper reduces qubits needed to break Bitcoin encryption
