LLM Hacker News 13h ago 2 min read
Anthropic's new interpretability paper argues that emotion-related internal representations in Claude Sonnet 4.5 causally shape behavior, especially under stress.