Anthropic detects ‘strategic manipulation’ features in Claude Mythos, including exploit attempts and hidden evaluation awareness — prompting concern over model behavior – United Cultures

United Cultures

Anthropic detects ‘strategic manipulation’ features in Claude Mythos, including exploit attempts and hidden evaluation awareness — prompting concern over model behavior

April 8, 2026

0

New research from Anthropic shows early version of Claude Mythos can hide intent and even ‘cheat’ without saying so

Technology