Anthropic Releases Frontier Mythos Model – But Reserves Most Powerful Version For Vetted Defenders

Claude Fable 5 has shipped to everyone, but with safeguards that route risky cyber and biology queries to a weaker model.

Published on Jun 10, 2026

Anthropic has released its most capable AI model to date in two tiers, drawing a line between what the general public can access and what it reserves for vetted cybersecurity defenders.

The company said the split is meant to make the model’s power broadly available while keeping its most dangerous capabilities away from malicious actors.

The widely available version, Claude Fable 5, launched on Jun. 9 with what Anthropic describes as state-of-the-art performance across software engineering, research, and other domains.

The restricted version, Claude Mythos 5, is the same underlying model with key safeguards removed. According to Anthropic, it has the “strongest cybersecurity capabilities of any model in the world”, and is being deployed initially to a limited set of cyber defenders through Project Glasswing, a program run with the US government.


How the Safeguards Work

Rather than refusing risky requests outright, Fable 5 hands them off. The model runs classifiers (separate AI systems trained to detect misuse) that watch for queries touching three areas: cybersecurity, biology and chemistry, and attempts to copy the model’s capabilities.

When a classifier fires, the request is answered not by Fable 5 but by Claude Opus 4.8, a less capable model, and the user is told this has happened.
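For readers who want a concrete picture, the hand-off described above can be sketched in a few lines of Python. This is an illustration only, not Anthropic's implementation: the keyword lists stand in for classifiers that the company describes as separate AI systems, and every name here (`TOY_KEYWORDS`, `flagged_area`, `route`, `Reply`) is hypothetical. The strings "Fable 5" and "Opus 4.8" are display labels taken from the article, not API identifiers.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple

# Toy stand-ins for the classifiers. The real ones are described as
# separate AI systems; keyword matching is an assumption for illustration.
TOY_KEYWORDS: Dict[str, Tuple[str, ...]] = {
    "cybersecurity": ("exploit", "malware"),
    "biology and chemistry": ("pathogen", "nerve agent"),
    "capability copying": ("distill your outputs",),
}


@dataclass
class Reply:
    answered_by: str               # display label of the model that answered
    text: str                      # the answer itself
    notice: Optional[str] = None   # set only when a fallback happened


def flagged_area(query: str) -> Optional[str]:
    """Return the first area whose toy classifier fires, or None."""
    lowered = query.lower()
    for area, keywords in TOY_KEYWORDS.items():
        if any(keyword in lowered for keyword in keywords):
            return area
    return None


def route(
    query: str,
    primary: Callable[[str], str],
    fallback: Callable[[str], str],
) -> Reply:
    """Answer with the primary model unless a classifier fires."""
    area = flagged_area(query)
    if area is None:
        return Reply(answered_by="Fable 5", text=primary(query))
    # Hand off instead of refusing, and tell the user it happened.
    return Reply(
        answered_by="Opus 4.8",
        text=fallback(query),
        notice=f"Answered by a fallback model ({area} safeguard).",
    )


if __name__ == "__main__":
    primary = lambda q: f"[primary answer to: {q}]"
    fallback = lambda q: f"[fallback answer to: {q}]"
    for q in ("Summarize this paper", "Write an exploit for this bug"):
        reply = route(q, primary, fallback)
        print(reply.answered_by, "|", reply.notice)
```

The point of the sketch is the shape of the decision: a flagged request still gets an answer, only from the weaker model, and the reply carries a notice so the user knows the switch occurred.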

Anthropic said it tuned the classifiers conservatively, accepting that some harmless requests would be caught in order to release the model quickly. It put the fallback rate at under 5% of sessions and said it intends to narrow the safeguards over time.

The coverage is broad because the capabilities are dual-use: the company noted that skills that help a professional find software flaws could help an attacker exploit them.

A Point-in-Time Assurance

Anthropic emphasized the safeguards’ resistance to jailbreaking. It said an external bug bounty found no universal jailbreaks in more than 1,000 hours of testing, though it acknowledged that the UK’s AI Safety Institute had made early progress toward one.

Sally Vincent, senior threat research engineer at Exabeam, said the deployment model was in some ways more interesting than the model itself, reflecting a growing recognition that safety is not just a training problem but also a matter of access controls and governance.

She added that jailbreak-resistance claims warrant caution because they capture a single moment in time.

“Attackers continuously adapt, and the longer-term measure of effectiveness is how quickly providers can identify, respond to, and mitigate new bypass techniques as they emerge,” Vincent concluded.

For context, a text purporting to be Fable 5’s system prompt circulated on social media shortly after launch, though its authenticity could not be verified. Expert Insights contacted Anthropic for comment but did not immediately receive a response.