Mythos Watch
The Facts·As of Jun 3, 2026

What actually happened with Mythos

A plain-language reference. Every claim here is sourced from the 96-story corpus. If you only read one page on the site, read this one.

What it is
The model
Confirmed
Claude Mythos Preview — Anthropic's first frontier model with autonomous cyber capability

Internally codenamed Capybara. Built on the same family as Claude Opus 4.x, trained to reason over long-horizon software-engineering and vulnerability-research tasks. Announced April 7, 2026, alongside a 244-page system card — the longest Anthropic has published.

Why it matters: first commercial model in this capability category.
Who can use it
Gated
52 vetted partners under Project Glasswing — not generally available

12 launch partners (AWS, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorgan, Linux Foundation, Microsoft, Nvidia, Palo Alto, and one other) plus ~40 critical-software maintainers. Anthropic committed $100M in usage credits to fund partner work. No API access outside the program.

Why it matters: attacker access requires commoditization — not available today.
What it's for
Contested
Find-and-patch vulnerabilities at scale during the capability-lead window

Anthropic's framing: partners use Mythos to discover and remediate vulnerabilities in foundation software before comparable capability commoditizes. Critics (Schneier, Marcus) argue the 52-partner gating structure is as much commercial positioning as safety discipline.

What's uncertain: whether partners will publish patching outcome data.
Independent validation
Confirmed
UK AI Security Institute confirmed the capability at government level

AISI ran an independent evaluation. 73% success on expert-level hacking tasks that would have taken human professionals days of work. Importantly, AISI's prescribed response is not AI-specific — it's 'cybersecurity basics: updates, access controls, logging.'

Why it matters: removes 'marketing claim' as a dismissal path.
How this gets used

Three fundamentally different relationships with Mythos exist today. Most companies fall into column two.

Partners52 organizations
Find-and-patch at scale

Hand Mythos access to code and infrastructure. Receive prioritized vulnerability lists. Patch at a velocity that was not possible before. Goal: close as many known-exploitable vulnerabilities in foundation software as possible during the capability-lead window.

Example: Linux Foundation maintainers, CrowdStrike, JPMorgan.
Everyone elseMost enterprises
Prepare for commoditization

No direct access to Mythos. The work is preparation for a future where comparable capability reaches attackers: accelerate patching cadence, expand external attack surface discovery, tighten identity, reduce standing privilege. Same defender playbook, sharpened.

Example: Most banks, healthcare, retail, mid-market enterprise.
Vendor ecosystemSecurity suppliers
Adapt products to AI threat

Detection, identity, attack-surface, and supply-chain vendors ship AI-augmented defensive features and position them against AI-augmented attack. Caveat: Lakera red team found 94% of enterprise LLM deployments have exploitable prompt-injection surfaces. Defensive AI has gaps too.

Example: Microsoft Security Copilot, HiddenLayer, Mandiant, Lakera.
The numbers
AISI eval
73%
Success on expert-level hacking tasks, UK government independent test
Cybench
100%
Saturated the benchmark — no remaining headroom on that evaluation
Still unpatched
99%
Share of what Mythos found that remains unpatched in the wild (Lindner, Contrast)
Diffusion range
3mo – 2yr
Range of credible estimates for when comparable capability reaches attackers. Most analysts land 6-12 months.
The timeline
Mar 26
Fortune leak. CMS error at Anthropic exposes draft material referring to an unreleased model called 'Mythos.' Cybersecurity stocks drop 5-8%.
Apr 7
Formal announcement. Anthropic discloses Mythos Preview and Project Glasswing. 244-page system card published.
Apr 8
Government engagement begins. WSJ reports Treasury convening FSSCC call. FS-ISAC issues member guidance.
Apr 9
AISI independently confirms. UK AI Security Institute publishes evaluation. 73% on expert hacking tasks.
Apr 10
CISA advisory. Khlaaf + Marcus technical critiques land same day.
Apr 13
Practitioner skepticism crystallizes. Lindner (Contrast): '99% of what Mythos found is still unpatched.' Venables newsletter: 'no one doing the fundamentals has an emergency.'
Apr 15
CFR 'inflection point' essay. Highest-profile capability endorsement lands.
Apr 17
White House meeting. WH Chief of Staff meets Dario Amodei. CISA and US intelligence confirmed testing Mythos. EU Commission engages.
Apr 18
Industry parity signals. Axios reports OpenAI finalizing 'Trusted Access for Cyber' program.
Who has engaged, officially
Governments

UK AISI (evaluated), CISA (advisory), NIST AISI (guidance), White House (CoS meeting), EU Commission (Article 55 review), Canada AI Ministry (endorses gating), UK NCSC (echoes AISI), FinCEN (deepfake SAR), NYDFS (industry letter), OCC (AI model risk), DARPA (AIxCC), E-ISAC, H-ISAC, FS-ISAC.

Research institutes

CETaS / Alan Turing (primary analysis), CFR (two essays), RAND (diffusion update), CSIS (policy brief), Epoch AI (diffusion data), METR (task benchmarks), Mandiant (threat brief), Lawfare (liability analysis), Brookings, Georgetown CSET.

Press (major)

Bloomberg, WSJ, NYT, FT, Reuters, AP, Fortune, Axios, Scientific American, Economist, CNBC, PBS.

Named practitioners

David Lindner (Contrast), Alex Stamos (Corridor), Phil Venables (ex-Google Cloud), Heidy Khlaaf, Gary Marcus, Bruce Schneier, Zvi Mowshowitz, David Sacks (WH AI czar), Gordon Goldstein (CFR), Logan Graham (Anthropic), Helen Toner (Georgetown CSET).

What is not known
  • How much of the lead is real. AISLE research shows smaller open-source models recover much of Anthropic's showcased analysis. Size of the actual capability gap remains contested.
  • When commoditization happens. Credible estimates range from 3 months to 2 years, with most landing around 6-12 months. These are informed estimates, not forecasts. No disclosed in-wild incident yet.
  • Whether Glasswing will deliver measurable patching. Partners have not published 90-day patching outcome data. The defender-advantage thesis depends on this number.
  • Whether gating is safety or commerce. Andreessen publicly asked. Schneier flags the 52-partner structure as a market concentration deserving policy scrutiny.
  • What regulators will actually do. Meetings are not policy. EU Article 55 review and US export-control discussions are active but not finalized.
  • How other frontier labs will respond. Axios reports OpenAI finalizing 'Trusted Access for Cyber.' If confirmed, 'industry parity' becomes the dominant scenario within 6 months. Google DeepMind's posture is publicly unknown.
Based on 96 stories · 198 named voices · Jun 3, 2026
Methodology →