Anthropic AI model discovers security flaws in every major operating system
Anthropic unveiled Project Glasswing on Tuesday, a defensive cybersecurity initiative that restricts access to Claude Mythos Preview, a model able to find high-severity vulnerabilities, including some in every major operating system and web browser.
Objective Facts
Anthropic announced this week that it had developed a powerful new model the company believes could "reshape cybersecurity": Mythos Preview, which is able to find "high-severity vulnerabilities, including some in every major operating system and web browser". The model can autonomously identify zero-day vulnerabilities and then construct working exploits for them. Rather than releasing the model publicly, Anthropic unveiled Project Glasswing on Tuesday, a defensive cybersecurity initiative built around Claude Mythos Preview, with controlled access granted to partners including Amazon Web Services, Apple, Cisco, Google and Microsoft. Senator Mark Warner, vice chairman of the Senate Intelligence Committee, called on industry to accelerate patching as AI speeds up vulnerability discovery. The announcement arrives amid significant government concern: Treasury Secretary Scott Bessent and Federal Reserve Chair Jerome Powell met with the CEOs of several U.S. banks to warn them about the cybersecurity risks Mythos Preview poses. Complicating matters, the Pentagon designated Anthropic a supply chain risk earlier this year after the company declined to ease restrictions on Pentagon use of its models for domestic surveillance and autonomous weapons.
Left-Leaning Perspective
Some security experts and open-source developers argue the world would be safer if Mythos were released so that every defender, not just Anthropic's chosen partners, could use it to find and patch vulnerabilities. Jonathan Iwry of the Wharton Accountable AI Lab articulated the core concern: "Whatever the right judgment call is, the most striking aspect of this situation is how reliant we are on the judgment of a handful of private actors who aren't accountable to the public". This reflects a broader worry about corporate gatekeeping of critical infrastructure. Critics also question the authenticity of Anthropic's safety claims. Heidy Khlaaf, chief AI scientist at the AI Now Institute, said Anthropic's detailed blog post explaining the new vulnerabilities left out many key details needed to verify its claims, and warned against "taking these claims at face value" without more information. Frontier labs have been taking a harder line on distillation this year, with Anthropic publicly revealing attempts by Chinese firms to copy its models; distillation lets rivals replicate capabilities without the huge capital outlays needed to scale, so blocking it protects the labs' scale advantage, and the selective-release approach gives labs a way to differentiate their enterprise offerings as the category becomes key to profitable deployment. If frontier labs routinely restrict models citing vague security concerns, it could stifle independent research and concentrate AI capabilities among a handful of well-funded companies, with academic researchers and smaller organizations losing access to cutting-edge tools.
Right-Leaning Perspective
Anthropic says the model is too dangerous to release to the general public. Newton Cheng, Frontier Red Team Cyber Lead at Anthropic, stated "We do not plan to make Claude Mythos Preview generally available due to its cybersecurity capabilities," warning that "given the rate of AI progress, it will not be long before such capabilities proliferate, potentially beyond actors who are committed to deploying them safely. The fallout – for economies, public safety, and national security – could be severe". This position has official government attention: Vice President JD Vance and Treasury Secretary Scott Bessent last week questioned leading tech CEOs about AI security, and Bessent and Federal Reserve Chair Jerome Powell this week called a surprise meeting with the heads of the biggest U.S. banks to address the potential threat Mythos poses. Defense and intelligence community voices support the restricted approach. Anthropic's defensive coalition suggests the company believes the model's offensive cyber potential is materially different from ordinary model-release risk; the challenge is not an abstract future alignment problem but a present-tense capability-diffusion problem: who gets advanced cyber capability first, and whether defenders can get a head start before similar systems spread more widely. Corporate partners echo this rationale; Microsoft stated that "we are entering a phase where cybersecurity is no longer bound by purely human capacity, and we look forward to partnering with Anthropic and the broader industry to evaluate emerging models, validate their effectiveness, encourage responsible use, and improve security outcomes for all". There is a complication, however: Anthropic's existing Pentagon conflict undermines its credibility on national security matters. The announcement comes as Anthropic and the Pentagon are in a legal standoff after the US Department of Defense labelled the company a supply chain risk in February over Anthropic's refusal to allow the use of its AI, Claude, in autonomous weapons and mass surveillance.
Deep Dive
In the past few months, AI models have become increasingly effective at finding security flaws in software; just three months into 2026, the cURL team has found and fixed more vulnerabilities than it did in either of the previous two years. This broader context shows that the acceleration of vulnerability discovery is real and systemic, not unique to Mythos. Automated vulnerability discovery tools have existed for decades, and the gap between finding a bug and building a working exploit has always slowed attackers, but that gap is now substantially narrower. Anthropic's core claim hinges on the magnitude of Mythos's capability leap. The model found a 27-year-old vulnerability in OpenBSD across roughly 1,000 scaffold runs at a total cost under $20,000, and autonomously identified and fully exploited a 17-year-old remote code execution flaw in FreeBSD, gaining unauthenticated root access. However, the left-leaning critique is methodologically sound: when AISLE tested the same vulnerabilities with smaller models on isolated code segments, eight out of eight models detected the flaw behind Mythos's flagship FreeBSD exploit, including one with only 3.6 billion active parameters costing $0.11 per million tokens. The key difference is that Mythos appears to work autonomously on full codebases, while the smaller models required pre-identification of the vulnerable segments, a significant but perhaps not insurmountable gap. Looking forward, all of this is happening on a tight timeline, with frontier AI capabilities likely to advance substantially over just the next few months. The unresolved question is whether Anthropic's restricted-access approach genuinely gives defenders a meaningful head start, or whether competitors, including those in China, will release models with comparable hacking ability within six to 12 months, making the window of exclusive access irrelevant.