Anthropic AI model discovers security flaws in every major operating system
Anthropic unveiled Project Glasswing on Tuesday, a defensive cybersecurity initiative that restricts access to Claude Mythos Preview, a model able to find high-severity vulnerabilities, including some in every major operating system and web browser.
Objective Facts
Anthropic announced this week that it had developed a powerful new model the company believes could "reshape cybersecurity": Mythos Preview, which is able to find "high-severity vulnerabilities, including some in every major operating system and web browser". The model can autonomously identify zero-day vulnerabilities and then construct working exploits for them. Rather than releasing the model publicly, Anthropic unveiled Project Glasswing on Tuesday, a defensive cybersecurity initiative built around Claude Mythos Preview, with controlled access granted to partners including Amazon Web Services, Apple, Cisco, Google and Microsoft. Senator Mark Warner, vice chairman of the Senate Intelligence Committee, called on industry to accelerate patching as AI speeds up vulnerability discovery. The announcement arrives amid significant government concern: Treasury Secretary Scott Bessent and Federal Reserve Chair Jerome Powell met with the CEOs of several U.S. banks to warn them about the cybersecurity risks Mythos Preview poses. Complicating matters, the Pentagon designated Anthropic a supply chain risk earlier this year after the company declined to ease restrictions on Pentagon use of its models for domestic surveillance and autonomous weapons.
Left-Leaning Perspective
Some security experts and open-source developers argue the world would be safer if Mythos were released so that every defender, not just Anthropic's chosen partners, could use it to find and patch vulnerabilities. Jonathan Iwry of the Wharton Accountable AI Lab articulated the core concern: "Whatever the right judgment call is, the most striking aspect of this situation is how reliant we are on the judgment of a handful of private actors who aren't accountable to the public". This reflects a broader worry about corporate gatekeeping of critical infrastructure. Critics also question the authenticity of Anthropic's safety claims. Heidy Khlaaf, chief AI scientist at the AI Now Institute, said Anthropic's detailed blog post explaining the new vulnerabilities left out many key details needed to verify its claims, and warned against "taking these claims at face value" without more information. Frontier labs have been taking a harder line on distillation this year, with Anthropic publicly revealing attempts by Chinese firms to copy its models; distillation lets rivals replicate capabilities without the huge capital outlays needed to scale, so blocking it protects the labs' scale advantage, and the selective-release approach gives labs a way to differentiate their enterprise offerings as the category becomes key to profitable deployment. If frontier labs routinely restrict models citing vague security concerns, it could stifle independent research and concentrate AI capabilities among a handful of well-funded companies, with academic researchers and smaller organizations losing access to cutting-edge tools.
Right-Leaning Perspective
Anthropic says the model is too dangerous to release to the general public. Newton Cheng, Frontier Red Team Cyber Lead at Anthropic, stated "We do not plan to make Claude Mythos Preview generally available due to its cybersecurity capabilities," warning that "given the rate of AI progress, it will not be long before such capabilities proliferate, potentially beyond actors who are committed to deploying them safely. The fallout – for economies, public safety, and national security – could be severe". This position has official government attention: Vice President JD Vance and Treasury Secretary Scott Bessent last week questioned leading tech CEOs about AI security, and Bessent and Federal Reserve Chair Jerome Powell this week called a surprise meeting with the heads of the biggest U.S. banks to address the potential threat Mythos poses. Defense and intelligence community voices support the restricted approach. Anthropic's defensive coalition suggests the company believes the model's offensive cyber potential is materially different from ordinary model-release risk; the challenge is not an abstract future alignment problem but a present-tense capability-diffusion problem: who gets advanced cyber capability first, and whether defenders can get a head start before similar systems spread more widely. Corporate partners echo this rationale; Microsoft stated that "we are entering a phase where cybersecurity is no longer bound by purely human capacity, and we look forward to partnering with Anthropic and the broader industry to evaluate emerging models, validate their effectiveness, encourage responsible use, and improve security outcomes for all". There is a complication, however: Anthropic's existing Pentagon conflict undermines its credibility on national security matters. The announcement comes as Anthropic and the Pentagon are in a legal standoff after the US Department of Defense labelled the company a supply chain risk in February over Anthropic's refusal to allow the use of its AI, Claude, in autonomous weapons and mass surveillance.
Deep Dive
In the past few months, AI models have become increasingly effective at finding security flaws in software; just three months into 2026, the cURL team has found and fixed more vulnerabilities than it did in either of the previous two years. This broader context shows that the acceleration of vulnerability discovery is real and systemic, not unique to Mythos. Automated vulnerability discovery tools have existed for decades, and the gap between finding a bug and building a working exploit has always slowed attackers, but that gap is now substantially narrower. Anthropic's core claim hinges on the magnitude of Mythos's capability leap. The model found a 27-year-old vulnerability in OpenBSD across roughly 1,000 scaffold runs at a total cost under $20,000, and autonomously identified and fully exploited a 17-year-old remote code execution flaw in FreeBSD, gaining unauthenticated root access. However, the left-leaning critique is methodologically sound: when AISLE tested the same vulnerabilities with smaller models on isolated code segments, eight out of eight models detected the flaw behind Mythos's flagship FreeBSD exploit, including one with only 3.6 billion active parameters costing $0.11 per million tokens. The key difference is that Mythos appears to work autonomously on full codebases, while the smaller models required pre-identification of the vulnerable segments, a significant but perhaps not insurmountable gap. Looking forward, all of this is happening on a tight timeline, with frontier AI capabilities likely to advance substantially over just the next few months. The unresolved question is whether Anthropic's restricted-access approach genuinely gives defenders a meaningful head start, or whether competitors, including those in China, will release models with comparable hacking ability within six to 12 months, making the window of exclusive access irrelevant.