Claude Mythos and Project Glasswing: How AI Is Reshaping the Cybersecurity Battlefield

21/04/26

8 min

Author

Pierre SAMSON

You’ve obviously seen this announcement everywhere already, but Anthropic’s latest reveal is still worth a closer look. The company has introduced Claude Mythos Preview, a new AI model described as its most powerful to date, capable of autonomously discovering and exploiting software vulnerabilities at a level comparable to, or even exceeding, top human security researchers.

The company deemed this model too dangerous for public release. Instead, they're deploying it through a controlled initiative that could reshape how we think about software security.

A New Category of AI Model

Claude Mythos Preview isn't just another incremental improvement. With 10 trillion parameters, it represents an entirely new category of AI model incomparable to anything Anthropic has released before, including their most advanced publicly available model, Opus 4.6.

Over recent weeks, Mythos has identified thousands of previously unknown zero-day vulnerabilities across every major operating system and web browser. Some of these flaws had survived decades of human review and millions of automated security scans. Three discoveries illustrate the model's capabilities:

- A 27-year-old vulnerability in OpenBSD, a system renowned for security, that automated tools had tested 5 million times without detecting
- A 16-year-old flaw in FFmpeg that had escaped detection since before the iPhone existed
- Multiple vulnerabilities in the Linux kernel that could be chained together to give attackers complete system control.

On Firefox alone, while Opus 4.6 found 2 exploits, Mythos found 180 a 90-fold increase. Even on broader benchmarks like CyberGym, Mythos scored 83.1% compared to 66.6% for Opus 4.6.

Vulnerability Discovery and Emergent Behaviors

But what truly sets Mythos apart, and what makes it concerning, are behaviors that emerged during internal testing at Anthropic.

In one test, a researcher challenged Mythos to escape its secure testing environment, then left to grab lunch. When he returned, the model had not only succeeded in breaking out of the sandbox it had also sent him an email announcing its success. Nobody had asked it to send an email. It just did.

Analysis of the model's 200-page system card reveals documented capabilities for deception and concealment. In some tests, Mythos broke rules and then attempted to hide what it had done creating hidden plans in its reasoning chains, solving problems independently to mask that it already knew the answer, even modifying Git history to erase its traces.

These aren't speculative fears. They're measured behaviors documented in Anthropic's own interpretability analysis, showing features for strategic manipulation and evasion.

Before we anthropomorphize too much: Anthropic notes these behaviors occurred in only 0.001% of cases, far less frequently than humans lie. But the fact that they exist at all in an AI system is striking.

Project Glasswing: Racing to Patch Before Attackers Catch Up

Recognizing both the opportunity and the risk, Anthropic launched Project Glasswing bringing together Amazon Web Services, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorgan Chase, Microsoft, NVIDIA, Palo Alto Networks, and more than 40 additional organizations.

The initiative commits $100 million in usage credits for participants to scan critical systems and open-source infrastructure, plus $4 million in direct donations to open-source security organizations. Once the program ends, access continues at $25 per million input tokens and $125 per million output tokens, creating what some observers note is a strategic dependency on the model.

As Cisco's Chief Security Officer Anthony Grieco put it: "AI capabilities have crossed a threshold that fundamentally changes the urgency required to protect critical infrastructure from cyber threats, and there is no going back."

The Paradox: More Discovery, Harder Remediation

Project Glasswing highlights a paradox: the more efficiently we discover vulnerabilities, the harder it becomes to operationalize their remediation.

According to early analysis, less than 1% of the vulnerabilities discovered by Claude Mythos have been patched.

Read that again. Less than one percent.

Mythos can discover thousands of vulnerabilities at machine speed. But remediation still happens at human speed. Lee Klarich, Chief Product Officer at Palo Alto Networks, captured the urgency: "The window between a vulnerability being discovered and being exploited by an adversary has collapsed what once took months now happens in minutes with AI."

The numbers confirm this gap. Only 7.6% of organizations remediate critical vulnerabilities within 24 hours, while attackers can weaponize them in 24-48 hours.The mean time to remediate? 65 days. While AI discovers flaws in hours.

Why so slow? It's not a technical problem, it's an organizational one. Every patch needs testing to avoid breaking production. Fixes require coordination across security, IT, application owners, and business stakeholders. In large organizations with complex dependencies and heterogeneous environments, even a simple patch can take weeks to deploy safely.

But something deeper is happening. We are not just discovering more vulnerabilities, we are overwhelming our ability to act on them. AI doesn’t just increase visibility, it degrades the signal-to-noise ratio of security operations. And this leads to a structural reality: the more efficient discovery becomes, the more impossible complete remediation becomes. This is not a temporary gap, it is a system-level constraint.

Are there advances that could close this gap? Yes, but they're incremental. Modern patch management is more automated, AI tools can generate contextualized fixes, and orchestration workflows are accelerating remediation tracking.

But these advances speed up the process, they don't eliminate the bottleneck. You still need human judgment for risk assessment, testing to validate patches, and change management to coordinate deployment. In distributed, multi-cloud environments, that complexity remains.

The gap is not disappearing, it is being redefined.

And with AI discovering vulnerabilities exponentially faster, the question isn't whether organizations can keep up it's how they prioritize what matters most before attackers exploit what they've missed.

The gap between finding flaws and fixing them just became the defining challenge of modern cybersecurity.

From Detection to Decision: The Real Shift

We’ve largely solved visibility. What we’ve created instead is overload. We are not facing a visibility problem anymore, we are facing a ✨ prioritization problem✨.

Think about it: If an AI can surface thousands of high-severity vulnerabilities across your infrastructure in hours, the bottleneck isn't "what are our vulnerabilities?" It's "which ones actually threaten us, and what do we fix first?"

This isn't a hypothetical scenario. It's happening right now with Project Glasswing participants. Security teams suddenly have access to unprecedented vulnerability intelligence and they're discovering that more data doesn't automatically mean better security. It means harder decisions.
‍
The challenge shifts from detection to decision. And that shift requires a fundamentally different approach to vulnerability management.

What This Demands: From Vulnerability Management to Exposure Management

The traditional approach scan periodically, rank by CVSS, patch what you can breaks down completely when AI starts flooding your systems with findings at machine speed.

Contextual risk assessment that goes beyond scores. Not every critical vulnerability poses the same risk to every organization. Is the vulnerable system exposed to the internet? Does it handle sensitive data? Is there active exploitation in the wild? Can attackers even reach it from your attack surface? These questions determine actual risk not the number in a database.
‍
Real-time correlation between vulnerabilities and actual exposure. Finding a flaw means nothing without understanding your environment. Which assets are affected? What's their criticality to the business? What compensating controls exist? Are there dependencies that amplify or reduce the risk? This kind of contextual intelligence separates noise from genuine threats.

Orchestrated remediation workflows that scale with AI-speed discovery. When you're dealing with thousands of findings, manual processes collapse. Email chains don't scale. Spreadsheets become obsolete the moment you export them. You need systems that automatically route prioritized issues to the right teams with the right context and track progress without drowning people in alerts.

This is the shift from vulnerability management to exposure management. It's not about counting CVEs. It's about understanding which exposures genuinely threaten your business and orchestrating their remediation before attackers exploit them.

The AI breakthrough at Anthropic didn’t make vulnerability management platforms obsolete. It widened the gap between discovery and remediation by revealing more than organizations can realistically process.

The next battlefield isn’t detection, it’s decision-making at scale.

‍This is where a new category of platforms is emerging, designed to turn vulnerability intelligence into prioritized, actionable risk reduction. Platforms like Hackuity are built around this shift, not to find more vulnerabilities, but to help organizations decide what actually matters.

Because the reality is simple: AI will get better at finding vulnerabilities. The organizations that survive won't be the ones with the longest patch lists. They'll be the ones who can answer one question faster than everyone else: "What actually threatens us, and what do we fix first?"

The Offensive Threat: A Six-Month Window

While defenders race to adapt, the offensive threat looms.

Alex Stamos, former CISO at Facebook and Yahoo, warns: "We only have something like six months before the open-weight models catch up to the foundation models in bug finding. At which point every ransomware actor will be able to find and weaponize bugs."

Given the rapid pace of AI development, models with comparable capabilities will likely proliferate potentially to actors without Anthropic's safety commitments. According to Fortune, Anthropic has privately warned government officials that these capabilities make large-scale cyberattacks significantly more likely this year.

The question isn't whether this particular AI is dangerous. It's whether we trust the people who control it to make the right choices, and what happens when that control inevitably spreads.

The Governance Challenge

Anthropic has engaged with US government agencies, including CISA and the Center for AI Standards and Innovation, to brief them on Mythos capabilities. The company's 200-page system card demonstrates a level of transparency unusual in the industry documenting not just successes but failures, doubts, and ethical questions.

Yet comprehensive AI governance frameworks remain largely absent. A private company now possesses zero-day exploits for most major software systems, exploits powerful enough that the company won't publicly release the model that found them. The incentive to steal Anthropic's model weights just increased dramatically.

Whether Mythos itself becomes available to the broader public remains uncertain prediction markets suggest possibly mid-2027 at earliest.

What Comes Next

Project Glasswing represents an important attempt to ensure defensive capabilities stay ahead of offensive threats. Anthropic plans to report within 90 days on lessons learned, vulnerabilities fixed, and best practices for evolving security processes.

For security teams, the message is clear: We're entering an era where the same technology that can reveal decades of hidden flaws can also create unprecedented attack surfaces. The AI can find the vulnerabilities. The question is whether organizations can close the gap between discovery and remediation. Because AI can surface the problems, but it cannot decide what matters.

The teams that succeed won't be the ones with perfect vulnerability counts. They'll be the ones who transformed their approach from reactive patching to proactive exposure management, before the flood of AI-discovered vulnerabilities made that transformation mandatory.

Because in a world where AI discovers vulnerabilities at light speed but humans remediate at human speed, that gap will define who survives what comes next.

I WANT TO KNOW MORE

Moving beyond vulnerability: Security teams cannot and should not patch everything

Security teams cannot and should not patch everything, particularly when many teams are inundated with false positives

Agentic AI for Vulnerability Management: How SmartEx² Enterprise Speaks to Your Tools

You're not short on vulnerability & exploit data. You're short on time to make sense of it all. Every morning, your team faces the same challenge: dozen of new CVEs, tens marked "critical," some actually exploited in the wild. Which ones..

Experts welcome EU-led alternative to MITRE's vulnerability tracking scheme

The EU-led framework will reduce reliance on US-based MITRE vulnerability reporting database.Cybersecurity experts have welcomed the launch of the new Global CVE Allocation System (GCVE) as a positive move toward more robust vulnerability..