Mythos Preview: The Most Dangerous AI, Hacked in 24 Hours

By Yves Laurent | 2026-04-23

On April 7th, Anthropic announced Mythos Preview with the institutional gravity typically reserved for nuclear treaties. Forty handpicked organizations. Apple, Google, Amazon, Goldman Sachs, JP Morgan. Treasury Secretary Scott Bessent convened a meeting of senior bankers in Washington specifically to discuss the model. The justification was straightforward: Mythos is too dangerous for the general public. Project Glasswing would serve as the containment mechanism. The same day as the announcement, a group on a private Discord server was already inside.

They didn't hack anything. They guessed the address and used a key belonging to someone who had access.

The technical sequence is what makes it hard to look away. The group combined legitimate credentials from a contractor at an Anthropic third-party vendor with URL naming conventions reconstructed from data exposed in the Mercor leak three weeks earlier. Basic pattern recognition. The kind of thing anyone with curiosity and spare time can pull off. The phrase that circulated afterward in security circles isn't hyperbole: it's a technical description. The AI that finds vulnerabilities in every operating system on the planet was defeated by address bar autocomplete.

What the group did with their access is, in one sense, almost reassuring. They built simple websites. Inconsequential tasks. The stated intention is curiosity, not destruction. I recognize this dynamic from other contexts, and stated intention is exactly the kind of guarantee that guarantees nothing when the tool can produce working exploits autonomously. The same sources indicate the group also has access to unreleased Anthropic models. The question isn't what they did. It's what they could do, and who else, with fewer scruples, has already found the same door.

The real failure isn't where Anthropic says it is. The group didn't evade the company's internal security architecture. They exploited the gap between Anthropic's controls over its own models and the controls of a third-party vendor holding valid access credentials. It's a distinction that matters technically, but from the outside it produces exactly the same result. Researchers have been saying this in different words for years: any high-capability structure exposed to a semi-distributed environment (partners, contractors, trust ecosystems) expands its attack surface beyond what it can realistically control. It's not a new diagnosis. It's the same constant that brought down SolarWinds, compromised Uber, and exposed data from millions of people in breaches nobody remembers today. The trust chain is only as strong as its weakest link, and the weakest link is always the one with the least to lose.

The political context turns this into more than a cybersecurity footnote. The breach came one day after Trump said on CNBC that a Pentagon deal was possible and that Anthropic is taking shape. At the same time, Anthropic is suing the Department of Defense for classifying it as a supply chain risk, a legal dispute that revolves precisely around whether the company can control access to what it builds. Unauthorized access, even through a third-party vendor, provides perfect ammunition for those already arguing that Anthropic can't govern its own tools. And then there's the paradox that completes the picture: the Pentagon designated Anthropic as a supply chain risk after the company refused to remove safety guardrails for military use, but the NSA was already using Mythos for vulnerability scanning despite that blacklist. The same government that bans it uses it. The White House pushes to expand access to federal civilian agencies. The right hand doesn't know, or doesn't want to know, what the left is doing.

The unanswered question about Anthropic's business model now finds a partial answer. A company with multiple leaks in short order positioned itself as the custodian of global critical infrastructure. Nobody audits it effectively enough for even its most restricted model to survive intact on launch day.

Contractors end up being the weak point. Always. Any high-capability organization exposed to semi-distributed environments expands its attack surface beyond what's controllable, and that's not a recent finding. Medieval guilds promised quality standards that their own members systematically violated. Central banks promised oversight that didn't prevent the 2008 crisis. Every control structure promises more protection than it can deliver, and the gap between promise and reality always gets discovered at the worst possible moment. The difference from previous episodes isn't that the governance model failed. Governance models always fail eventually. The difference is the scale of what the failed structure was safeguarding. A medieval guild that doesn't meet its standards produces bad cloth. A failing central bank produces a crisis that wipes out millions of people's savings. AI governance that doesn't survive its first day exposes tools capable of finding vulnerabilities in any operating system on the planet.

I'm still not clear, and I don't think anyone is, on what real governance of these tools would look like. Not the institutional theater of meetings with bankers in Washington and lists of forty selected organizations. Real governance, with teeth, with independent auditing, with consequences when it fails. Because the alternative isn't that the world's most dangerous model remains inaccessible forever. The alternative is that the next group that finds it won't be building test websites.

What would effective governance for these capabilities actually look like?

Sources

1. Medium — Report on unauthorized access to Mythos Preview and endpoint URL reconstruction.

2. Nomad Lawyer — Analysis of the Discord group's use of access and reach to unreleased models.

3. Al Jazeera — Coverage of third-party vendor governance failure as access vector.

4. IEEE Spectrum — Security analyst statement on attack surface expansion in trust ecosystems.

5. Aerospace America — Context on Project Glasswing, participating organizations and the meeting convened by Scott Bessent.
