Blog

Claude Mythos — Why the U.S. Treasury and Fed Urgently Summoned Financial Leaders

The COBOL + Copilot + SWIFT three-layer structure is structurally vulnerable to Mythos-class autonomous AI

On April 8, 2026, Fed Chair Jerome Powell and Treasury Secretary Scott Bessent urgently summoned the CEOs of major banks to the Treasury Department headquarters in Washington, D.C. The meeting was arranged in secret and on short notice, timed to coincide with a Financial Services Forum board meeting (reported simultaneously by CNBC, Bloomberg, and other outlets on April 10).

Jane Fraser of Citigroup, Ted Pick of Morgan Stanley, Brian Moynihan of Bank of America, Charlie Scharf of Wells Fargo, David Solomon of Goldman Sachs — the heads of America's most systemically important megabanks were all in attendance. JPMorgan Chase CEO Jamie Dimon was absent, but this was because his firm was already directly involved in threat assessment and defensive measures as an initial partner in Anthropic's defensive initiative, "Project Glasswing."

The Fed Chair and Treasury Secretary convening bank executives together — this was an extraordinary event not seen since the 2008 Lehman crisis.

The agenda was neither interest rates nor inflation. It was the cyber threat that Anthropic's new AI model "Claude Mythos Preview," announced on April 7, posed to the financial system. Financial regulators recognized the risk Mythos represented not as a "technical challenge for IT departments" but as a "systemic risk capable of rendering the entire financial system dysfunctional," and moved to establish a top-down crisis management framework.

What Claude Mythos Proved

Claude Mythos Preview is a model that Anthropic deemed "too dangerous for public release."

It autonomously discovered thousands of zero-day vulnerabilities across every major OS — Windows, macOS, Linux — and every major browser. On the existing cybersecurity benchmark (Cybench), it scored a perfect 100% — the benchmark itself became "saturated" and lost its meaning as a metric. On SWE-bench it scored 93.9% (compared to 80.8% for the previous generation, Claude Opus 4.6), and on USAMO 2026 it hit 97.6% (compared to 42.3%). In a test that involved creating exploits from Firefox crash data, it succeeded 181 times versus two successes for the previous generation — a 90x performance improvement in a single generation.

The specific examples in Anthropic's red team report are staggering. A stack-based buffer overflow (CVE-2026-4747) that had lurked for 17 years in the RPCSEC_GSS authentication protocol implementation of FreeBSD's kernel NFS server. Mythos autonomously read the source code and identified a routine copying packet signature data where up to 304 bytes of data could be written to a 128-byte stack buffer. It didn't stop at finding the bug — it constructed a ROP chain of 20 gadgets and successfully achieved full root access to the system from anywhere on the internet, without authentication. No human expert intervention whatsoever.

In another test, it autonomously constructed a browser attack chaining four vulnerabilities together, completely escaping both the renderer and OS sandboxes. A bug that had lurked in OpenBSD for 27 years, a 16-year-old FFmpeg bug that had survived five million automated fuzz tests — both were uncovered by Mythos's semantic code comprehension.

Anthropic stated: "An AI model has reached the capability to discover and exploit software vulnerabilities that exceeds all but the most skilled humans." And a crucial fact — this cyber capability emerged without any deliberate hacking training. It was a "downstream byproduct" where improvements in fundamental reasoning ability automatically created advanced cyber-attack capabilities.

The economics of cyberattacks have been fundamentally destroyed. Weapons-grade exploits that once required state-sponsored elite hacker teams spending months and millions of dollars to develop can now be generated overnight, for under $50 in compute costs, using a Mythos-class model.

The model has not been publicly released. Under "Project Glasswing," Anthropic established a consortium of 11 initial partners — AWS, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorgan Chase, Linux Foundation, Microsoft, NVIDIA, and Palo Alto Networks — and committed up to $100 million in model usage credits and $4 million in direct donations to open-source security organizations. For defense.

But the temporal asymmetry is devastating. The median organizational remediation window — the time between vulnerability discovery and patch deployment — is approximately 70 days across enterprise IT systems. This figure has not improved in years. Meanwhile, the time required for a Mythos-class model to discover and weaponize new zero-day vulnerabilities has been compressed to hours or days.

The Three-Layer Structure of the Financial System — Why It Is Vulnerable

The reason Powell and Bessent gathered bank CEOs is that the current global financial system rests on a structurally vulnerable three-layer architecture.

Layer One: COBOL (The Bedrock)

The backbone of banking core systems worldwide is COBOL, a programming language designed in the 1960s.

43% of the world's core banking systems still run on COBOL infrastructure. When measured by overall reliance on legacy systems, the figure reaches 70%. 95% of ATM transactions and 80% of in-person credit card payments are processed by COBOL. An estimated 220 to 800 billion lines of COBOL code are running in production environments.

COBOL itself is a "deterministic" language specialized for a single purpose — given the same input, it always returns the same result. In that sense, it is robust. But it has a fatal problem. The average age of COBOL programmers exceeds 58, and roughly 10% retire from the workforce each year. Decades of ad hoc patching have swollen codebases to hundreds of millions of lines, creating massive black-box monoliths that no one — not even current maintenance staff — can fully comprehend.

This opacity means that once the system accepts an invalid input, it is extremely difficult for humans to immediately detect and correct the logical error.

Layer Two: Copilot (The Non-Deterministic Interface)

Overlaying this robust but opaque COBOL bedrock is Microsoft 365 Copilot.

Microsoft 365 has more than 450 million commercial seats worldwide, and Copilot is being adopted at the fastest pace of any product in Microsoft's history. Financial institutions are leading the way: Barclays has completed or is rolling out 100,000 seats, UBS 55,000, and BlackRock 24,000 in enterprise-wide deployments.

There is a structural problem in Copilot's design philosophy.

Copilot accesses email (Outlook), files (SharePoint/OneDrive), chat (Teams), and calendars across the board. Because it operates by inheriting the permissions of the legitimate user, security systems classify Copilot's actions as "normal." Firewalls and access controls are powerless before Copilot.

This means a probabilistic, "non-deterministic" LLM (large language model) has been deeply coupled into a system that is supposed to be "deterministic" — always returning the same result for the same input. It is a complete violation of the single-responsibility principle, and the risk of unpredictable bugs and data leakage expands without restraint.

What made this danger real was the EchoLeak vulnerability (CVE-2025-32711). CVSS score 9.3 (out of 10.0) — an extremely critical zero-click prompt injection attack.

Here is how the attack works. An attacker sends the target a seemingly innocuous email containing malicious prompts hidden in invisible HTML comments or white text. The user does not even need to open the email. Later, when the user sends Copilot an everyday query like "Summarize my recent meetings," Copilot's RAG (retrieval-augmented generation) engine loads the email as context and executes the hidden malicious prompt. As a result, Copilot uses the user's legitimate permissions to collect confidential internal data and transmit it to the attacker's external server — all without the user noticing. The logs record it as normal activity.

16% of business-critical enterprise data is overshared, with an average of 802,000 files per organization at risk (Concentric AI 2026 Data Risk Report). The U.S. House of Representatives banned congressional staff from using Copilot due to data safety concerns.

Copilot's convenience features can function as a backdoor for sophisticated AI-driven system compromise. This was not Microsoft's intention. But the result is effectively the same as if Microsoft had built a backdoor into every system worldwide — as a legitimate feature.

Layer Three: SWIFT (The Bloodstream)

On top of this fragile two-layer structure, approximately 50 million international wire transfers are processed each day.

SWIFT (Society for Worldwide Interbank Financial Telecommunication) is a network designed in the 1970s that connects more than 11,000 financial institutions across over 200 countries and processes tens of trillions of dollars in fund transfers daily. It is the bloodstream of global finance.

SWIFT's core network itself is a highly encrypted closed network and is robust. But the true vulnerability lies in the "internal systems of individual banks that connect to SWIFT (the endpoints)." The 2016 theft of $81 million from the Bangladesh Central Bank occurred when human hackers used malware to compromise endpoint credentials and forged legitimate transfer messages.

The modern endpoint environment — where Layer One (an incomprehensible giant COBOL monolith) and Layer Two (Copilot with broad access permissions and susceptibility to external manipulation) overlap — is far more complex and vulnerable than it was at the time of the Bangladesh incident.

What Happens When the Three Layers Converge

Let us lay out what this three-layer structure means.

A "massive legacy codebase no one fully understands (COBOL)" has been tightly coupled with "a probabilistic AI with access privileges reaching deep into the OS (Copilot)," and through that system, "50 million international wire transfers per day (SWIFT)" are being executed.

This is the polar opposite of a loosely coupled design with clear visibility — single-function, single-server simplicity. It is the worst possible anti-pattern.

Consider the scenario when a Mythos-class autonomous AI is deployed as an attacker. The AI hijacks Copilot through a technique like EchoLeak, then instantly searches and extracts internal network diagrams and authentication credentials from SharePoint. It leverages discovered zero-day vulnerabilities to move laterally through the internal network, reaching the SWIFT message generation process.

Once fraudulent transactions are injected into the black-box COBOL core system, the system deterministically processes them as "legitimate instructions" and executes transfers via the SWIFT network to other banks. In today's financial system — where banks are intricately entangled through interbank markets, derivatives trading, and repo transactions — illicit fund outflows from a single node can instantly impair other banks' balance sheets and trigger a liquidity crisis.

While the 2008 Lehman crisis was amplified by delays in human decision-making, an AI-driven "digital bank run" would unfold at machine speed. Before humans can detect the anomaly and shut down systems, the entire bloodstream of global finance could be contaminated.

Monoculture — The Vulnerability of Single-Crop Dependency

At the root of this problem lies global dependence on Microsoft — a monoculture.

Windows, Azure, Microsoft 365, Active Directory, Copilot. Over the past 24 months, more than 1,292 CVEs have been reported in Microsoft products. SolarWinds, the Exchange Server breach, BlueBleed — all consequences of this monoculture.

In February 2026, UK security research institute Eternity Lab pointed out: "This is digital colonialism disguised as convenience. When a single platform is the default for governments and corporations worldwide, vulnerabilities don't fail in isolation — they multiply and spread."

OWASP (Open Worldwide Application Security Project) designated "Cascading Failures (ASI08): A single failure propagates as a system-wide impact across agents, tools, and workflows" as one of the top risks in its 2026 "Top 10 for Agentic Applications." It also warned of "ASI09: Human-Agent Trust Exploitation" — the risk that fluent, confident explanations generated by AI agents deceive human operators into approving dangerous actions.

This follows the same logic as natural farming. A monoculture field can be wiped out by a single disease. An ecosystem without diversity is fragile. Only the coexistence of diverse plants connected through mycorrhizal networks provides resistance to disease. Software architecture operates on the same principle. For an attacker armed with Mythos-class AI, a monoculture environment is the most cost-effective target. An attack technique that succeeds on one system can be reused across the entire world.

Patching Cannot Keep Up

Median time to patch: 70 days. Mythos-class vulnerability discovery and weaponization: hours to days. This asymmetry is structural and cannot be closed through effort alone.

This is why Powell and Bessent gathered the bank CEOs. The problem is not that "vulnerabilities exist." The situation — "riddled with problems, but until now no one had the capability to attack them" — ended the moment Mythos appeared.

Anthropic's decision to withhold public release and distribute first to the defensive side, along with $100 million in usage credits, was the right call. But it is only a matter of time before other state agencies or the open-source community develop models of equivalent capability that fall into the hands of malicious hackers. The stopgap measure of accelerating patch deployment cannot overcome a structural crisis.

The Only Option Is to Change the Structure

The prescription for this problem is not "stronger patches" or "tighter monitoring." The only option is to change the structure itself.

From tight coupling to loose coupling. Break large systems into small, self-contained units. Design so that a failure in one unit does not cascade to others. Each unit must function independently. The current architecture — directly connecting legacy monoliths with non-deterministic AI that has sweeping permissions — creates countless single points of failure.

From monoculture to diversity. Break free from Microsoft dependency. By appropriately combining Linux, Python, and other open-source technologies, prevent systems from becoming black boxes and restore ecological diversity and fault tolerance.

From "Maximum AI" to "Sufficient AI." The philosophy of deeply embedding AI into every business process and dataset for productivity gains expands the attack surface without limit in exchange for convenience, ultimately leading to system self-destruction. AI deployment should be limited to non-deterministic tasks such as data summarization and information retrieval. For core system transaction writes and critical fund transfer approvals, incorporate final human sign-off in physically or logically isolated environments.

Use AI for development. Do not embed it in the product. Do not make it autonomous. Humans decide the structure; AI serves as the tool.

This is the principle for surviving the collapse of monoculture.

If banks stop, pensions go undelivered, hospitals cannot operate, and by the next day you cannot even buy water at a convenience store. When payments stop, society stops.

In April 2026, America's top financial regulators urgently summoned bank CEOs. This is a historic turning point where the evolution of AI technology has transcended the domain of "IT security challenges" and transformed into a systemic risk that shakes the very foundation of the global economy. The existing security paradigm — based on patch-driven defense and monoculture — has reached its end.

A digital bank run and financial collapse unfolding on a timescale of seconds is no longer a theoretical scenario. It is an imminent reality.


References

  • CNBC, "Powell, Bessent discussed Anthropic's Mythos AI cyber threat with major U.S. banks" (2026/4/10)
  • Anthropic Red Team, "Claude Mythos Preview" (2026/4/7), red.anthropic.com
  • Anthropic, "Project Glasswing" (2026/4/7)
  • OWASP, "Top 10 for Agentic Applications" (2026) — ASI08: Cascading Failures, ASI09: Human-Agent Trust Exploitation
  • Trend Micro, "EchoLeak: Preventing Zero-Click AI Threats" (2025/7) — CVE-2025-32711
  • Concentric AI, "Data Risk Report" (2026) — 16% oversharing, 802,000 files per organization
  • Virtru, "The Microsoft Monoculture: A Single Point of Failure" (2025/2) — 1,292 CVEs
  • Eternity Lab, "The world's default productivity tool is becoming a national security risk" (2026/2/23)
  • Phase Change / Rocket Software, COBOL infrastructure statistics
  • CrowdStrike, median patch window: 70 days

Reference

← Prev: New Era — The Forced Transformation of Food and Systems Has Begun Next: Claude Mythos — The Vulnerability of WordPress + AI →

See the Structure

From AI to agriculture — every structural analysis converges on one conclusion.

AISeed — 生物多様性・食料・AIと暮らし(Facebook)